An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism
Eck, Douglas, Bengio, Yoshua, Courville, Aaron C.
The Indian Buffet Process is a Bayesian nonparametric approach that models objects as arising from an infinite number of latent factors. Here we extend the latent factor model framework to two or more unbounded layers of latent factors. From a generative perspective, each layer defines a conditional factorial prior distribution over the binary latent variables of the layer below via a noisy-or mechanism. We explore the properties of the model with two empirical studies, one digit recognition task and one music tag data experiment.
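As a rough illustration of the noisy-or mechanism mentioned in this abstract, the sketch below computes the probability that a binary unit in the lower layer turns on given the binary units in the layer above, and samples the lower layer factorially (each unit independently given its parents). It assumes a finite truncation of the layers; the names `noisy_or_prob`, `sample_layer`, and the `leak` parameter are hypothetical and are not taken from the paper.

```python
import random

def noisy_or_prob(parents, weights, leak=0.0):
    """P(child = 1 | parents) under a noisy-or gate.

    parents: list of 0/1 activations in the layer above
    weights: weights[k] is the probability that active parent k alone turns the child on
    leak:    probability the child turns on with no active parent (assumed parameter)
    """
    p_off = 1.0 - leak
    for z, w in zip(parents, weights):
        if z:
            # Each active parent independently fails to activate the child w.p. (1 - w).
            p_off *= 1.0 - w
    return 1.0 - p_off

def sample_layer(parents, weight_rows, leak=0.0, rng=random):
    """Sample every child independently given the parents (a factorial conditional)."""
    return [int(rng.random() < noisy_or_prob(parents, row, leak)) for row in weight_rows]

if __name__ == "__main__":
    # Two active parents, each with weight 0.5: the child stays off w.p. 0.5 * 0.5.
    print(noisy_or_prob([1, 0, 1], [0.5, 0.9, 0.5]))
```

With two active parents of weight 0.5 each, the child stays off only if both fail, so the activation probability is 1 - 0.25. The unbounded number of factors in the actual model is not represented here.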
Polynomial Semantic Indexing
Bai, Bing, Weston, Jason, Grangier, David, Collobert, Ronan, Sadamasa, Kunihiko, Qi, Yanjun, Cortes, Corinna, Mohri, Mehryar
We present a class of nonlinear (polynomial) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Dealing with polynomial models on word features is computationally challenging. We propose a low rank (but diagonal preserving) representation of our polynomial models to induce feasible memory and computation requirements. We provide an empirical study on retrieval tasks based on Wikipedia documents, where we obtain state-of-the-art performance while providing realistically scalable methods.
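To make the "low rank (but diagonal preserving)" idea concrete, here is a minimal sketch for the degree-2 case only. It assumes the score takes the form q^T (U^T V + I) d, where U and V are k x D matrices with k much smaller than the vocabulary size D; this specific form and the names `low_rank_score`, `U`, and `V` are assumptions made for illustration, not details given in the abstract.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def low_rank_score(q, d, U, V):
    """Degree-2 score q^T (U^T V + I) d, computed without forming the D x D matrix.

    q, d: word-feature vectors of length D (query and document)
    U, V: k x D projection matrices (hypothetical learned parameters)
    The identity term keeps the diagonal, i.e. exact word matches still count.
    """
    return dot(matvec(U, q), matvec(V, d)) + dot(q, d)

if __name__ == "__main__":
    q = [1, 0, 2]
    d = [0, 1, 1]
    U = [[1, 0, 0], [0, 1, 1]]
    V = [[0, 1, 0], [1, 0, 1]]
    print(low_rank_score(q, d, U, V))
```

Storing U and V costs 2kD numbers instead of D^2, which is the memory saving the low-rank representation is meant to provide. Higher-degree polynomial terms are not covered by this sketch.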
Integrating Locally Learned Causal Structures with Overlapping Variables
Danks, David, Glymour, Clark, Tillman, Robert E.
In many domains, data are distributed among datasets that share only some variables; other recorded variables may occur in only one dataset. While there are asymptotically correct, informative algorithms for discovering causal relationships from a single dataset, even with missing values and hidden variables, there have been no such reliable procedures for distributed data with overlapping variables. We present a novel, asymptotically correct procedure that discovers a minimal equivalence class of causal DAG structures using local independence information from distributed data of this form. We evaluate its performance on synthetic and real-world data against causal discovery algorithms for single datasets and against Structural EM, a heuristic DAG structure learning procedure for data with missing values, applied to the concatenated data.
On Bootstrapping the ROC Curve
Bertail, Patrice, Clémençon, Stéphan J., Vayatis, Nicolas
This paper is devoted to thoroughly investigating how to bootstrap the ROC curve, a widely used visual tool for evaluating the accuracy of test/scoring statistics in the bipartite setup. The issue of confidence bands for the ROC curve is considered and a resampling procedure based on a smooth version of the empirical distribution, called the "smoothed bootstrap", is introduced. Theoretical arguments and simulation results are presented to show that the "smoothed bootstrap" is preferable to a "naive" bootstrap in order to construct accurate confidence bands.
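The resampling idea in this abstract can be sketched as follows: instead of resampling the observed scores directly (the naive bootstrap), each bootstrap draw is a resampled score plus kernel noise, which amounts to sampling from a smoothed empirical distribution. The sketch assumes a Gaussian kernel with a user-supplied bandwidth `h` and builds simple pointwise percentile bands; the bandwidth choice and the exact band construction used in the paper are not stated in the abstract, so both are assumptions here.

```python
import random

def roc_at(neg, pos, alphas):
    """Empirical ROC: for each false-positive rate alpha, the fraction of positive
    scores above (approximately) the (1 - alpha) quantile of the negative scores."""
    neg_sorted = sorted(neg)
    n = len(neg_sorted)
    out = []
    for a in alphas:
        idx = min(n - 1, max(0, int((1.0 - a) * n)))
        thr = neg_sorted[idx]
        out.append(sum(s > thr for s in pos) / len(pos))
    return out

def smoothed_resample(sample, h, rng):
    """Draw len(sample) points from a Gaussian-kernel smoothing of the empirical law."""
    return [rng.choice(sample) + rng.gauss(0.0, h) for _ in sample]

def bootstrap_band(neg, pos, alphas, h, n_boot=200, level=0.9, seed=0):
    """Pointwise percentile band for the ROC curve from smoothed bootstrap replicates."""
    rng = random.Random(seed)
    curves = [
        roc_at(smoothed_resample(neg, h, rng), smoothed_resample(pos, h, rng), alphas)
        for _ in range(n_boot)
    ]
    lo_i = int((1.0 - level) / 2.0 * (n_boot - 1))
    hi_i = int((1.0 + level) / 2.0 * (n_boot - 1))
    band = []
    for j in range(len(alphas)):
        col = sorted(c[j] for c in curves)
        band.append((col[lo_i], col[hi_i]))
    return band

if __name__ == "__main__":
    neg = [i / 10 for i in range(20)]        # toy scores for the negative class
    pos = [1 + i / 10 for i in range(20)]    # toy scores for the positive class
    print(bootstrap_band(neg, pos, [0.1, 0.5, 0.9], h=0.1, n_boot=50))
```

Setting `h` to zero recovers a naive bootstrap of the same form, which makes the two procedures easy to compare on simulated data.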
Sequential effects reflect parallel learning of multiple environmental regularities
Wilder, Matthew, Jones, Matt, Mozer, Michael C.
Across a wide range of cognitive tasks, recent experience influences behavior. For example, when individuals repeatedly perform a simple two-alternative forced-choice task (2AFC), response latencies vary dramatically based on the immediately preceding trial sequence. These sequential effects have been interpreted as adaptation to the statistical structure of an uncertain, changing environment (e.g. Jones & Sieck, 2003; Mozer, Kinoshita, & Shettel, 2007; Yu & Cohen, 2008). The Dynamic Belief Model (DBM) (Yu & Cohen, 2008) explains sequential effects in 2AFC tasks as a rational consequence of a dynamic internal representation that tracks second-order statistics of the trial sequence (repetition rates) and predicts whether the upcoming trial will be a repetition or an alternation of the previous trial. Experimental results suggest that first-order statistics (base rates) also influence sequential effects. We propose a model that learns both first- and second-order sequence properties, each according to the basic principles of the DBM but under a unified inferential framework. This model, the Dynamic Belief Mixture Model (DBM2), obtains precise, parsimonious fits to data. Furthermore, the model predicts dissociations in behavioral (Maloney, Dal Martello, Sahm, & Spillmann, 2005) and electrophysiological studies (Jentzsch & Sommer, 2002), supporting the psychological and neurobiological reality of its two components.
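The kind of tracking described here can be sketched with a grid-based Bayesian filter over a hidden Bernoulli rate that occasionally resets. The sketch assumes that, on each trial, the rate persists with probability `alpha` and is otherwise redrawn from a uniform prior; this update rule, the value of `alpha`, and the function name `dbm_predictions` are assumptions for illustration and are not specified in the abstract. Only a single component is shown; how DBM2 combines its two components is not reproduced.

```python
def dbm_predictions(xs, alpha=0.8, grid_size=101):
    """Predictive probability P(x_t = 1 | x_1..x_{t-1}) for every trial t.

    xs:    sequence of 0/1 outcomes (e.g. 1 = repetition, 0 = alternation)
    alpha: assumed probability that the hidden rate is unchanged between trials
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior0 = [1.0 / grid_size] * grid_size   # uniform generic prior (assumption)
    belief = prior0[:]                       # posterior after the previous trial
    preds = []
    for x in xs:
        # The rate persists w.p. alpha; otherwise it is redrawn from prior0.
        pred_dist = [alpha * b + (1.0 - alpha) * p for b, p in zip(belief, prior0)]
        preds.append(sum(g * p for g, p in zip(grid, pred_dist)))
        # Bayes update with the Bernoulli likelihood of the observed outcome.
        post = [p * (g if x else 1.0 - g) for g, p in zip(grid, pred_dist)]
        z = sum(post)
        belief = [p / z for p in post]
    return preds

if __name__ == "__main__":
    # A run of repetitions raises the predicted repetition probability;
    # a single alternation pulls it back down.
    for t, p in enumerate(dbm_predictions([1, 1, 1, 1, 0, 1])):
        print(t, round(p, 3))
```

Applying the same function to the raw stimulus sequence would track a base rate (first-order), while applying it to the repetition/alternation encoding tracks a repetition rate (second-order).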
Efficient Direct Density Ratio Estimation for Non-stationarity Adaptation and Outlier Detection
Kanamori, Takafumi, Hido, Shohei, Sugiyama, Masashi
We address the problem of estimating the ratio of two probability density functions (a.k.a. the importance). The importance values can be used for various succeeding tasks such as non-stationarity adaptation or outlier detection. In this paper, we propose a new importance estimation method that has a closed-form solution; the leave-one-out cross-validation score can also be computed analytically. Therefore, the proposed method is computationally very efficient and numerically stable. We also elucidate theoretical properties of the proposed method such as the convergence rate and approximation error bound. Numerical experiments show that the proposed method is comparable to the best existing method in accuracy, while it is computationally more efficient than competing approaches.
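One way a density-ratio estimator can have a closed-form solution is to model the ratio as a kernel expansion and fit it by regularized least squares, so that the coefficients come from a single linear solve. The sketch below illustrates that idea for one-dimensional data. It is an assumption about the general approach, not a reproduction of the paper's estimator; the names `fit_density_ratio`, `x_nu` (numerator samples), `x_de` (denominator samples), `sigma`, and `lam` are hypothetical, and the analytic leave-one-out score is not shown.

```python
import math

def gauss_kernel(x, c, sigma):
    """Gaussian kernel between a point x and a center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_density_ratio(x_nu, x_de, centers, sigma=1.0, lam=0.1):
    """Least-squares fit of w(x) ~ p_nu(x) / p_de(x) as a Gaussian kernel expansion."""
    b = len(centers)
    phi_de = [[gauss_kernel(x, c, sigma) for c in centers] for x in x_de]
    phi_nu = [[gauss_kernel(x, c, sigma) for c in centers] for x in x_nu]
    # H: second moment of the basis under the denominator samples, plus a ridge term.
    H = [[sum(p[i] * p[j] for p in phi_de) / len(x_de) + (lam if i == j else 0.0)
          for j in range(b)] for i in range(b)]
    # h: mean of the basis under the numerator samples.
    h = [sum(p[i] for p in phi_nu) / len(x_nu) for i in range(b)]
    alpha = [max(0.0, a) for a in solve(H, h)]  # clip so the estimated ratio stays >= 0
    return lambda x: sum(a * gauss_kernel(x, c, sigma) for a, c in zip(alpha, centers))

if __name__ == "__main__":
    x_nu = [-0.5 + 0.1 * i for i in range(11)]   # numerator samples, concentrated near 0
    x_de = [-3.0 + 0.3 * i for i in range(21)]   # denominator samples, spread widely
    w = fit_density_ratio(x_nu, x_de, centers=x_nu[::2])
    print(w(0.0), w(3.0))
```

Points with a small estimated ratio are unlikely under the numerator distribution relative to the denominator, which is how such importance values can flag outliers or reweight samples under distribution shift.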
Learning Brain Connectivity of Alzheimer's Disease from Neuroimaging Data
Huang, Shuai, Li, Jing, Sun, Liang, Liu, Jun, Wu, Teresa, Chen, Kewei, Fleisher, Adam, Reiman, Eric, Ye, Jieping
Recent advances in neuroimaging techniques provide great potential for effective diagnosis of Alzheimer's disease (AD), the most common form of dementia. Previous studies have shown that AD is closely related to alterations in the functional brain network, i.e., the functional connectivity among different brain regions. In this paper, we consider the problem of learning functional brain connectivity from neuroimaging, which holds great promise for identifying image-based markers used to distinguish Normal Controls (NC), patients with Mild Cognitive Impairment (MCI), and patients with AD. More specifically, we study sparse inverse covariance estimation (SICE), also known as exploratory Gaussian graphical models, for brain connectivity modeling. In particular, we apply SICE to learn and analyze functional brain connectivity patterns from different subject groups, based on a key property of SICE, called the "monotone property", which we establish in this paper. Our experimental results on neuroimaging PET data of 42 AD, 116 MCI, and 67 NC subjects reveal several interesting connectivity patterns consistent with literature findings, and also some new patterns that can help the knowledge discovery of AD.
Bayesian Belief Polarization
Jern, Alan, Chang, Kai-min, Kemp, Charles
Empirical studies have documented cases of belief polarization, where two people with opposing prior beliefs both strengthen their beliefs after observing the same evidence. Belief polarization is frequently offered as evidence of human irrationality, but we demonstrate that this phenomenon is consistent with a fully Bayesian approach to belief revision. Simulation results indicate that belief polarization is not only possible but relatively common within the set of Bayesian models that we consider. Suppose that Carol has requested a promotion at her company and has received a score of 50 on an aptitude test. Alice, one of the company's managers, began with a high opinion of Carol and became even more confident of her abilities after seeing her test score.
Accelerating Bayesian Inference over Nonlinear Differential Equations with Gaussian Processes
Calderhead, Ben, Girolami, Mark, Lawrence, Neil D.
Identification and comparison of nonlinear dynamical systems using noisy and sparse experimental data is a vital task in many fields; however, current methods are computationally expensive and prone to error due in part to the nonlinear nature of the likelihood surfaces induced. We present an accelerated sampling procedure which enables Bayesian inference of parameters in nonlinear ordinary and delay differential equations via the novel use of Gaussian processes (GP). Our method involves GP regression over time-series data, and the resulting derivative and time delay estimates make parameter inference possible without solving the dynamical system explicitly, resulting in dramatic savings of computational time. We demonstrate the speed and statistical accuracy of our approach using examples of both ordinary and delay differential equations, and provide a comprehensive comparison with current state-of-the-art methods.
A Biologically Plausible Model for Rapid Natural Scene Identification
Ghebreab, Sennay, Scholte, Steven, Lamme, Victor, Smeulders, Arnold
Contrast statistics of the majority of natural images conform to a Weibull distribution. This property of natural images may facilitate efficient and very rapid extraction of a scene's visual gist. Here we investigate whether a neural response model based on the Weibull contrast distribution captures visual information that humans use to rapidly identify natural scenes. In a learning phase, we measure EEG activity of 32 subjects viewing brief flashes of 800 natural scenes. From these neural measurements and the contrast statistics of the natural image stimuli, we derive an across-subject Weibull response model. We use this model to predict the responses to a large set of new scenes and estimate which scene the subject viewed by finding the best match between the model predictions and the observed EEG responses. In almost 90 percent of the cases our model accurately predicts the observed scene. Moreover, in most failed cases, the scene mistaken for the observed scene is visually similar to the observed scene itself. These results suggest that Weibull contrast statistics of natural images contain a considerable amount of scene gist information to warrant rapid identification of natural images.
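The identification step described in this abstract (pick the scene whose predicted response best matches the observed response) can be sketched as a nearest-match search. The sketch assumes responses are fixed-length numeric vectors compared by squared Euclidean distance; the distance measure and the names `identify_scene` and `weibull_pdf` are hypothetical. The density shown is the standard two-parameter Weibull form, which may differ from the parametrization used in the paper, and the mapping from Weibull parameters to EEG responses is not reproduced.

```python
import math

def weibull_pdf(x, shape, scale):
    """Standard two-parameter Weibull density, defined here for x > 0; returns 0.0 otherwise."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def identify_scene(observed, predicted):
    """Return the index of the predicted response closest to the observed one.

    observed:  list of floats (one measured response vector)
    predicted: list of response vectors, one per candidate scene
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(predicted)), key=lambda i: sq_dist(observed, predicted[i]))

if __name__ == "__main__":
    candidates = [[0.0, 0.0], [1.0, 2.1], [5.0, 5.0]]  # toy predicted responses
    print(identify_scene([1.0, 2.0], candidates))
```

Identification accuracy in this setup is the fraction of trials where the returned index equals the index of the scene actually shown.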