Seminars and Colloquia by Series

When Do Neural Networks Outperform Kernel Methods?

Series
Stochastics Seminar
Time
Thursday, November 12, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/445382510
Speaker
Song MeiUC Berkeley

For a certain scaling of the initialization of stochastic gradient descent (SGD), wide neural networks (NN) have been shown to be well approximated by reproducing kernel Hilbert space (RKHS)  methods. Recent empirical work showed that, for some classification tasks, RKHS methods can replace NNs without a large loss in performance. On the other hand, two-layers NNs are known to encode richer smoothness classes than RKHS and we know of special examples for which SGD-trained NN provably outperform RKHS. This is true also in the wide network limit, for a different scaling of the initialization.

How can we reconcile the above claims? For which tasks do NNs outperform RKHS? If feature vectors are nearly isotropic, RKHS methods suffer from the curse of dimensionality, while NNs can overcome it by learning the best low-dimensional representation. Here we show that this curse of dimensionality becomes milder if the feature vectors display the same low-dimensional structure as the target function, and we precisely characterize this tradeoff. Building on these results, we present a model that can capture in a unified framework both behaviors observed in earlier work. We hypothesize that such a latent low-dimensional structure is present in image classification. We test numerically this hypothesis by showing that specific perturbations of the training distribution degrade the performances of RKHS methods much more significantly than NNs. 

Bias-Variance Tradeoffs in Joint Spectral Embeddings

Series
Stochastics Seminar
Time
Thursday, November 5, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/974631214
Speaker
Daniel SussmanBoston University

We consider the ramifications of utilizing biased latent position estimates in subsequent statistical analysis in exchange for sizable variance reductions in finite networks. We establish an explicit bias-variance tradeoff for latent position estimates produced by the omnibus embedding in the presence of heterogeneous network data. We reveal an analytic bias expression, derive a uniform concentration bound on the residual term, and prove a central limit theorem characterizing the distributional properties of these estimates.

Link to the BlueJeans meeting https://bluejeans.com/974631214

Higher-order fluctuations in dense random graph models (note the unusual time: 5pm)

Series
Stochastics Seminar
Time
Thursday, October 22, 2020 - 17:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/751242993/PASSWORD (To receive the password, please email Lutz Warnke)
Speaker
Adrian RoellinNational University of Singapore

Dense graph limit theory is essentially a first-order limit theory analogous to the classical Law of Large Numbers. Is there a corresponding central limit theorem? We believe so. Using the language of Gaussian Hilbert Spaces and the comprehensive theory of generalised U-statistics developed by Svante Janson in the 90s, we identify a collection of Gaussian measures (aka white noise processes) that describes the fluctuations of all orders of magnitude for a broad family of random graphs. We complement the theory with error bounds using a new variant of Stein’s method for multivariate normal approximation, which allows us to also generalise Janson’s theory in some important aspects. This is joint work with Gursharn Kaur.

Please note the unusual time: 5pm

Coalescence estimates for the corner growth model with exponential weights

Series
Stochastics Seminar
Time
Thursday, October 15, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
Bluejeans (link to be sent)
Speaker
Xiao ShenUniversity of Wisconsin

(Joint work with Timo Seppäläinen) We establish estimates for the coalescence time of semi-infinite directed geodesics in the planar corner growth model with i.i.d. exponential weights. There are four estimates: upper and lower bounds on the probabilities of both fast and slow coalescence on the correct spatial scale with exponent 3/2. Our proofs utilize a geodesic duality introduced by Pimentel and properties of the increment-stationary last-passage percolation process. For fast coalescence our bounds are new and they have matching optimal exponential order of magnitude. For slow coalescence, we reproduce bounds proved earlier with integrable probability inputs, except that our upper bound misses the optimal order by a logarithmic factor.

Approximate Kernel Principal Component Analysis: Computational vs. Statistical Trade-off

Series
Stochastics Seminar
Time
Thursday, October 8, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://gatech.webex.com/gatech/j.php?MTID=mdd4512d3d11623149a0bd46d9fc086c8
Speaker
Bharath SriperumbudurPennsylvania State University

Kernel principal component analysis (KPCA) is a popular non-linear dimensionality reduction technique, which generalizes classical linear PCA by finding functions in a reproducing kernel Hilbert space (RKHS) such that the function evaluation at a random variable $X$ has a maximum variance. Despite its popularity, kernel PCA suffers from poor scalability in big data scenarios as it involves solving a $n \times  n$ eigensystem leading to the computational complexity of $O(n^3)$ with $n$ being the number of samples. To address this issue, in this work, we consider a random feature approximation to kernel PCA which requires solving an $m \times m$ eigenvalue problem and therefore has a computational complexity of $O(m^3+nm^2)$, implying that the approximate method is computationally efficient if $m$ < $n$ with $m$ being the number of random features. The goal of this work is to investigate the trade-off between computational and statistical behaviors of approximate KPCA, i.e., whether the computational gain is achieved at the cost of statistical efficiency. We show that the approximate KPCA is both computationally and statistically efficient compared to KPCA in terms of the error associated with reconstructing a kernel function based on its projection onto the corresponding eigenspaces.

Link to Cisco Webex meeting: https://gatech.webex.com/gatech/j.php?MTID=mdd4512d3d11623149a0bd46d9fc086c8

A precise high-dimensional theory for Boosting

Series
Stochastics Seminar
Time
Thursday, October 1, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/276389634
Speaker
Pragya SurHarvard University

This talk will introduce a precise high-dimensional asymptotic theory for Boosting (AdaBoost) on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the generalization error of AdaBoost, when the algorithm interpolates the training data and maximizes an empirical L1 margin. On the computational front, we will provide a sharp analysis of the stopping time when boosting approximately maximizes the empirical L1 margin. Our theory provides several insights into properties of Boosting; for instance, the larger the dimensionality ratio p/n, the faster the optimization reaches interpolation. At the heart of our theory lies an in-depth study of the maximum L1-margin, which can be accurately described by a new system of non-linear equations; we analyze this margin and the properties of this system, using Gaussian comparison techniques and a novel uniform deviation argument. Time permitting, I will present analogous results for a new class of boosting algorithms that correspond to Lq geometry, for q>1. This is based on joint work with Tengyuan Liang.

Statistical Inference in Popularity Adjusted Stochastic Block Model

Series
Stochastics Seminar
Time
Thursday, September 24, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://ucf.zoom.us/j/92646603521?pwd=TnRGSVo1WXo2bjE4Y3JEVGRPSmNWQT09
Speaker
Marianna PenskyUniversity of Central Florida

The talk considers the Popularity Adjusted Block model (PABM) introduced by Sengupta and Chen (2018). We argue that the main appeal of the PABM is the flexibility of the spectral properties of the graph which makes the PABM an attractive choice for modeling networks that appear in, for example, biological sciences. In addition, to the best of our knowledge, the PABM is the only stochastic block model that allows to treat the network sparsity as the structural sparsity that describes community patterns, rather than being an attribute of the network as a whole.

Link to Zoom meeting: https://ucf.zoom.us/j/92646603521?pwd=TnRGSVo1WXo2bjE4Y3JEVGRPSmNWQT09

Couplings of Markov chain Monte Carlo and their uses

Series
Stochastics Seminar
Time
Thursday, September 10, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://us02web.zoom.us/j/83378796301
Speaker
Pierre JacobHarvard University

Markov chain Monte Carlo (MCMC) methods are state-of-the-art techniques for numerical integration. MCMC methods yield estimators that converge to integrals of interest in the limit of the number of iterations, obtained from Markov chains that converge to stationarity. This iterative asymptotic justification is not ideal. Indeed the literature offers little practical guidance on how many iterations should be performed, despite decades of research on the topic. This talk will describe a computational approach to address some of these issues. The key idea, pioneered by Glynn and Rhee in 2014, is to generate couplings of Markov chains, whereby pairs of chains contract, coalesce or even "meet" after a random number of iterations; we will see that these meeting times, which can be simulated in many practical settings, contain useful information about the finite-time marginal distributions of the chains. This talk will provide an overview of this line of research, joint work with John O'Leary, Yves Atchadé and various collaborators.
The main reference is available here: https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12336

Eigenvectors' overlaps for integrable models of non-Hermitian random matrices

Series
Stochastics Seminar
Time
Thursday, April 2, 2020 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Guillaume Dubach

Right and left eigenvectors of non-Hermitian matrices form a bi-orthogonal system to which one can associate homogeneous quantities known as overlaps. The matrix of overlaps quantifies the stability of the spectrum and characterizes the joint eigenvalues increments under Dyson-type dynamics. Overlaps first appeared in the physics literature: Chalker and Mehlig calculated their conditional expectation for complex Ginibre matrices (1998). For the same model, we extend their results by deriving the distribution of the overlaps and their correlations (joint work with P. Bourgade). Similar results can be obtained for quaternionic Gaussian matrices, as well as matrices from the spherical and truncated-unitary ensembles.

Pages