Seminars and Colloquia by Series

Optimal Ranking Recovery from Pairwise Comparisons

Series
Stochastics Seminar
Time
Thursday, April 1, 2021 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/129119189
Speaker
Anderson Y. Zhang, University of Pennsylvania

Ranking from pairwise comparisons is a central problem in a wide range of learning and social contexts, and researchers in various disciplines have made significant methodological and theoretical contributions to it. However, many fundamental statistical properties remain unclear, especially those concerning the recovery of the ranking structure. This talk presents two recent projects on optimal ranking recovery under the Bradley-Terry-Luce (BTL) model.
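
For readers new to the model, the BTL model (in its standard formulation; the notation below is not from the abstract) assigns each player i a skill parameter θ_i and posits that comparisons are independent with

\[ \mathbb{P}(i \text{ beats } j) \;=\; \frac{e^{\theta_i}}{e^{\theta_i} + e^{\theta_j}}, \]

so that recovering the ranking amounts to recovering the ordering of the θ_i.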

In the first project, we study the problem of top-k ranking, that is, of optimally identifying the set of top-k players. We derive the minimax rate and show that it can be achieved by the MLE. On the other hand, we show that another popular algorithm, the spectral method, is in general suboptimal.

In the second project, we study the problem of full ranking among all players. The minimax rate exhibits a transition between an exponential rate and a polynomial rate depending on the magnitude of the signal-to-noise ratio of the problem. To the best of our knowledge, this phenomenon is unique to full ranking and has not been seen in any other statistical estimation problem. A divide-and-conquer ranking algorithm is proposed to achieve the minimax rate.

Large Values of the Riemann Zeta Function in Small Intervals

Series
Stochastics Seminar
Time
Thursday, February 25, 2021 - 15:30 for 1 hour (actually 50 minutes)
Location
ONLINE
Speaker
Louis-Pierre Arguin, Baruch College, CUNY

I will give an account of recent progress in probability and in number theory toward understanding the large values of the Riemann zeta function in small intervals of the critical line. This problem has interesting connections with the extreme value statistics of i.i.d. and of log-correlated random variables.
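
For orientation, and as background rather than a claim from the talk, the conjecture of Fyodorov, Hiary and Keating predicts that for τ sampled uniformly from [T, 2T],

\[ \max_{|h| \le 1} \log \left| \zeta\!\left(\tfrac{1}{2} + i(\tau + h)\right) \right| \;=\; \log\log T \;-\; \tfrac{3}{4}\, \log\log\log T \;+\; O_P(1), \]

matching the maximum of a log-correlated field such as a branching random walk; i.i.d. variables of the same variance would instead give a subleading term of −(1/4) log log log T.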

Lower bounds for the estimation of principal components

Series
Stochastics Seminar
Time
Thursday, February 4, 2021 - 15:30 for 1 hour (actually 50 minutes)
Location
ONLINE
Speaker
Martin Wahl, Humboldt University of Berlin

This talk will be concerned with nonasymptotic lower bounds for the estimation of principal subspaces. I will start by reviewing some previous methods, including the local asymptotic minimax theorem and the Grassmann approach. Then I will present a new approach based on a van Trees inequality (i.e., a Bayesian version of the Cramér-Rao inequality) tailored to invariant statistical models. As applications, I will provide nonasymptotic lower bounds for principal component analysis and for the matrix denoising problem, two examples that are invariant with respect to the orthogonal group. These lower bounds are characterized by doubly substochastic matrices whose entries are bounded by the different Fisher information directions, confirming recent upper bounds in the context of the empirical covariance operator.
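
For context, the classical one-dimensional van Trees inequality states that for a smooth prior density π on the parameter space and any estimator \hat{\theta},

\[ \mathbb{E}_{\theta \sim \pi}\, \mathbb{E}_{\theta}\big(\hat{\theta} - \theta\big)^2 \;\ge\; \frac{1}{\int I(\theta)\,\pi(\theta)\,d\theta \;+\; I(\pi)}, \qquad I(\pi) = \int \frac{\pi'(\theta)^2}{\pi(\theta)}\,d\theta, \]

where I(θ) is the Fisher information of the model; the approach described above adapts this idea to statistical models with group invariance.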

Seminar link: https://bluejeans.com/129119189

The Bulk and the Extremes of Minimal Spanning Acycles and Persistence Diagrams of Random Complexes

Series
Stochastics Seminar
Time
Thursday, January 21, 2021 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/751242993/PASSWORD (To receive the password, please email Lutz Warnke)
Speaker
Sayan Mukherjee, Duke University

Frieze showed that the expected weight of the minimum spanning tree (MST) of the complete graph with i.i.d. uniform edge weights converges to ζ(3). Recently, this result was extended to a uniformly weighted simplicial complex, where the role of the MST is played by its higher-dimensional analogue -- the Minimum Spanning Acycle (MSA). In this work, we go beyond this and look at the histogram of the weights in this random MSA -- both in the bulk and in the extremes. In particular, we focus on the 'incomplete' setting, where one has access to only a fraction of the potential face weights. Our first result is that the empirical distribution of the MSA weights asymptotically converges to a measure based on the shadow -- the higher-dimensional analogue of the complement of graph components. As far as we know, this result is the first to explore the connection between the MSA weights and the shadow. Our second result is that the extremal weights converge to an inhomogeneous Poisson point process. An interesting consequence of our two results is that we can also describe the distribution of the death times in the persistence diagram corresponding to the above weighted complex, a result of interest in applied topology.
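
For concreteness, Frieze's theorem says that if the edges of the complete graph K_n carry i.i.d. Uniform(0,1) weights, then the total weight of the MST satisfies

\[ \lim_{n \to \infty} \mathbb{E}\big[w(\mathrm{MST}_n)\big] \;=\; \zeta(3) \;=\; \sum_{k=1}^{\infty} \frac{1}{k^3} \;\approx\; 1.202. \]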

Based on joint work with Nicolas Fraiman and Gugan Thoppe; see https://arxiv.org/abs/2012.14122.

A Lévy-driven process with matrix scaling exponent

Series
Stochastics Seminar
Time
Thursday, December 3, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/504188361
Speaker
B. Cooper Boniece, Washington University in St. Louis

In the past several decades, scale-invariant stochastic processes have been used in a wide range of applications, including internet traffic modeling and hydrology. However, compared to univariate scale invariance, far less attention has been paid to genuinely multivariate models that display the kind of scaling behavior the limit theory arguably suggests is most natural.
 
In this talk, I will introduce a new scale-invariant model called operator fractional Lévy motion and discuss some of its interesting features, as well as some aspects of wavelet-based estimation of its scaling exponents. This is based on joint work with Gustavo Didier (Tulane University), Herwig Wendt (CNRS, IRIT, Univ. of Toulouse) and Patrice Abry (CNRS, ENS-Lyon).
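
As background, the standard notion underlying such models is operator self-similarity: a multivariate process X is operator self-similar with matrix exponent H if, for every c > 0,

\[ \{X(ct)\}_{t \in \mathbb{R}} \;\stackrel{d}{=}\; \{c^{H} X(t)\}_{t \in \mathbb{R}}, \qquad c^{H} := \exp(H \log c) \;=\; \sum_{k=0}^{\infty} \frac{(\log c)^k}{k!} H^k, \]

so that different linear combinations of the coordinates may scale with different exponents, a feature invisible to coordinate-wise univariate modeling when H is not diagonal.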

New Classes of Multivariate Covariance Functions

Series
Stochastics Seminar
Time
Thursday, November 19, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://gatech.webex.com/gatech/j.php?MTID=mee147c52d7a4c0a5172f60998fee267a
Speaker
Tatiyana Apanasovich, George Washington University

The class referred to as the Cauchy family allows for the simultaneous modeling of long-memory dependence and of correlation at short and intermediate lags. We introduce a valid parametric family of cross-covariance functions for multivariate spatial random fields in which each component has a covariance function from the Cauchy family. We present the conditions on the parameter space that result in valid models with varying degrees of complexity. Practical implementations, including reparameterizations to reflect the conditions on the parameter space, will be discussed. We show results of various Monte Carlo simulation experiments that explore the performance of our approach in terms of estimation and cokriging. The application of the proposed multivariate Cauchy model is illustrated on a dataset from satellite oceanography.
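
For reference, a standard parametrization of the (generalized) Cauchy covariance in the univariate case (following Gneiting and Schlather; the abstract does not fix notation) is

\[ C(h) \;=\; \sigma^2 \left(1 + |h|^{\alpha}\right)^{-\beta/\alpha}, \qquad 0 < \alpha \le 2, \;\; \beta > 0, \]

where α governs the behavior at the origin (hence the fractal roughness of realizations) and β the rate of decay at large lags; the covariance is non-integrable, and the field therefore has long memory, whenever β does not exceed the dimension of the index space.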


When Do Neural Networks Outperform Kernel Methods?

Series
Stochastics Seminar
Time
Thursday, November 12, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/445382510
Speaker
Song Mei, UC Berkeley

For a certain scaling of the initialization of stochastic gradient descent (SGD), wide neural networks (NNs) have been shown to be well approximated by reproducing kernel Hilbert space (RKHS) methods. Recent empirical work showed that, for some classification tasks, RKHS methods can replace NNs without a large loss in performance. On the other hand, two-layer NNs are known to encode richer smoothness classes than RKHS methods, and we know of special examples for which SGD-trained NNs provably outperform RKHS methods. This remains true in the wide-network limit, under a different scaling of the initialization.
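
Concretely, the two initialization scalings in question are often written as follows (standard notation, not taken from the abstract): for a two-layer network with m hidden units,

\[ f_{\mathrm{NTK}}(x) \;=\; \frac{1}{\sqrt{m}} \sum_{i=1}^{m} a_i\, \sigma(\langle w_i, x\rangle) \qquad \text{versus} \qquad f_{\mathrm{MF}}(x) \;=\; \frac{1}{m} \sum_{i=1}^{m} a_i\, \sigma(\langle w_i, x\rangle), \]

where the 1/√m ("lazy", or neural tangent kernel) scaling yields the RKHS approximation, while the 1/m (mean-field) scaling permits genuine feature learning.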

How can we reconcile the above claims? For which tasks do NNs outperform RKHS methods? If feature vectors are nearly isotropic, RKHS methods suffer from the curse of dimensionality, while NNs can overcome it by learning the best low-dimensional representation. Here we show that this curse of dimensionality becomes milder if the feature vectors display the same low-dimensional structure as the target function, and we precisely characterize this tradeoff. Building on these results, we present a model that captures in a unified framework both behaviors observed in earlier work. We hypothesize that such a latent low-dimensional structure is present in image classification. We test this hypothesis numerically by showing that specific perturbations of the training distribution degrade the performance of RKHS methods much more significantly than that of NNs.

Bias-Variance Tradeoffs in Joint Spectral Embeddings

Series
Stochastics Seminar
Time
Thursday, November 5, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/974631214
Speaker
Daniel Sussman, Boston University

We consider the ramifications of utilizing biased latent position estimates in subsequent statistical analysis in exchange for sizable variance reductions in finite networks. We establish an explicit bias-variance tradeoff for latent position estimates produced by the omnibus embedding in the presence of heterogeneous network data. We reveal an analytic bias expression, derive a uniform concentration bound on the residual term, and prove a central limit theorem characterizing the distributional properties of these estimates.
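
For two graphs on a shared vertex set with adjacency matrices A^(1) and A^(2), the omnibus embedding (in the construction of Levin et al., which appears to be the one studied here) applies adjacency spectral embedding to the block matrix

\[ M \;=\; \begin{pmatrix} A^{(1)} & \tfrac{1}{2}\big(A^{(1)} + A^{(2)}\big) \\ \tfrac{1}{2}\big(A^{(1)} + A^{(2)}\big) & A^{(2)} \end{pmatrix}; \]

the averaging in the off-diagonal blocks is what reduces variance at the cost of bias when the two networks are heterogeneous.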


Higher-order fluctuations in dense random graph models (note the unusual time: 5pm)

Series
Stochastics Seminar
Time
Thursday, October 22, 2020 - 17:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/751242993/PASSWORD (To receive the password, please email Lutz Warnke)
Speaker
Adrian Roellin, National University of Singapore

Dense graph limit theory is essentially a first-order limit theory, analogous to the classical Law of Large Numbers. Is there a corresponding central limit theorem? We believe so. Using the language of Gaussian Hilbert spaces and the comprehensive theory of generalised U-statistics developed by Svante Janson in the 1990s, we identify a collection of Gaussian measures (also known as white noise processes) that describes the fluctuations of all orders of magnitude for a broad family of random graphs. We complement the theory with error bounds using a new variant of Stein's method for multivariate normal approximation, which also allows us to generalise Janson's theory in some important aspects. This is joint work with Gursharn Kaur.
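
As a reminder of the first-order theory alluded to here: a dense graph sequence converges when all homomorphism densities converge, the limits being given by a graphon W via

\[ t(F, W) \;=\; \int_{[0,1]^{|V(F)|}} \prod_{(i,j) \in E(F)} W(x_i, x_j) \prod_{i \in V(F)} dx_i, \]

and the fluctuation theory described above concerns how the empirical densities t(F, G_n) deviate from these limits at all orders of magnitude.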

Please note the unusual time: 5pm
