Seminars and Colloquia by Series

Estimation of trace functionals of covariance operators

Series
Stochastics Seminar
Time
Thursday, August 29, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Vladimir Koltchinskii, Georgia Tech

We will discuss the problem of estimating functionals of the form $\tau_f(\Sigma):= {\rm tr} (f(\Sigma))$ of the unknown covariance operator $\Sigma$ of a centered Gaussian random variable $X$ in a separable Hilbert space ${\mathbb H}$ based on i.i.d. observations $X_1,\dots, X_n$ of $X,$ where $f:{\mathbb R}\to {\mathbb R}$ is a given function. A naive plug-in estimator $\tau_f(\hat \Sigma_n)$ based on the sample covariance operator $\hat \Sigma_n$ has a large bias, and bias reduction methods are needed to construct estimators with better error rates. We develop estimators with reduced bias based on linear aggregation of several plug-in estimators with different sample sizes and obtain error bounds for such estimators with explicit dependence on the sample size $n,$ the effective rank ${\bf r}(\Sigma)= \frac{{\rm tr}(\Sigma)}{\|\Sigma\|}$ of the covariance operator $\Sigma,$ and the degree of smoothness of the function $f.$
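
The bias of the plug-in estimator can be seen in a small simulation. The sketch below is only an illustration under strong simplifying assumptions, not the estimator from the talk: it works in finite dimension with a diagonal $\Sigma$, fixes $f(x)=x^2$ (so that ${\rm tr}(f(A))$ is the sum of squared entries of a symmetric matrix $A$), and combines just two plug-in estimators, at sample sizes $n$ and $n/2$, with weights $2$ and $-1$. All function names are hypothetical.

```python
import random

def sample_covariance(xs):
    """Sample covariance of centered data: (1/n) * sum_i x_i x_i^T."""
    n, d = len(xs), len(xs[0])
    return [[sum(x[j] * x[k] for x in xs) / n for k in range(d)]
            for j in range(d)]

def trace_of_square(a):
    """tr(A^2) for a symmetric matrix A, i.e. the sum of its squared entries."""
    return sum(v * v for row in a for v in row)

def effective_rank_diag(variances):
    """r(Sigma) = tr(Sigma) / ||Sigma|| for a diagonal Sigma,
    whose operator norm is its largest diagonal entry."""
    return sum(variances) / max(variances)

def plug_in(xs):
    """Plug-in estimate of tr(Sigma^2) from the sample covariance."""
    return trace_of_square(sample_covariance(xs))

def simulate(variances, n, reps, seed=0):
    """Average the plug-in estimate and a two-term linear aggregate
    over `reps` independent centered Gaussian samples of size n."""
    rng = random.Random(seed)
    plug_sum = agg_sum = 0.0
    for _ in range(reps):
        xs = [[rng.gauss(0.0, v ** 0.5) for v in variances] for _ in range(n)]
        full = plug_in(xs)            # sample size n
        half = plug_in(xs[: n // 2])  # sample size n/2
        plug_sum += full
        # For f(x) = x^2 and Gaussian data the plug-in bias is
        # (tr(Sigma^2) + tr(Sigma)^2) / n, so 2*full - half cancels it.
        agg_sum += 2.0 * full - half
    return plug_sum / reps, agg_sum / reps

variances = [1.0] * 10                    # Sigma = identity in dimension 10
truth = sum(v * v for v in variances)     # tr(Sigma^2)
plug_mean, agg_mean = simulate(variances, n=20, reps=200)
print("effective rank:", effective_rank_diag(variances))
print("truth:", truth, "plug-in mean:", plug_mean, "aggregate mean:", agg_mean)
```

In this toy case the bias grows with ${\rm tr}(\Sigma)^2 = \|\Sigma\|^2\,{\bf r}(\Sigma)^2$, which is one way to see why the effective rank enters the error bounds. For a general smooth $f$ the bias is not exactly proportional to $1/n$, so a two-term combination like this one would only remove the leading term.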

Asymptotic mutual information for quadratic estimation problems over compact groups

Series
Stochastics Seminar
Time
Thursday, August 22, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Timothy Wee, Georgia Tech

Motivated by applications to group synchronization and quadratic assignment on random data, we study a general problem of Bayesian inference of an unknown “signal” belonging to a high-dimensional compact group, given noisy pairwise observations of a featurization of this signal.


We establish a quantitative comparison between the signal-observation mutual information in any such problem with that in a simpler model with linear observations, using interpolation methods. For group synchronization, our result proves a replica formula for the asymptotic mutual information and Bayes-optimal mean-squared error. Via analyses of this replica formula, we show that the conjectural phase transition threshold for computationally-efficient weak recovery of the signal is determined by a classification of the real-irreducible components of the observed group representation(s), and we fully characterize the information-theoretic limits of estimation in the example of angular/phase synchronization over SO(2)/U(1). For quadratic assignment, we study observations given by a kernel matrix of pairwise similarities and a randomly permuted and noisy counterpart, and we show in a bounded signal-to-noise regime that the asymptotic mutual information coincides with that in a Bayesian spiked model with i.i.d. signal prior.


This is based on joint work with Kaylee Yang and Zhou Fan.

Max-sliced Wasserstein distances

Series
Stochastics Seminar
Time
Thursday, April 25, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
March Boedihardjo, Michigan State University

I will give essentially matching upper and lower bounds for the expected max-sliced 1-Wasserstein distance between a probability measure on a separable Hilbert space and its empirical distribution from n samples. A version of this result for Banach spaces will also be presented. From this, we will derive an upper bound for the expected max-sliced 2-Wasserstein distance between a symmetric probability measure on a Euclidean space and its symmetrized empirical distribution.
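
For readers unfamiliar with the quantity, the max-sliced 1-Wasserstein distance between two measures is the supremum, over unit directions, of the 1-Wasserstein distance between their one-dimensional projections. The sketch below is a rough numerical illustration in the plane, not part of the talk: it assumes two empirical measures with the same number of points and replaces the supremum by a maximum over a finite grid of directions, so it returns a lower bound on the true value. The function names and the test points are hypothetical.

```python
import math

def wasserstein1_1d(xs, ys):
    """W_1 between two equal-size empirical measures on the line:
    the mean absolute difference of the sorted samples."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def max_sliced_w1(ps, qs, num_directions=360):
    """Approximate max-sliced W_1 between two planar point clouds by
    scanning directions (cos phi, sin phi) for phi on a grid in [0, pi)."""
    best = 0.0
    for i in range(num_directions):
        phi = math.pi * i / num_directions
        c, s = math.cos(phi), math.sin(phi)
        proj_p = [c * x + s * y for x, y in ps]
        proj_q = [c * x + s * y for x, y in qs]
        best = max(best, wasserstein1_1d(proj_p, proj_q))
    return best

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]
shifted = [(x + 3.0, y + 4.0) for x, y in points]  # translate by (3, 4)
print(max_sliced_w1(points, shifted))
```

For a pure translation the projected distance in direction $\theta$ is $|\langle \theta, v\rangle|$, so the printed value should be close to, and no larger than, the length $5$ of the shift $v=(3,4)$.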

Branching Brownian motion and the road-field model

Series
Stochastics Seminar
Time
Thursday, April 18, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Nick Cook, Duke University

The Fisher-KPP equation was introduced in 1937 to model the spread of an advantageous gene through a spatially distributed population. Remarkably precise information on the traveling front has been obtained via a connection with branching Brownian motion, beginning with works of McKean and Bramson in the 1970s. I will discuss an extension of this probabilistic approach to the Road-Field Model: a reaction-diffusion PDE system introduced by H. Berestycki et al. to describe enhancement of biological invasions by a line of fast diffusion, such as a river or a road. Based on joint work with Amir Dembo.

 

From Ehrhard to Generalized Bobkov inequality, and more

Series
Stochastics Seminar
Time
Thursday, April 11, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Galyna Livshyts, Georgia Tech

We discuss a general scheme that allows one to realize certain geometric functional inequalities as statements about convexity of some functionals, and, inspired by the work of Bobkov and Ledoux, we obtain various interesting inequalities as their realizations. For example, we draw a link between Ehrhard’s inequality and an interesting extension of Bobkov’s inequality, and several new and more general inequalities are discussed as well. In this talk we discuss a joint project with Barthe, Cordero-Erausquin and Ivanisvili, and also briefly mention some results from a joint project with Cordero-Erausquin and Rotem.

Local vs Non-Local Poincaré Inequalities and Quantitative Exponential Concentration

Series
Stochastics Seminar
Time
Thursday, April 4, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Christian Houdré, Georgia Institute of Technology

Weighted Poincaré inequalities known for various laws, such as the exponential or Cauchy ones, are shown to follow from the "usual" Poincaré inequality involving the non-local gradient. The key ingredients in showing this are a covariance representation and Hardy's inequality.

The framework under study is quite general and comprises infinitely divisible laws as well as some log-concave ones.  This same covariance representation is then used to obtain quantitative concentration inequalities of exponential type, recovering in particular the Gaussian results.  

Joint work with Benjamin Arras.

Improving Predictions by Combining Models

Series
Stochastics Seminar
Time
Thursday, March 28, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Jason Klusowski, Princeton University

When performing regression analysis, researchers often face the challenge of selecting the best single model from a range of possibilities. Traditionally, this selection is based on criteria evaluating model goodness-of-fit and complexity, such as Akaike's AIC and Schwarz's BIC, or on the model's performance in predicting new data, assessed through cross-validation techniques. In this talk, I will show that a linear combination of a large number of these possible models can have better predictive accuracy than the best single model among them. Algorithms and theoretical guarantees will be discussed, which involve interesting connections to constrained optimization and shrinkage in statistics.
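
The basic phenomenon can be seen in a toy example. The sketch below is not the algorithm from the talk: it assumes just two hypothetical fitted models, one that overshoots and one that undershoots, and combines their predictions with unconstrained least-squares weights computed on the same data, with no constraint or shrinkage on the weights.

```python
def mse(pred, y):
    """Mean squared error of predictions against targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

def stack_two(pred_a, pred_b, y):
    """Least-squares weights (wa, wb) minimizing
    sum_i (wa * a_i + wb * b_i - y_i)^2, via the 2x2 normal equations."""
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * t for a, t in zip(pred_a, y))
    sby = sum(b * t for b, t in zip(pred_b, y))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("predictions are collinear; weights are not unique")
    wa = (say * sbb - sby * sab) / det
    wb = (sby * saa - say * sab) / det
    return wa, wb

# Hypothetical data: model A overshoots by 0.5, model B undershoots by 0.5.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_a = [t + 0.5 for t in y]
pred_b = [t - 0.5 for t in y]

wa, wb = stack_two(pred_a, pred_b, y)
combined = [wa * a + wb * b for a, b in zip(pred_a, pred_b)]
print("weights:", wa, wb)
print("MSE A:", mse(pred_a, y), "MSE B:", mse(pred_b, y),
      "MSE combined:", mse(combined, y))
```

Because each single model corresponds to the weights $(1,0)$ or $(0,1)$, the least-squares combination can never have larger in-sample error than the better of the two. Whether the gain carries over to new data is a separate question, and it is there that constraints and shrinkage on the weights become relevant.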

Optimal transport map estimation in general function spaces

Series
Stochastics Seminar
Time
Thursday, March 14, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Jonathan Niles-Weed, New York University

We present a unified methodology for obtaining rates of estimation of optimal transport maps in general function spaces. Our assumptions are significantly weaker than those appearing in the literature: we require only that the source measure P satisfy a Poincaré inequality and that the optimal map be the gradient of a smooth convex function that lies in a space whose metric entropy can be controlled. As a special case, we recover known estimation rates for Hölder transport maps, but also obtain nearly sharp results in many settings not covered by prior work. For example, we provide the first statistical rates of estimation when P is the normal distribution, between log-smooth and strongly log-concave distributions, and when the transport map is given by an infinite-width shallow neural network. (Joint work with Vincent Divol and Aram-Alexandre Pooladian.)

 

Large deviations for the top eigenvalue of deformed random matrices

Series
Stochastics Seminar
Time
Wednesday, March 6, 2024 - 13:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Benjamin McKenna, Harvard University

In recent years, the few classical results in large deviations for random matrices have been complemented by a variety of new ones, in both the math and physics literatures, whose proofs leverage connections with Harish-Chandra/Itzykson/Zuber integrals. We present one such result, focusing on extreme eigenvalues of deformed sample-covariance and Wigner random matrices. This confirms recent formulas of Maillard (2020) in the physics literature, precisely locating a transition point whose analogue in non-deformed models is not yet fully understood. Joint work with Jonathan Husson.

Load Balancing under Data Locality: Extending Mean-Field Framework to Constrained Large-Scale Systems

Series
Stochastics Seminar
Time
Thursday, February 29, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Debankur Mukherjee, Georgia Tech

Large-scale parallel-processing infrastructures such as data centers and cloud networks form the cornerstone of the modern digital environment. Central to their efficiency are resource management policies, especially load balancing algorithms (LBAs), which are crucial for meeting stringent delay requirements of tasks. A contemporary challenge in designing LBAs for today's data centers is navigating data locality constraints that dictate which tasks are assigned to which servers. These constraints can be naturally modeled as a bipartite graph between servers and various task types. Most LBA heuristics lean on the mean-field approximation's accuracy. However, the non-exchangeability among servers induced by the data locality invalidates this mean-field framework, causing real-world system behaviors to significantly diverge from theoretical predictions. From a foundational standpoint, advancing our understanding in this domain demands the study of stochastic processes on large graphs, thus needing fundamental advancements in classical analytical tools.

In this presentation, we will delve into recent advancements made in extending the accuracy of mean-field approximation for a broad class of graphs. In particular, we will talk about how to design resource-efficient, asymptotically optimal data locality constraints and how the system behavior changes fundamentally, depending on whether the above bipartite graph is an expander, a spatial graph, or is inhomogeneous in nature.
