Seminars and Colloquia by Series

Optimal transport map estimation in general function spaces

Series
Stochastics Seminar
Time
Thursday, March 14, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Jonathan Niles-Weed, New York University

We present a unified methodology for obtaining rates of estimation of optimal transport maps in general function spaces. Our assumptions are significantly weaker than those appearing in the literature: we require only that the source measure P satisfy a Poincaré inequality and that the optimal map be the gradient of a smooth convex function that lies in a space whose metric entropy can be controlled. As a special case, we recover known estimation rates for Hölder transport maps, but also obtain nearly sharp results in many settings not covered by prior work. For example, we provide the first statistical rates of estimation when P is the normal distribution, between log-smooth and strongly log-concave distributions, and when the transport map is given by an infinite-width shallow neural network. (Joint work with Vincent Divol and Aram-Alexandre Pooladian.)
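A minimal sketch of the one-dimensional case may help fix ideas: in 1-D the optimal transport map is the monotone rearrangement, so a plug-in estimator can simply match empirical quantiles. This toy example (a Gaussian shift, chosen for illustration and not one of the settings from the talk) is a sketch only:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Source P = N(0,1), target Q = N(2,1); the true optimal map is T(x) = x + 2.
x = rng.normal(0.0, 1.0, size=n)
y = rng.normal(2.0, 1.0, size=n)

# In one dimension the optimal map is the monotone rearrangement:
# the k-th smallest source sample is sent to the k-th smallest target sample.
xs, ys = np.sort(x), np.sort(y)

def t_hat(q):
    # Plug-in estimator: interpolate the empirical quantile coupling.
    return np.interp(q, xs, ys)

grid = np.linspace(-2.0, 2.0, 50)
mono = np.all(np.diff(t_hat(grid)) >= 0)        # estimated map is monotone
err = np.mean((t_hat(grid) - (grid + 2.0)) ** 2)  # squared error vs. true map
```

With n = 2000 samples the estimated map tracks the true shift closely away from the tails; the general theory in the talk quantifies such rates in far richer function classes.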


Large deviations for the top eigenvalue of deformed random matrices

Series
Stochastics Seminar
Time
Wednesday, March 6, 2024 - 13:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Benjamin McKenna, Harvard University

In recent years, the few classical results in large deviations for random matrices have been complemented by a variety of new ones, in both the math and physics literatures, whose proofs leverage connections with Harish-Chandra/Itzykson/Zuber integrals. We present one such result, focusing on extreme eigenvalues of deformed sample-covariance and Wigner random matrices. This confirms recent formulas of Maillard (2020) in the physics literature, precisely locating a transition point whose analogue in non-deformed models is not yet fully understood. Joint work with Jonathan Husson.
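As a numerical illustration of the objects involved (a toy sketch of the model, not the large-deviations analysis of the talk), one can watch the top eigenvalue of a rank-one deformed Wigner matrix separate from the bulk and land near the classical theta + 1/theta location once the deformation strength theta exceeds 1:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 400, 3.0
# GOE-normalized Wigner matrix: bulk spectrum fills [-2, 2] for large n.
G = rng.normal(size=(n, n))
W = (G + G.T) / np.sqrt(2.0 * n)

# Rank-one deformation of strength theta along a fixed unit vector.
v = np.ones(n) / np.sqrt(n)
M = W + theta * np.outer(v, v)

top = np.linalg.eigvalsh(M)[-1]
# Classical (BBP-type) prediction for theta > 1: top eigenvalue near theta + 1/theta.
pred = theta + 1.0 / theta
```

The large deviations results of the talk describe the exponentially rare fluctuations of this top eigenvalue away from its typical location.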

Load Balancing under Data Locality: Extending Mean-Field Framework to Constrained Large-Scale Systems

Series
Stochastics Seminar
Time
Thursday, February 29, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Debankur Mukherjee, Georgia Tech

Large-scale parallel-processing infrastructures such as data centers and cloud networks form the cornerstone of the modern digital environment. Central to their efficiency are resource management policies, especially load balancing algorithms (LBAs), which are crucial for meeting stringent task delay requirements. A contemporary challenge in designing LBAs for today's data centers is navigating data locality constraints that dictate which tasks can be assigned to which servers. These constraints can be naturally modeled as a bipartite graph between servers and task types. Most LBA heuristics lean on the accuracy of the mean-field approximation. However, the non-exchangeability among servers induced by data locality invalidates this mean-field framework, causing real-world system behavior to diverge significantly from theoretical predictions. From a foundational standpoint, advancing our understanding in this domain demands the study of stochastic processes on large graphs, and thus fundamental advances in classical analytical tools.

In this presentation, we will delve into recent advancements made in extending the accuracy of mean-field approximation for a broad class of graphs. In particular, we will talk about how to design resource-efficient, asymptotically optimal data locality constraints and how the system behavior changes fundamentally, depending on whether the above bipartite graph is an expander, a spatial graph, or is inhomogeneous in nature.
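To make the setup concrete, here is a toy simulation (server counts, compatibility sets, and the power-of-d rule are illustrative choices, not the talk's model) of a load balancer that must respect a bipartite compatibility graph between task types and servers:

```python
import random

random.seed(1)
n_servers, n_types, d = 10, 5, 2
# Data locality: each task type may only run on a small compatible set of
# servers -- the bipartite graph between task types and servers.
compat = {t: random.sample(range(n_servers), 4) for t in range(n_types)}
queues = [0] * n_servers

def assign(task_type):
    # Power-of-d within the compatible set: sample d compatible servers
    # and join the shortest queue among them (a JSQ(d)-style heuristic).
    choices = random.sample(compat[task_type], d)
    srv = min(choices, key=lambda s: queues[s])
    queues[srv] += 1
    return srv

placements = []
for _ in range(200):
    t = random.randrange(n_types)
    placements.append((t, assign(t)))
```

Because different servers see different compatible task types, the queue-length processes are no longer exchangeable, which is precisely what breaks the classical mean-field analysis discussed in the talk.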

Gradient flows for empirical Bayes in high-dimensional linear models

Series
Stochastics Seminar
Time
Thursday, February 15, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Zhou Fan, Yale University

Empirical Bayes provides a powerful approach to learning and adapting to latent structure in data. Theory and algorithms for empirical Bayes have a rich literature for sequence models, but are less understood in settings where latent variables and data interact through more complex designs.

In this work, we study empirical Bayes estimation of an i.i.d. prior in Bayesian linear models, via the nonparametric maximum likelihood estimator (NPMLE). We introduce and study a system of gradient flow equations for optimizing the marginal log-likelihood, jointly over the prior and posterior measures in its Gibbs variational representation, using a smoothed reparametrization of the regression coefficients. A diffusion-based implementation yields a Langevin dynamics MCEM algorithm, where the prior law evolves continuously over time to optimize a sequence-model log-likelihood defined by the coordinates of the current Langevin iterate.
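The Langevin component can be illustrated with a minimal sketch: unadjusted Langevin dynamics targeting a one-dimensional Gaussian posterior (a toy stand-in for the high-dimensional, possibly non-log-concave posteriors of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
# Unadjusted Langevin dynamics targeting p = N(2, 1):
#   x_{k+1} = x_k + h * grad log p(x_k) + sqrt(2h) * xi_k,  xi_k ~ N(0, 1).
h, steps, chains = 0.05, 500, 1000
x = np.zeros(chains)  # many independent chains run in parallel
for _ in range(steps):
    grad_log_p = -(x - 2.0)  # score of N(2, 1)
    x = x + h * grad_log_p + np.sqrt(2.0 * h) * rng.normal(size=chains)

post_mean = x.mean()  # should be near the posterior mean 2
```

In the MCEM algorithm of the talk, such Langevin iterates additionally feed back into a continuously evolving prior, and the log-Sobolev inequality controls how fast the chains mix.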

We show consistency of the NPMLE under mild conditions, including settings of random sub-Gaussian designs under high-dimensional asymptotics. In high noise, we prove a uniform log-Sobolev inequality for the mixing of Langevin dynamics, for possibly misspecified priors and non-log-concave posteriors. We then establish polynomial-time convergence of the joint gradient flow to a near-NPMLE if the marginal negative log-likelihood is convex in a sub-level set of the initialization.

This is joint work with Leying Guan, Yandi Shen, and Yihong Wu.

Permutation limits

Series
Stochastics Seminar
Time
Thursday, November 30, 2023 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Sumit Mukherjee, Columbia University

Permutation limit theory arises by viewing a permutation as a probability measure on the unit square. Using the theory of permutation limits (permutons), we can compute limiting properties of various permutation statistics for random permutations, such as the number of fixed points, the number of small cycles, pattern counts, and the degree distribution of permutation graphs. We can also derive large deviation principles (LDPs) for random permutations. Our results apply to many non-uniform distributions on permutations, including the celebrated Mallows model and mu-random permutations. This is based on joint work with Jacopo Borga, Sayan Das and Peter Winkler.
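A quick simulation illustrates the first of these statistics: the number of fixed points of a uniform random permutation is asymptotically Poisson(1), and its mean is exactly 1 for every n. This is a sketch of the statistic only, not of the permuton machinery:

```python
import random

random.seed(0)

def fixed_points(n):
    # Sample a uniform permutation of {0, ..., n-1} and count fixed points.
    p = list(range(n))
    random.shuffle(p)
    return sum(p[i] == i for i in range(n))

# Average over many samples: the expected number of fixed points is 1.
trials = 5000
mean_fp = sum(fixed_points(50) for _ in range(trials)) / trials
```

Permuton theory recovers this kind of limit, and much finer ones (cycle counts, pattern densities), for non-uniform models such as Mallows as well.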

Estimation and Inference in Tensor Mixed-Membership Blockmodels

Series
Stochastics Seminar
Time
Thursday, November 2, 2023 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Joshua Agterberg, University of Pennsylvania

Higher-order multiway data is ubiquitous in machine learning and statistics and often exhibits community-like structures, where each component (node) along each different mode has a community membership associated with it. In this talk we propose the tensor mixed-membership blockmodel, a generalization of the tensor blockmodel positing that memberships need not be discrete, but instead are convex combinations of latent communities. We first study the problem of estimating community memberships, and we show that a tensor generalization of a matrix algorithm can consistently estimate communities at a rate that improves relative to the matrix setting, provided one takes the tensor structure into account. Next, we study the problem of testing whether two nodes have the same community memberships, and we show that a tensor analogue of a matrix test statistic can yield consistent testing with a tighter local power guarantee relative to the matrix setting. If time permits we will also examine the performance of our estimation procedure on flight data. This talk is based on two recent works with Anru Zhang.
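As a toy sketch of the kind of tensor spectral method involved (plain HOSVD on an exactly low-multilinear-rank tensor; the talk's algorithm, model, and noise analysis differ), one can recover a planted multilinear factorization from the singular subspaces of the tensor's unfoldings:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 12, 2
# Exactly low-multilinear-rank tensor T = S x1 A x2 B x3 C,
# a noiseless stand-in for a tensor blockmodel mean.
A, B, C = (rng.normal(size=(n, r)) for _ in range(3))
S = rng.normal(size=(r, r, r))
T = np.einsum('abc,ia,jb,kc->ijk', S, A, B, C)

def unfold(X, mode):
    # Mode-k matricization: mode-k fibers become columns.
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

# HOSVD: top-r left singular vectors of each unfolding span the factor spaces.
U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r] for m in range(3)]
core = np.einsum('ijk,ia,jb,kc->abc', T, U[0], U[1], U[2])
T_hat = np.einsum('abc,ia,jb,kc->ijk', core, U[0], U[1], U[2])
err = np.linalg.norm(T_hat - T) / np.linalg.norm(T)  # exact for exact rank
```

In the mixed-membership setting of the talk, the rows of the (noisily estimated) factor matrices are what encode each node's convex combination of latent communities.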

The clique chromatic number of sparse random graphs

Series
Stochastics Seminar
Time
Thursday, October 26, 2023 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Manuel Fernandez, Georgia Tech

The clique chromatic number of a graph is the smallest number of colors in a vertex coloring so that no inclusion-maximal clique is monochromatic (ignoring isolated vertices). For the binomial random graph G_{n,p} the clique chromatic number has been studied in a number of works since 2016, but for sparse edge-probabilities in the range n^{-2/5} \ll p \ll 1, even the order of magnitude remained a technical challenge.
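The definition can be made concrete with a brute-force computation on tiny graphs (an illustration of the parameter only; nothing here scales to G_{n,p}):

```python
from itertools import combinations, product

def maximal_cliques(n, edges):
    # Brute force over vertex subsets: fine for the tiny graphs below.
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    cliques = []
    for size in range(1, n + 1):
        for c in combinations(range(n), size):
            if all(v in adj[u] for u, v in combinations(c, 2)):
                cliques.append(set(c))
    # Keep only inclusion-maximal cliques.
    return [c for c in cliques if not any(c < d for d in cliques)]

def clique_chromatic(n, edges):
    # Smallest k so that some k-coloring leaves no maximal clique
    # (of size >= 2, i.e. ignoring isolated vertices) monochromatic.
    cliques = [c for c in maximal_cliques(n, edges) if len(c) >= 2]
    for k in range(1, n + 1):
        for col in product(range(k), repeat=n):
            if all(len({col[v] for v in c}) > 1 for c in cliques):
                return k
    return n

chi_k4 = clique_chromatic(4, list(combinations(range(4), 2)))          # K_4
chi_c5 = clique_chromatic(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)])  # C_5
```

For K_4 the only maximal clique is the whole vertex set, so two colors suffice, whereas for the 5-cycle the maximal cliques are the edges, and the requirement reduces to a proper coloring, which needs three colors.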

Resolving open problems of Alon and Krivelevich as well as Lichev, Mitsche and Warnke, we determine the clique chromatic number of the binomial random graph G_{n,p} in most of the missing regime: we show that it is of order (\log n)/p for edge-probabilities n^{-2/5+\eps} \ll p \ll n^{-1/3} and n^{-1/3+\eps} \ll p \ll 1, for any constant \eps > 0. Perhaps surprisingly for a result about random graphs, a key ingredient in the proof is an application of the probabilistic method (that hinges on careful counting and density arguments).

This talk is based on joint work with Lutz Warnke.

Finite-time Convergence Guarantees of Contractive Stochastic Approximation: Mean-Square and Tail Bounds

Series
Stochastics Seminar
Time
Thursday, October 5, 2023 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Siva Theja Maguluri, Georgia Tech

Motivated by applications in Reinforcement Learning (RL), this talk focuses on the Stochastic Approximation (SA) method for finding fixed points of a contractive operator. First proposed by Robbins and Monro, SA is a popular approach for solving fixed point equations when the information is corrupted by noise. We consider the SA algorithm for operators that are contractive under arbitrary norms (especially the l-infinity norm). We present finite-sample bounds on the mean-square error, established using a Lyapunov framework based on infimal convolution and the generalized Moreau envelope. We then present our more recent results on concentration of the tail error, even when the iterates are not bounded by a constant. These tail bounds are obtained using exponential supermartingales in conjunction with the Moreau envelope and a novel bootstrapping approach. Our results immediately imply state-of-the-art sample complexity results for a large class of RL algorithms.
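The basic SA iteration can be sketched in a scalar toy example (a contraction in absolute value with a known fixed point, far simpler than the l-infinity-contractive operators of the talk):

```python
import random

random.seed(0)

# Contractive operator H(x) = x/2 + 1 with contraction factor 1/2 and
# fixed point x* = 2 (since H(2) = 2). We only observe H(x) plus noise.
def H(x):
    return 0.5 * x + 1.0

x = 0.0
for k in range(20000):
    noisy = H(x) + random.gauss(0.0, 0.5)  # noisy evaluation of the operator
    a = 1.0 / (k + 1)                       # Robbins-Monro step sizes
    # SA update: move a small step toward the (noisy) operator value.
    x = x + a * (noisy - x)
```

The mean-square and tail bounds in the talk quantify, in finite time, how close such iterates are to the fixed point under much weaker norm assumptions.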

Bio: Siva Theja Maguluri is the Fouts Family Early Career Professor and Associate Professor in the H. Milton Stewart School of Industrial and Systems Engineering at Georgia Tech. He obtained his Ph.D. and MS in ECE as well as MS in Applied Math from UIUC, and B.Tech in Electrical Engineering from IIT Madras. His research interests span the areas of Control, Optimization, Algorithms and Applied Probability and include Reinforcement Learning theory and Stochastic Networks. His research and teaching are recognized through several awards including the Best Publication in Applied Probability award, NSF CAREER award, second place award at INFORMS JFIG best paper competition, Student best paper award at IFIP Performance, CTL/BP Junior Faculty Teaching Excellence Award, and Student Recognition of Excellence in Teaching: Class of 1934 CIOS Award.

Limit results for distributed estimation of invariant subspaces in multiple networks inference and PCA

Series
Stochastics Seminar
Time
Thursday, September 28, 2023 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Minh Tang, NC State

We study the problem of estimating the left and right singular subspaces for a collection of heterogeneous random graphs with a shared common structure. We analyze an algorithm that first estimates the orthogonal projection matrices corresponding to these subspaces for each individual graph, then computes the average of the projection matrices, and finally finds the matrices whose columns are the eigenvectors corresponding to the d largest eigenvalues of the sample averages. We show that the algorithm yields an estimate of the left and right singular vectors whose row-wise fluctuations are normally distributed around the rows of the true singular vectors. We then consider a two-sample hypothesis test for the null hypothesis that two graphs have the same edge probability matrices against the alternative hypothesis that their edge probability matrices are different. Using the limiting distributions for the singular subspaces, we present a test statistic whose limiting distribution converges to a central chi-square (resp. non-central chi-square) under the null (resp. alternative) hypothesis. Finally, we adapt the theoretical analysis for multiple networks to the setting of distributed PCA; in particular, we derive normal approximations for the rows of the estimated eigenvectors using distributed PCA when the data exhibit a spiked covariance matrix structure.
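The three-step algorithm (local projections, averaging, re-eigendecomposition) can be sketched on synthetic symmetric matrices sharing a common invariant subspace; the dimensions and noise model below are illustrative stand-ins for the heterogeneous graphs of the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 3, 50
# Shared d-dimensional invariant subspace across all m "graphs" / machines.
U = np.linalg.qr(rng.normal(size=(n, d)))[0]
P_true = U @ U.T

def local_projection():
    # Step 1: each unit observes P_true plus symmetric noise and returns
    # the projection onto the top-d eigenspace of its observation.
    noise = rng.normal(scale=0.02, size=(n, n))
    M = P_true + (noise + noise.T) / 2.0
    w, V = np.linalg.eigh(M)
    Vd = V[:, np.argsort(w)[-d:]]
    return Vd @ Vd.T

# Step 2: average the m local projection matrices.
P_bar = sum(local_projection() for _ in range(m)) / m

# Step 3: top-d eigenvectors of the average estimate the common subspace.
w, V = np.linalg.eigh(P_bar)
Vd = V[:, np.argsort(w)[-d:]]
err = np.linalg.norm(Vd @ Vd.T - P_true)
```

The limit theorems of the talk describe the row-wise Gaussian fluctuations of the resulting eigenvector estimates, which in turn drive the two-sample test statistic.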

Eigenvalues of fractional Brownian matrix process

Series
Stochastics Seminar
Time
Tuesday, September 26, 2023 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Victor Pérez-Abreu, CIMAT

This talk will present an overview of the behavior of the eigenvalues of the fractional Brownian matrix motion and other related matrix processes. We will do so by emphasizing the dynamics of the eigenvalue processes, the non-colliding property, the limit of the associated empirical process, as well as the free Brownian motion and the non-commutative fractional Brownian motion.
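A symmetric matrix Brownian motion (the Hurst parameter H = 1/2 member of the fractional family) can be simulated directly; the sketch below illustrates the non-colliding eigenvalues and the semicircle-scale edge near 2 at time 1, with discretization parameters chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, steps = 100, 50
dt = 1.0 / steps
# Symmetric Brownian motion matrix: accumulate independent symmetric
# Gaussian increments; at t = 1 this is a GOE-type matrix.
B = np.zeros((n, n))
traj = []
for _ in range(steps):
    G = rng.normal(scale=np.sqrt(dt), size=(n, n))
    B += (G + G.T) / np.sqrt(2.0)
    # Track the (sorted) eigenvalues of the normalized matrix over time.
    traj.append(np.sort(np.linalg.eigvalsh(B / np.sqrt(n))))

final = traj[-1]
# Almost surely the eigenvalue paths never collide, so the sorted
# eigenvalues at any fixed time are strictly increasing.
non_colliding = np.all(np.diff(final) > 0)
```

For general Hurst index the driving noise is fractional, the eigenvalue dynamics change, and, as discussed in the talk, the empirical spectral process has a different (non-commutative fractional Brownian) limit.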
