Stein kernels, functional inequalities and applications in statistics

Thursday, April 6, 2023 - 15:30 for 1 hour (actually 50 minutes)
ONLINE via Zoom
Adrien Saumard, ENSAI and CREST

We will present the notion of Stein kernel, which generalizes the integration by parts formula, a.k.a. Stein's formula, for the normal distribution (which has a constant Stein kernel, equal to its covariance). We will first focus on dimension one, where under good conditions the Stein kernel has an explicit formula. We will see that the Stein kernel appears naturally as a weighting of a Poincaré type inequality and that it enables precise concentration inequalities, of the Mills' ratio type. In the second part, we will work in higher dimensions, using in particular Max Fathi's construction of a Stein kernel through the so-called "moment maps" transportation. This will allow us to describe the performance of some shrinkage and thresholding estimators, beyond the classical assumption of Gaussian (or spherical) data. This presentation is mostly based on joint work with Max Fathi, Larry Goldstein, Gesine Reinert and Jon Wellner.
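
For orientation, here is one standard way to write the one-dimensional objects mentioned in this abstract. It is a sketch under assumptions that are not in the announcement: $X$ is a centered real random variable with density $p$, and the symbols $\tau$, $p$ and $f$ are illustrative notation that may differ from the speaker's.

```latex
% Defining identity of a Stein kernel \tau for a centered X with density p
\mathbb{E}\left[ X f(X) \right] = \mathbb{E}\left[ \tau(X) f'(X) \right]
\quad \text{for all smooth } f,
\qquad
\tau(x) = \frac{1}{p(x)} \int_x^{\infty} y \, p(y) \, dy .

% Gaussian case X \sim N(0, \sigma^2): the kernel is constant,
% and the identity reduces to Stein's formula
\tau(x) \equiv \sigma^2 ,
\qquad
\mathbb{E}\left[ X f(X) \right] = \sigma^2 \, \mathbb{E}\left[ f'(X) \right] .
```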

The sample complexity of learning transport maps

Thursday, March 30, 2023 - 15:30 for 1 hour (actually 50 minutes)
Skiles 006
Philippe Rigollet, Massachusetts Institute of Technology

Optimal transport has recently found applications in a variety of fields ranging from graphics to biology. Underlying these applications is a new statistical paradigm where the goal is to couple multiple data sources. It gives rise to interesting new questions ranging from the design of estimators to minimax rates of convergence. I will review several applications where the central problem consists in estimating transport maps. After studying optimal transport as a potential solution, I will argue that its entropic version is a good alternative model. In particular, it completely escapes the curse of dimensionality that plagues statistical optimal transport.

Implicit estimation of high-dimensional distributions using generative models

Thursday, March 16, 2023 - 15:30 for 1 hour (actually 50 minutes)
Skiles 006
Yun Yang, University of Illinois Urbana-Champaign

The estimation of distributions of complex objects from high-dimensional data with low-dimensional structures is an important topic in statistics and machine learning. Deep generative models achieve this by encoding and decoding data to generate synthetic realistic images and texts. A key aspect of these models is the extraction of low-dimensional latent features, assuming data lies on a low-dimensional manifold. We study this by developing a minimax framework for distribution estimation on unknown submanifolds with smoothness assumptions on the target distribution and the manifold. The framework highlights how problem characteristics, such as intrinsic dimensionality and smoothness, impact the limits of high-dimensional distribution estimation. Our estimator, which is a mixture of locally fitted generative models, is motivated by differential geometry techniques and covers cases where the data manifold lacks a global parametrization. 

Large-graph approximations for interacting particles on graphs and their applications

Thursday, March 2, 2023 - 15:30 for 1 hour (actually 50 minutes)
Wasiur KhudaBukhsh, University of Nottingham

In this talk, we will consider stochastic processes on (random) graphs. They arise naturally in epidemiology, statistical physics, computer science and engineering disciplines. In this set-up, the vertices are endowed with a local state (e.g., immunological status in case of an epidemic process, opinion about a social situation). The local state changes dynamically as the vertex interacts with its neighbours. The interaction rules and the graph structure depend on the application-specific context. We will discuss (non-equilibrium) approximation methods for those systems as the number of vertices grows large. In particular, we will discuss three different approximations in this talk: i) approximate lumpability of Markov processes based on local symmetries (local automorphisms) of the graph, ii) functional laws of large numbers in the form of ordinary and partial differential equations, and iii) functional central limit theorems in the form of Gaussian semi-martingales. We will also briefly discuss how those approximations could be used for practical purposes, such as parameter inference from real epidemic data (e.g., COVID-19 in Ohio), designing efficient simulation algorithms, etc.

Covariance Representations, Stein's Kernels and High Dimensional CLTs

Thursday, February 23, 2023 - 15:30 for 1 hour (actually 50 minutes)
Skiles 006
Christian Houdré, Georgia Tech

In this continuing joint work with Benjamin Arras, we explore connections between covariance representations and Stein's method. In particular, via Stein's kernels we obtain quantitative high-dimensional CLTs in 1-Wasserstein distance when the limiting Gaussian probability measure is anisotropic. The dependency on the parameters is completely explicit and the rates of convergence are sharp.

Estimation of smooth functionals in high-dimensional and infinite-dimensional models

Thursday, February 16, 2023 - 15:30 for 1 hour (actually 50 minutes)
Skiles 006
Vladimir Koltchinskii, Georgia Tech

The problem of estimation of smooth functionals of unknown parameters of statistical models will be discussed in the cases of high-dimensional log-concave location models (joint work with Martin Wahl) and infinite-dimensional Gaussian models with unknown covariance operator. In both cases, the minimax optimal error rates have been obtained in the classes of Hölder smooth functionals with precise dependence on the sample size, the complexity of the parameter (its dimension in the case of log-concave location models or the effective rank of the covariance in the case of Gaussian models) and on the degree of smoothness of the functionals. These rates are attained for different types of estimators based on two different methods of bias reduction in functional estimation.

Large Dimensional Independent Component Analysis: Statistical Optimality and Computational Tractability

Thursday, November 17, 2022 - 15:30 for 1 hour (actually 50 minutes)
Skiles 006
Ming Yuan, Columbia University

Independent component analysis is a useful and general data analysis tool. It has found great successes in many applications. But in recent years, it has been observed that many popular approaches to ICA do not scale well with the number of components. This debacle has inspired a growing number of new proposals. But it remains unclear what the exact role of the number of components is on the information theoretical limits and computational complexity for ICA. Here I will describe our recent work to specifically address these questions and introduce a refined method of moments that is both computationally tractable and statistically optimal.

Breaking the curse of dimensionality for boundary value PDE in high dimensions

Thursday, November 10, 2022 - 15:30 for 1 hour (actually 50 minutes)
Ionel Popescu, University of Bucharest and Simion Stoilow Institute of Mathematics

I will show how to construct a numerical scheme for solutions to linear Dirichlet-Poisson boundary problems which does not suffer from the curse of dimensionality. In fact, we show that as the dimension increases, the complexity of this scheme increases only (low degree) polynomially with the dimension. The key is a subtle use of walk on spheres combined with a concentration inequality. As a byproduct we show that this result has a simple consequence in terms of neural networks for the approximation of the solution. This is joint work with Iulian Cimpean, Arghir Zarnescu, Lucian Beznea and Oana Lupascu.
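
For readers unfamiliar with walk on spheres, the basic idea can be sketched as follows. This is a textbook-style illustration, not the scheme from the talk. It assumes the simplest setting of a harmonic function on the unit ball (zero source term) with boundary data `g`. The function names, the stopping tolerance `eps` and the number of walks are all hypothetical choices.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Return a uniform point on the unit sphere in R^d (normalized Gaussian vector)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def walk_on_spheres(x0, g, n_walks=4000, eps=1e-3, seed=0):
    """Monte Carlo estimate of u(x0), where u is harmonic in the unit ball
    and u = g on the unit sphere (assumed setting: Laplace equation only)."""
    rng = random.Random(seed)
    d = len(x0)
    total = 0.0
    for _ in range(n_walks):
        x = list(x0)
        while True:
            norm = math.sqrt(sum(c * c for c in x))
            r = 1.0 - norm  # distance from x to the boundary of the unit ball
            if r < eps:
                break  # close enough to the boundary: stop the walk
            # Jump to a uniform point on the largest sphere centered at x
            # that is still contained in the ball.
            s = sample_unit_sphere(d, rng)
            x = [xi + r * si for xi, si in zip(x, s)]
        # Project the end point onto the boundary and read off the boundary data.
        total += g([c / norm for c in x])
    return total / n_walks


if __name__ == "__main__":
    # u(x) = x_1 is harmonic, so the estimate should be close to 0.3.
    print(walk_on_spheres([0.3, 0.0, 0.0, 0.0, 0.0], lambda y: y[0]))
```

Each walk only needs the distance to the boundary, so no mesh of the domain is built. This is the feature that makes methods of this kind attractive in high dimension.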

Fluctuation results for size of the vacant set for random walks on discrete torus

Thursday, November 3, 2022 - 15:30 for 1 hour (actually 50 minutes)
Skiles 006
Daesung Kim, Georgia Tech

We consider a random walk on the $d$-dimensional discrete torus, $d\ge 3$, starting from vertices chosen independently and uniformly at random. In this talk, we discuss the fluctuation behavior of the size of the range of the random walk trajectories at a time proportional to the size of the torus. The proof relies on a refined analysis of tail estimates for hitting times. We also discuss related results and open problems. This is based on joint work with Partha Dey.

Ballistic Annihilation

Thursday, October 27, 2022 - 15:30 for 1 hour (actually 50 minutes)
Skiles 006
Matthew Junge, Baruch College, CUNY

In the late 20th century, statistical physicists introduced a chemical reaction model called ballistic annihilation. In it, particles are placed randomly throughout the real line and then proceed to move at independently sampled velocities. Collisions result in mutual annihilation. Many results were inferred by physicists, but it wasn’t until recently that mathematicians joined in. I will describe my trajectory through this model. Expect tantalizing open questions.
