TBA by Andrew Campbell
- Series: Stochastics Seminar
- Time: Thursday, April 13, 2023 - 15:30 for 1 hour (actually 50 minutes)
- Location: Skiles 006
- Speaker: Andrew Campbell – University of Colorado
The estimation of distributions of complex objects from high-dimensional data with low-dimensional structures is an important topic in statistics and machine learning. Deep generative models achieve this by encoding and decoding data to generate synthetic realistic images and texts. A key aspect of these models is the extraction of low-dimensional latent features, assuming data lies on a low-dimensional manifold. We study this by developing a minimax framework for distribution estimation on unknown submanifolds with smoothness assumptions on the target distribution and the manifold. The framework highlights how problem characteristics, such as intrinsic dimensionality and smoothness, impact the limits of high-dimensional distribution estimation. Our estimator, which is a mixture of locally fitted generative models, is motivated by differential geometry techniques and covers cases where the data manifold lacks a global parametrization.
Zoom link to the talk: https://gatech.zoom.us/j/91558578481
In this talk, we will consider stochastic processes on (random) graphs. They arise naturally in epidemiology, statistical physics, computer science and engineering disciplines. In this set-up, the vertices are endowed with a local state (e.g., immunological status in the case of an epidemic process, or opinion about a social situation). The local state changes dynamically as the vertex interacts with its neighbours. The interaction rules and the graph structure depend on the application-specific context. We will discuss (non-equilibrium) approximation methods for those systems as the number of vertices grows large. In particular, we will discuss three different approximations in this talk: i) approximate lumpability of Markov processes based on local symmetries (local automorphisms) of the graph, ii) functional laws of large numbers in the form of ordinary and partial differential equations, and iii) functional central limit theorems in the form of Gaussian semi-martingales. We will also briefly discuss how those approximations can be used for practical purposes, such as parameter inference from real epidemic data (e.g., COVID-19 in Ohio) and the design of efficient simulation algorithms.
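As a toy illustration of the local-state dynamics described above (a minimal sketch of a discrete-time SIR epidemic on a graph, not the speaker's model; the function name and the rates `beta` and `gamma` are illustrative):

```python
import random

def sir_step(graph, state, beta=0.3, gamma=0.1, rng=random):
    """One discrete-time step of a toy SIR epidemic on a graph.

    graph: dict mapping vertex -> list of neighbours
    state: dict mapping vertex -> 'S' (susceptible), 'I' (infected),
           or 'R' (recovered)
    """
    new_state = dict(state)
    for v, s in state.items():
        if s == 'S':
            # each infected neighbour independently transmits with prob. beta
            for u in graph[v]:
                if state[u] == 'I' and rng.random() < beta:
                    new_state[v] = 'I'
                    break
        elif s == 'I' and rng.random() < gamma:
            # an infected vertex recovers with prob. gamma
            new_state[v] = 'R'
    return new_state

# tiny example: a 4-cycle with one initial infection
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
state = {0: 'I', 1: 'S', 2: 'S', 3: 'S'}
for _ in range(50):
    state = sir_step(graph, state)
```

The functional limit theorems mentioned in the abstract describe what happens to aggregate statistics of such dynamics (e.g., the fraction of infected vertices) as the graph grows.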
In this ongoing joint work with Benjamin Arras, we explore connections between covariance representations and Stein's method. In particular, via Stein kernels we obtain quantitative high-dimensional CLTs in the 1-Wasserstein distance when the limiting Gaussian probability measure is anisotropic. The dependence on the parameters is completely explicit and the rates of convergence are sharp.
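For background (a standard definition under one common normalization, not necessarily the authors' exact set-up): a matrix-valued map $\tau$ is a Stein kernel for a centered probability measure $\mu$ on $\mathbb{R}^d$ if, for all smooth test maps $f : \mathbb{R}^d \to \mathbb{R}^d$,

```latex
\mathbb{E}\left[\langle X, f(X)\rangle\right]
  = \mathbb{E}\left[\langle \tau(X), \nabla f(X)\rangle_{\mathrm{HS}}\right],
  \qquad X \sim \mu .
```

For the anisotropic Gaussian $N(0,\Sigma)$ one may take $\tau \equiv \Sigma$, which is why a discrepancy between $\tau$ and $\Sigma$ can be used to control the Wasserstein distance to the Gaussian limit.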
The problem of estimation of smooth functionals of unknown parameters of statistical models will be discussed in the cases of high-dimensional log-concave location models (joint work with Martin Wahl) and infinite dimensional Gaussian models with unknown covariance operator. In both cases, the minimax optimal error rates have been obtained in the classes of H\”older smooth functionals with precise dependence on the sample size, the complexity of the parameter (its dimension in the case of log-concave location models or the effective rank of the covariance in the case of Gaussian models) and on the degree of smoothness of the functionals. These rates are attained for different types of estimators based on two different methods of bias reduction in functional estimation.
Independent component analysis (ICA) is a useful and general data analysis tool that has found great success in many applications. In recent years, however, it has been observed that many popular approaches to ICA do not scale well with the number of components. This shortcoming has inspired a growing number of new proposals, but it remains unclear exactly what role the number of components plays in the information-theoretic limits and computational complexity of ICA. Here I will describe our recent work specifically addressing these questions, and introduce a refined method of moments that is both computationally tractable and statistically optimal.
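To make the moment-based viewpoint concrete, here is a minimal sketch of the classical kurtosis-based idea behind moment methods for ICA (my own toy demo, not the refined method of moments from the talk): two independent uniform sources are mixed by a rotation, and the unmixing angle is found by scanning for the projection with the most non-Gaussian (here, most negative) excess kurtosis.

```python
import math
import random

def kurtosis(xs):
    """Excess kurtosis (fourth standardized moment minus 3)."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    k4 = sum((x - m) ** 4 for x in xs) / len(xs)
    return k4 / v**2 - 3.0

# two independent uniform sources (excess kurtosis -1.2 each)
rng = random.Random(0)
n = 5000
s1 = [rng.uniform(-1, 1) for _ in range(n)]
s2 = [rng.uniform(-1, 1) for _ in range(n)]

# mix by a rotation of 30 degrees (an already-whitened mixing, for simplicity)
phi = math.radians(30)
x1 = [math.cos(phi) * a - math.sin(phi) * b for a, b in zip(s1, s2)]
x2 = [math.sin(phi) * a + math.cos(phi) * b for a, b in zip(s1, s2)]

# method of moments: scan candidate angles and keep the projection whose
# excess kurtosis is most negative (i.e., most non-Gaussian), which should
# recover the mixing angle of roughly 30 degrees
best = min(
    (math.radians(t) for t in range(0, 90)),
    key=lambda th: kurtosis([math.cos(th) * a + math.sin(th) * b
                             for a, b in zip(x1, x2)]),
)
```

A brute-force angle scan is only feasible in two dimensions; the scaling difficulties mentioned in the abstract concern precisely what replaces it when the number of components grows.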
Zoom link to the seminar: https://gatech.zoom.us/j/91330848866
I will show how to construct a numerical scheme for solutions to linear Dirichlet-Poisson boundary problems which does not suffer from the curse of dimensionality. In fact, we show that as the dimension increases, the complexity of this scheme grows only (low-degree) polynomially with the dimension. The key is a subtle use of the walk-on-spheres method combined with a concentration inequality. As a byproduct, we show that this result has a simple consequence in terms of neural networks for the approximation of the solution. This is joint work with Iulian Cimpean, Arghir Zarnescu, Lucian Beznea and Oana Lupascu.
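To make the walk-on-spheres idea concrete, here is a minimal two-dimensional sketch for the Laplace Dirichlet problem (my own illustrative code, not the authors' scheme, which concerns the high-dimensional case and its complexity): at each step the walker jumps to a uniform point on the largest circle around it contained in the domain, which matches the exit law of Brownian motion from that circle, and stops once it is within `eps` of the boundary.

```python
import math
import random

def walk_on_spheres(x, y, boundary_value, dist_to_boundary,
                    eps=1e-3, rng=random):
    """One walk-on-spheres sample for the 2-D Laplace Dirichlet problem:
    estimate u(x, y) where u is harmonic in the domain with boundary
    data `boundary_value`."""
    while True:
        r = dist_to_boundary(x, y)
        if r < eps:
            # close enough to the boundary: read off the boundary data
            return boundary_value(x, y)
        # jump to a uniform point on the circle of radius r around (x, y)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += r * math.cos(theta)
        y += r * math.sin(theta)

# example: unit disk with boundary data g(x, y) = x, whose harmonic
# extension is u(x, y) = x, so u(0.3, 0) = 0.3
def dist(x, y):
    return 1.0 - math.hypot(x, y)

def g(x, y):
    return x

random.seed(0)
est = sum(walk_on_spheres(0.3, 0.0, g, dist) for _ in range(2000)) / 2000
```

Each sample costs only O(log(1/eps)) jumps on average, and the per-step work scales mildly with the dimension; the concentration inequality in the talk is what controls how many samples are needed.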
We consider a random walk on the $d\ge 3$ dimensional discrete torus starting from vertices chosen independently and uniformly at random. In this talk, we discuss the fluctuation behavior of the size of the range of the random walk trajectories at a time proportional to the size of the torus. The proof relies on a refined analysis of tail estimates for hitting times. We also discuss related results and open problems. This is based on joint work with Partha Dey.
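A quick simulation sketch of the quantity in question (simplified to a single walk started from the origin, whereas the talk's setting has multiple independent uniform starting points):

```python
import random

def range_of_walk(n, steps, rng):
    """Size of the range (number of distinct visited vertices) of a
    simple random walk on the 3-D discrete torus (Z/nZ)^3."""
    pos = (0, 0, 0)
    visited = {pos}
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
             (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for _ in range(steps):
        d = rng.choice(moves)
        pos = tuple((p + q) % n for p, q in zip(pos, d))
        visited.add(pos)
    return len(visited)

rng = random.Random(1)
n = 10
# run for a time proportional to the size n^3 of the torus
r = range_of_walk(n, 2 * n**3, rng)
```

The talk concerns the second-order behavior of this quantity, i.e., the fluctuations of the range around its mean at this time scale.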