Seminars and Colloquia by Series

How to Break the Curse of Dimensionality

Series
Applied and Computational Mathematics Seminar
Time
Monday, January 31, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Ming-Jun Lai, University of Georgia

We first review the problem of the curse of dimensionality when approximating multi-dimensional functions. Several approximation results from Barron, Petrushev, Bach, and others will be explained.

Then we present two approaches to break the curse of dimensionality: one is based on the probabilistic approach explained in Barron (1993), and the other is a deterministic approach using the Kolmogorov superposition theorem. As the Kolmogorov superposition theorem has been used to explain the approximation of neural network computation, I will use it to explain why the deep learning algorithm works for image classification.
In addition, I will introduce the neural network approximation based on higher order ReLU functions to explain the powerful approximation of multivariate functions using deep learning algorithms with multiple layers.
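
For reference, the two results named in the abstract are usually stated roughly as follows. This is the common textbook form, not necessarily the formulation used in the talk. The symbols $C_f$, $r$, $n$, $f_n$, $\Phi_q$ and $\phi_{q,p}$ are notation introduced here for illustration.

```latex
% Barron (1993): if C_f is finite, some one-hidden-layer sigmoidal network f_n
% with n units satisfies, for any probability measure \mu on the ball B_r of radius r,
\int_{B_r} \bigl(f(x) - f_n(x)\bigr)^2 \,\mu(dx) \;\le\; \frac{(2 r C_f)^2}{n},
\qquad
C_f = \int_{\mathbb{R}^d} |\omega|\,|\hat f(\omega)|\,d\omega .

% Kolmogorov superposition theorem: every continuous f on [0,1]^d can be written as
f(x_1,\dots,x_d) \;=\; \sum_{q=0}^{2d} \Phi_q\Bigl(\sum_{p=1}^{d} \phi_{q,p}(x_p)\Bigr),
% with continuous univariate functions \Phi_q and \phi_{q,p},
% where the inner functions \phi_{q,p} do not depend on f.
```

In the first bound the rate in $n$ carries no explicit dependence on the dimension $d$ (the constant $C_f$ may still grow with $d$). The second result reduces a $d$-variate function to sums and compositions of univariate ones.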

Non-Parametric Estimation of Manifolds from Noisy Data

Series
Applied and Computational Mathematics Seminar
Time
Monday, December 6, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Yariv Aizenbud, Yale University
A common task in many data-driven applications is to find a low dimensional manifold that describes the data accurately. Estimating a manifold from noisy samples has proven to be a challenging task. Indeed, even after decades of research, there is no (computationally tractable) algorithm that accurately estimates a manifold from noisy samples with a constant level of noise.

In this talk, we will present a method that estimates a manifold and its tangent in the ambient space. Moreover, we establish rigorous convergence rates, which are essentially as good as existing convergence rates for function estimation.

Model-free Feature Screening and FDR Control with Knockoff Features

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 29, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Yuan Ke, University of Georgia

This paper proposes a model-free and data-adaptive feature screening method for ultra-high dimensional data. The proposed method is based on the projection correlation, which measures the dependence between two random vectors. This projection correlation based method does not require specifying a regression model, and applies to data in the presence of heavy tails and multivariate responses. It enjoys both sure screening and rank consistency properties under weak assumptions. A two-step approach, with the help of knockoff features, is advocated to specify the threshold for feature screening such that the false discovery rate (FDR) is controlled under a pre-specified level. The proposed two-step approach enjoys both sure screening and FDR control simultaneously if the pre-specified FDR level is greater than or equal to 1/s, where s is the number of active features. The superior empirical performance of the proposed method is illustrated by simulation examples and real data applications. This is joint work with Wanjun Liu, Jingyuan Liu and Runze Li.

Local and Optimal Transport Perspectives on Uncertainty Quantification

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 22, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Dr. Amir Sagiv, Columbia

In many scientific areas, deterministic models (e.g., differential equations) use numerical parameters. In real-world settings, however, such parameters might be uncertain or noisy. A more comprehensive model should therefore provide a statistical description of the quantity of interest. Underlying this computational problem is a fundamental question: if two "similar" functions push forward the same measure, would the resulting measures be close, and if so, in what sense? We will first show how the probability density function (PDF) of the quantity of interest can be approximated, using spectral and local methods. We will then discuss the limitations of PDF approximation, and present an alternative viewpoint: through optimal transport theory, a Wasserstein-distance formulation of our problem yields a much simpler and widely applicable theory.
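
The question about push-forward measures can be made concrete with a small numerical sketch. It relies on the standard fact that, for a common input measure, the Wasserstein-1 distance between the push-forwards by f and g is at most the mean of |f - g|, because pairing f(x) with g(x) is one admissible coupling. The functions `f` and `g`, the uniform grid standing in for the parameter distribution, and the helper `wasserstein1_1d` are all assumptions made for this illustration and are not taken from the talk.

```python
import math


def wasserstein1_1d(u, v):
    """W1 distance between two equal-size empirical measures on the real line.

    In one dimension the optimal coupling matches sorted samples.
    """
    assert len(u) == len(v)
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def f(x):
    # Hypothetical "exact" model output.
    return math.sin(x)


def g(x):
    # Hypothetical approximation of f with a small oscillatory error.
    return math.sin(x) + 0.1 * math.cos(3.0 * x)


# Hypothetical parameter samples: a uniform grid on [0, 2*pi].
n = 2000
xs = [2.0 * math.pi * (i + 0.5) / n for i in range(n)]

fx = [f(x) for x in xs]
gx = [g(x) for x in xs]

w1 = wasserstein1_1d(fx, gx)                           # distance between push-forwards
l1_gap = sum(abs(a - b) for a, b in zip(fx, gx)) / n   # mean |f - g| under the samples

print("W1 between push-forwards:", w1)
print("mean |f - g|:            ", l1_gap)
```

The printed Wasserstein distance never exceeds the mean gap, so closeness of the two functions on average is already enough to control the distance between the output distributions in this sense.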

Data Compression in Distributed Learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 15, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Ming Yan, Michigan State University

Large-scale machine learning models are trained by parallel (stochastic) gradient descent algorithms on distributed systems. The communications for gradient aggregation and model synchronization become the major obstacles for efficient learning as the number of nodes and the model's dimension scale up. In this talk, I will introduce several ways to compress the transferred data and reduce the overall communication such that the obstacles can be immensely mitigated. More specifically, I will introduce methods to reduce or eliminate the compression error without additional communication.
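
As a rough illustration of how compression error can be handled without extra communication, the sketch below uses top-k sparsification with error feedback, a widely used generic technique. It is not claimed to be one of the methods presented in the talk. The class `ErrorFeedbackWorker`, the helper `top_k`, and the toy gradients are hypothetical.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


class ErrorFeedbackWorker:
    """Worker that stores what compression dropped and re-adds it next round."""

    def __init__(self, dim, k):
        self.k = k
        self.residual = [0.0] * dim  # local memory, never transmitted

    def compress(self, grad):
        corrected = [g + r for g, r in zip(grad, self.residual)]
        message = top_k(corrected, self.k)
        # Whatever was not sent is carried over to the next round.
        self.residual = [c - m for c, m in zip(corrected, message)]
        return message


# Toy gradients (hypothetical values) over three rounds.
grads = [
    [4.0, -1.0, 2.0, 0.0],
    [1.0, 3.0, -2.0, 1.0],
    [0.0, -2.0, 1.0, 5.0],
]

worker = ErrorFeedbackWorker(dim=4, k=1)
sent_total = [0.0] * 4
for grad in grads:
    msg = worker.compress(grad)
    sent_total = [s + m for s, m in zip(sent_total, msg)]

true_total = [sum(col) for col in zip(*grads)]
recovered = [s + r for s, r in zip(sent_total, worker.residual)]
print("sent so far:   ", sent_total)
print("still in store:", worker.residual)
print("true total:    ", true_total)
```

Only one coordinate is transmitted per round, yet the transmitted sum plus the locally stored residual always equals the true gradient sum. The compression error is delayed rather than lost.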

Generalization Bounds for Sparse Random Feature Expansions

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 8, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Giang Tran, University of Waterloo

Random feature methods have been successful in various machine learning tasks, are easy to compute, and come with theoretical accuracy bounds. They serve as an alternative approach to standard neural networks since they can represent similar function spaces without a costly training phase. However, for accuracy, random feature methods require more measurements than trainable parameters, limiting their use for data-scarce applications or problems in scientific machine learning. This paper introduces the sparse random feature expansion to obtain parsimonious random feature models. Specifically, we leverage ideas from compressive sensing to generate random feature expansions with theoretical guarantees even in the data-scarce setting. We provide generalization bounds for functions in a certain class (that is dense in a reproducing kernel Hilbert space) depending on the number of samples and the distribution of features. The generalization bounds improve with additional structural conditions, such as coordinate sparsity, compact clusters of the spectrum, or rapid spectral decay. We show that the sparse random feature expansions outperform shallow networks in several scientific machine learning tasks. Applications to signal decompositions for music data, astronomical data, and various complicated signals are also provided.

The Heavy-Tail Phenomenon in SGD

Series
Applied and Computational Mathematics Seminar
Time
Monday, October 18, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
Skiles 005 and https://bluejeans.com/457724603/4379
Speaker
Lingjiong Zhu, FSU

Please Note: The speaker will be in person, but there will also be a remote option https://bluejeans.com/457724603/4379

In recent years, various notions of capacity and complexity have been proposed for characterizing the generalization properties of stochastic gradient descent (SGD) in deep learning. Some of the popular notions that correlate well with the performance on unseen data are (i) the flatness of the local minimum found by SGD, which is related to the eigenvalues of the Hessian, (ii) the ratio of the stepsize to the batch-size, which essentially controls the magnitude of the stochastic gradient noise, and (iii) the tail-index, which measures the heaviness of the tails of the network weights at convergence. In this paper, we argue that these three seemingly unrelated perspectives for generalization are deeply linked to each other. We claim that depending on the structure of the Hessian of the loss at the minimum, and the choices of the algorithm parameters, the distribution of the SGD iterates will converge to a heavy-tailed stationary distribution. We rigorously prove this claim in the setting of quadratic optimization: we show that even in a simple linear regression problem with independent and identically distributed data whose distribution has finite moments of all orders, the iterates can be heavy-tailed with infinite variance. We further characterize the behavior of the tails with respect to algorithm parameters, the dimension, and the curvature. We then translate our results into insights about the behavior of SGD in deep learning. We support our theory with experiments conducted on synthetic data and on fully connected and convolutional neural networks. This is based on joint work with Mert Gurbuzbalaban and Umut Simsekli.
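
The infinite-variance claim can be checked by hand in the simplest case. The sketch assumes one-dimensional least squares with loss (a*x - b)^2 / 2, batch size 1 and standard Gaussian inputs a, a toy setting chosen here and not the general one analysed in the paper. The SGD step is then x <- (1 - eta*a^2) * x + eta*a*b, a random linear recursion. Standard results for such recursions say that, under mild conditions, a stationary distribution exists when E[log|1 - eta*a^2|] < 0, and its variance is finite only when E[(1 - eta*a^2)^2] < 1. The function names and the step sizes 0.5 and 0.8 are illustrative.

```python
import math


def second_moment_factor(eta):
    """E[(1 - eta*a^2)^2] for a ~ N(0, 1), using E[a^2] = 1 and E[a^4] = 3."""
    return 1.0 - 2.0 * eta + 3.0 * eta * eta


def mean_log_factor(eta, half_width=10.0, n=200_000):
    """Midpoint-rule estimate of E[log|1 - eta*a^2|] for a ~ N(0, 1).

    The integrand has integrable logarithmic singularities, which the
    midpoint grid is assumed not to hit exactly for the step sizes used here.
    """
    h = 2.0 * half_width / n
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n):
        a = -half_width + (i + 0.5) * h
        density = norm * math.exp(-0.5 * a * a)
        total += math.log(abs(1.0 - eta * a * a)) * density * h
    return total


for eta in (0.5, 0.8):
    print(
        "eta =", eta,
        "| E[log|M|] =", round(mean_log_factor(eta), 3),
        "| E[M^2] =", round(second_moment_factor(eta), 3),
    )
```

With eta = 0.5 both quantities indicate a stationary distribution with finite variance. With eta = 0.8 the mean log factor is still negative while the second-moment factor exceeds 1, so in this toy model the iterates settle into a stationary law whose variance is infinite, even though the data are Gaussian.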

High-Order Multirate Explicit Time-Stepping Schemes for the Baroclinic-Barotropic Split Dynamics in Primitive Equations

Series
Applied and Computational Mathematics Seminar
Time
Monday, October 4, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
online
Speaker
Lili Ju, University of South Carolina

To treat the multiple time scales of ocean dynamics in an efficient manner, the baroclinic-barotropic splitting technique has been widely used for solving the primitive equations for ocean modeling. In this paper, we propose second and third-order multirate explicit time-stepping schemes for such split systems based on the strong stability-preserving Runge-Kutta (SSPRK) framework. Our method allows for a large time step to be used for advancing the three-dimensional (slow) baroclinic mode and a small time step for the two-dimensional (fast) barotropic mode, so that each of the two mode solves need only satisfy its own CFL condition to maintain numerical stability. It is well known that the SSPRK method achieves high-order temporal accuracy by utilizing a convex combination of forward-Euler steps. At each time step of our method, the baroclinic velocity is first computed by using the SSPRK scheme to advance the baroclinic-barotropic system with the large time step, then the barotropic velocity is specially corrected by using the same SSPRK scheme with the small time step to advance the barotropic subsystem with a barotropic forcing interpolated based on values from the preceding baroclinic solves. Finally, the fluid thickness and the sea surface height perturbation are updated by coupling the predicted baroclinic and barotropic velocities. Two benchmark tests drawn from the MPAS-Ocean platform are used to numerically demonstrate the accuracy and parallel performance of the proposed schemes.
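
The remark that SSPRK methods are convex combinations of forward-Euler steps can be seen in the classical three-stage, third-order scheme (Shu-Osher form). The sketch below is the ordinary single-rate method applied to a scalar test equation, not the multirate baroclinic-barotropic scheme proposed in the talk. The test problem u' = -u and the helper names are assumptions.

```python
import math


def ssprk3_step(f, u, dt):
    """One SSPRK3 step; every stage is a convex combination of Euler steps."""
    u1 = u + dt * f(u)                                # forward-Euler step
    u2 = 0.75 * u + 0.25 * (u1 + dt * f(u1))          # weights 3/4 and 1/4
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * f(u2))    # weights 1/3 and 2/3


def integrate(f, u0, t_end, n_steps):
    """Advance u' = f(u) from t = 0 to t_end with n_steps equal steps."""
    u = u0
    dt = t_end / n_steps
    for _ in range(n_steps):
        u = ssprk3_step(f, u, dt)
    return u


def decay(u):
    # Test problem u' = -u with exact solution exp(-t).
    return -u


exact = math.exp(-1.0)
err_coarse = abs(integrate(decay, 1.0, 1.0, 10) - exact)
err_fine = abs(integrate(decay, 1.0, 1.0, 20) - exact)
ratio = err_coarse / err_fine

print("error with dt = 0.10:", err_coarse)
print("error with dt = 0.05:", err_fine)
print("error ratio:", ratio)
```

Halving the step size should reduce the error by a factor close to 8, consistent with third-order accuracy.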

The bluejeans link for the seminar is https://bluejeans.com/457724603/4379

Nonlinear model reduction for slow-fast stochastic systems near unknown invariant manifolds

Series
Applied and Computational Mathematics Seminar
Time
Monday, September 27, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Felix Ye, SUNY Albany

We introduce a nonlinear stochastic model reduction technique for high-dimensional stochastic dynamical systems that have a low-dimensional invariant effective manifold with slow dynamics, and high-dimensional, large fast modes. Given only access to a black box simulator from which short bursts of simulation can be obtained, we design an algorithm that outputs an estimate of the invariant manifold, a process of the effective stochastic dynamics on it, which has averaged out the fast modes, and a simulator thereof. This simulator is efficient in that it exploits the low dimension of the invariant manifold, and takes time steps of size dependent on the regularity of the effective process, and therefore typically much larger than that of the original simulator, which had to resolve the fast modes. The algorithm and the estimation can be performed on-the-fly, leading to efficient exploration of the effective state space, without losing consistency with the underlying dynamics. This construction enables fast and efficient simulation of paths of the effective dynamics, together with estimation of crucial features and observables of such dynamics, including the stationary distribution, identification of metastable states, and residence times and transition rates between them.

Inference, Computation, and Games

Series
Applied and Computational Mathematics Seminar
Time
Monday, September 20, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
Skiles 005 and https://bluejeans.com/457724603/4379
Speaker
Florian Schaefer, GT CSE

Please Note: Note the hybrid mode. The speaker will be in person in Skiles 005.

In this talk, we develop algorithms for numerical computation, based on ideas from competitive games and statistical inference.

In the first part, we propose competitive gradient descent (CGD) as a natural generalization of gradient descent to saddle point problems and general-sum games. Whereas gradient descent minimizes a local linear approximation at each step, CGD uses the Nash equilibrium of a local bilinear approximation. Explicitly accounting for agent interaction significantly improves the convergence properties, as demonstrated in applications to GANs, reinforcement learning, and computer graphics.
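
To give a feel for the difference, the sketch below compares simultaneous gradient descent-ascent with a CGD-style update on the scalar zero-sum game f(x, y) = alpha * x * y, where x minimizes and y maximizes. The CGD step is the closed-form Nash equilibrium of the regularized local bilinear game in this scalar case. The exact update form, the step size and the starting point are assumptions made for this illustration and are not code from the talk.

```python
def gda_step(x, y, eta, alpha):
    """Simultaneous gradient descent (in x) and ascent (in y) on f = alpha*x*y."""
    grad_x, grad_y = alpha * y, alpha * x
    return x - eta * grad_x, y + eta * grad_y


def cgd_step(x, y, eta, alpha):
    """CGD-style step for the scalar zero-sum game f = alpha*x*y.

    Each player best-responds to the other's update in the local bilinear
    model, which adds the mixed-derivative terms below.
    """
    grad_x, grad_y = alpha * y, alpha * x
    d_xy = alpha  # mixed second derivative of f
    denom = 1.0 + (eta * d_xy) ** 2
    dx = -eta * (grad_x + eta * d_xy * grad_y) / denom
    dy = eta * (grad_y - eta * d_xy * grad_x) / denom
    return x + dx, y + dy


eta, alpha, steps = 0.2, 1.0, 100

gx, gy = 1.0, 1.0
cx, cy = 1.0, 1.0
for _ in range(steps):
    gx, gy = gda_step(gx, gy, eta, alpha)
    cx, cy = cgd_step(cx, cy, eta, alpha)

gda_norm_sq = gx * gx + gy * gy
cgd_norm_sq = cx * cx + cy * cy
print("GDA squared distance to equilibrium:", gda_norm_sq)
print("CGD squared distance to equilibrium:", cgd_norm_sq)
```

On this game the gradient descent-ascent iterates spiral away from the equilibrium at the origin, while the CGD-style iterates contract toward it. In this scalar case each CGD step shrinks the squared distance by the factor 1 / (1 + (eta * alpha)^2).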

In the second part, we show that the conditional near-independence properties of smooth Gaussian processes imply the near-sparsity of Cholesky factors of their dense covariance matrices. We use this insight to derive simple, fast solvers with state-of-the-art complexity vs. accuracy guarantees for general elliptic differential and integral equations. Our methods come with rigorous error estimates, are easy to parallelize, and show good performance in practice.