Seminars and Colloquia by Series

Generalization Bounds for Sparse Random Feature Expansions

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 8, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Giang Tran, University of Waterloo

Random feature methods have been successful in various machine learning tasks, are easy to compute, and come with theoretical accuracy bounds. They serve as an alternative approach to standard neural networks since they can represent similar function spaces without a costly training phase. However, for accuracy, random feature methods require more measurements than trainable parameters, limiting their use for data-scarce applications or problems in scientific machine learning. This paper introduces the sparse random feature expansion to obtain parsimonious random feature models. Specifically, we leverage ideas from compressive sensing to generate random feature expansions with theoretical guarantees even in the data-scarce setting. We provide generalization bounds for functions in a certain class (that is dense in a reproducing kernel Hilbert space) depending on the number of samples and the distribution of features. The generalization bounds improve with additional structural conditions, such as coordinate sparsity, compact clusters of the spectrum, or rapid spectral decay. We show that the sparse random feature expansions outperform shallow networks in several scientific machine learning tasks. Applications to signal decompositions for music data, astronomical data, and various complicated signals are also provided.
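
As a rough illustration of the approach described above, the hedged sketch below fits a sparse random feature model with off-the-shelf tools: random Fourier features combined with an l1-penalized least-squares solve, in the spirit of compressive sensing. The toy target function, feature counts, and the use of scikit-learn's Lasso are illustrative assumptions, not the authors' exact algorithm.

```python
# Minimal sketch of a sparse random feature fit in the data-scarce regime (m << n_feat).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

def target(x):                              # toy low-complexity target function (assumed)
    return np.sin(3 * x[:, 0]) + 0.5 * np.cos(2 * x[:, 1])

d, n_feat, m = 5, 500, 60                   # dimension, #random features, #samples
X = rng.uniform(-1, 1, size=(m, d))
y = target(X)

omega = rng.normal(size=(n_feat, d))        # random frequencies
b = rng.uniform(0, 2 * np.pi, size=n_feat)  # random phases
A = np.cos(X @ omega.T + b)                 # m x n_feat random feature matrix

model = Lasso(alpha=1e-3, max_iter=50000).fit(A, y)   # l1 solve -> sparse coefficients
print("nonzero features:", np.count_nonzero(model.coef_), "of", n_feat)

X_test = rng.uniform(-1, 1, size=(1000, d))
pred = model.predict(np.cos(X_test @ omega.T + b))
print("relative test error:",
      np.linalg.norm(pred - target(X_test)) / np.linalg.norm(target(X_test)))
```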

The Heavy-Tail Phenomenon in SGD

Series
Applied and Computational Mathematics Seminar
Time
Monday, October 18, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
Skiles 005 and https://bluejeans.com/457724603/4379
Speaker
Lingjiong Zhu, FSU

Please Note: The speaker will be in person, but there will also be a remote option at https://bluejeans.com/457724603/4379

In recent years, various notions of capacity and complexity have been proposed for characterizing the generalization properties of stochastic gradient descent (SGD) in deep learning. Some of the popular notions that correlate well with the performance on unseen data are (i) the flatness of the local minimum found by SGD, which is related to the eigenvalues of the Hessian, (ii) the ratio of the stepsize to the batch-size, which essentially controls the magnitude of the stochastic gradient noise, and (iii) the tail-index, which measures the heaviness of the tails of the network weights at convergence. In this paper, we argue that these three seemingly unrelated perspectives for generalization are deeply linked to each other. We claim that depending on the structure of the Hessian of the loss at the minimum, and the choices of the algorithm parameters, the distribution of the SGD iterates will converge to a heavy-tailed stationary distribution. We rigorously prove this claim in the setting of quadratic optimization: we show that even in a simple linear regression problem with independent and identically distributed data whose distribution has finite moments of all orders, the iterates can be heavy-tailed with infinite variance. We further characterize the behavior of the tails with respect to algorithm parameters, the dimension, and the curvature. We then translate our results into insights about the behavior of SGD in deep learning. We support our theory with experiments conducted on synthetic data, fully connected, and convolutional neural networks. This is based on joint work with Mert Gurbuzbalaban and Umut Simsekli.
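
To make the heavy-tail claim concrete, here is a hedged toy simulation (not the paper's experiments): SGD with batch size one on one-dimensional linear regression with i.i.d. Gaussian data. The iterates obey a multiplicative (Kesten-type) recursion, so for a large enough stepsize-to-batch-size ratio the stationary distribution develops heavy tails; a crude Hill estimator gauges the tail index. All constants are illustrative assumptions.

```python
# SGD iterates on 1-d linear regression become heavy-tailed for large stepsizes.
import numpy as np

rng = np.random.default_rng(0)
eta, n_iter, n_runs = 0.7, 2000, 20000      # stepsize, iterations, parallel SGD runs
theta_star = 1.0                            # true regression coefficient

# batch size 1: theta <- theta - eta*(theta*x - y)*x, a Kesten-type recursion in theta
theta = np.zeros(n_runs)
for _ in range(n_iter):
    x = rng.normal(size=n_runs)
    y = theta_star * x + rng.normal(size=n_runs)
    theta -= eta * (theta * x - y) * x

# crude Hill estimator of the tail index from the k largest deviations
dev = np.sort(np.abs(theta - theta_star))
k = 500
alpha_hat = 1.0 / np.mean(np.log(dev[-k:] / dev[-k - 1]))
print("estimated tail index:", alpha_hat)   # values below 2 indicate infinite variance
```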

High-Order Multirate Explicit Time-Stepping Schemes for the Baroclinic-Barotropic Split Dynamics in Primitive Equations

Series
Applied and Computational Mathematics Seminar
Time
Monday, October 4, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
online
Speaker
Lili Ju, University of South Carolina

To treat the multiple time scales of ocean dynamics in an efficient manner, the baroclinic-barotropic splitting technique has been widely used for solving the primitive equations for ocean modeling. In this paper, we propose second and third-order multirate explicit time-stepping schemes for such split systems based on the strong stability-preserving Runge-Kutta (SSPRK) framework. Our method allows for a large time step to be used for advancing the three-dimensional (slow) baroclinic mode and a small time step for the two-dimensional (fast) barotropic mode, so that each of the two mode solves only needs to satisfy its respective CFL condition to maintain numerical stability. It is well known that the SSPRK method achieves high-order temporal accuracy by utilizing a convex combination of forward-Euler steps. At each time step of our method, the baroclinic velocity is first computed by using the SSPRK scheme to advance the baroclinic-barotropic system with the large time step, then the barotropic velocity is specially corrected by using the same SSPRK scheme with the small time step to advance the barotropic subsystem with a barotropic forcing interpolated based on values from the preceding baroclinic solves. Finally, the fluid thickness and the sea surface height perturbation are updated by coupling the predicted baroclinic and barotropic velocities. Two benchmark tests drawn from the "MPAS-Ocean" platform are used to numerically demonstrate the accuracy and parallel performance of the proposed schemes.

 

The bluejeans link for the seminar is https://bluejeans.com/457724603/4379
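
As a highly simplified, hedged sketch of the multirate idea (not the paper's baroclinic-barotropic scheme), the code below advances a slow variable with one large SSPRK3 step while a fast variable takes many small SSPRK3 substeps with the slow forcing held fixed; the paper instead interpolates the barotropic forcing from the preceding baroclinic solves. The linear fast-slow test system and all constants are assumptions for illustration.

```python
# Two-rate SSPRK3 time stepping for a linear fast-slow toy system.
import numpy as np

def ssprk3(y, rhs, dt):
    """One step of the three-stage strong-stability-preserving Runge-Kutta scheme."""
    y1 = y + dt * rhs(y)
    y2 = 0.75 * y + 0.25 * (y1 + dt * rhs(y1))
    return y / 3.0 + 2.0 / 3.0 * (y2 + dt * rhs(y2))

lam_slow, lam_fast, c = -1.0, -200.0, 0.5
f_slow = lambda u, v: lam_slow * u + c * v            # slow ("baroclinic-like") tendency
f_fast = lambda v, u: lam_fast * v + c * u            # fast ("barotropic-like") tendency

u, v = 1.0, 0.5
dt, M, T = 0.01, 20, 1.0                              # large step, substeps per step, final time
for _ in range(int(T / dt)):
    u_new = ssprk3(u, lambda w: f_slow(w, v), dt)     # slow mode: one large SSPRK3 step
    for _ in range(M):                                # fast mode: M small SSPRK3 substeps,
        v = ssprk3(v, lambda w: f_fast(w, u), dt / M) # slow forcing held fixed over the step
    u = u_new
print("u(T), v(T) =", u, v)
```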

Nonlinear model reduction for slow-fast stochastic systems near unknown invariant manifolds

Series
Applied and Computational Mathematics Seminar
Time
Monday, September 27, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Felix Ye, SUNY Albany

We introduce a nonlinear stochastic model reduction technique for high-dimensional stochastic dynamical systems that have a low-dimensional invariant effective manifold with slow dynamics, and high-dimensional, large fast modes. Given only access to a black box simulator from which short bursts of simulation can be obtained, we design an algorithm that outputs an estimate of the invariant manifold, a process of the effective stochastic dynamics on it, which has averaged out the fast modes, and a simulator thereof. This simulator is efficient in that it exploits the low dimension of the invariant manifold, and takes time steps of size dependent on the regularity of the effective process, and therefore typically much larger than that of the original simulator, which had to resolve the fast modes. The algorithm and the estimation can be performed on-the-fly, leading to efficient exploration of the effective state space, without losing consistency with the underlying dynamics. This construction enables fast and efficient simulation of paths of the effective dynamics, together with estimation of crucial features and observables of such dynamics, including the stationary distribution, identification of metastable states, and residence times and transition rates between them. 
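
A conceptual, hedged sketch of the burst-based idea (not the authors' full algorithm): from a base point we launch many short bursts of a black-box slow-fast SDE simulator, use PCA of the burst endpoints to identify the local slow direction, and read off an effective drift and diffusion along it. The toy double-well simulator and every constant below are assumptions.

```python
# Burst-based local estimation of the slow direction and effective dynamics.
import numpy as np

rng = np.random.default_rng(1)
eps, sigma_slow = 1e-3, 0.3

def simulate_burst(z0, t_burst, dt=1e-4):
    """Black-box Euler-Maruyama simulator: slow double-well x, fast y relaxing to x."""
    x, y = z0
    for _ in range(int(t_burst / dt)):
        x += -x * (x**2 - 1) * dt + sigma_slow * np.sqrt(dt) * rng.normal()
        y += -(y - x) / eps * dt + 0.05 * np.sqrt(dt) * rng.normal()
    return np.array([x, y])

z0, t_burst, n_bursts = np.array([0.5, 0.5]), 0.05, 400
ends = np.array([simulate_burst(z0, t_burst) for _ in range(n_bursts)])

# PCA of the burst endpoints: the dominant direction approximates the local slow coordinate
_, _, Vt = np.linalg.svd(ends - ends.mean(axis=0), full_matrices=False)
slow_dir = Vt[0]

proj = (ends - z0) @ slow_dir
print("estimated slow direction:", slow_dir)
print("effective drift:", proj.mean() / t_burst, " effective diffusion:", proj.var() / t_burst)
```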

Inference, Computation, and Games

Series
Applied and Computational Mathematics Seminar
Time
Monday, September 20, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
Skiles 005 and https://bluejeans.com/457724603/4379
Speaker
Florian Schaefer, GT CSE

Please Note: Note the hybrid mode. The speaker will be in person in Skiles 005.

In this talk, we develop algorithms for numerical computation, based on ideas from competitive games and statistical inference. 

 

In the first part, we propose competitive gradient descent (CGD) as a natural generalization of gradient descent to saddle point problems and general sum games. Whereas gradient descent minimizes a local linear approximation at each step, CGD uses the Nash equilibrium of a local bilinear approximation. Explicitly accounting for agent-interaction significantly improves the convergence properties, as demonstrated in applications to GANs, reinforcement learning, and computer graphics.
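
For concreteness, here is a hedged numerical sketch of the CGD update on a zero-sum bilinear game f(x, y) = x^T A y, where x minimizes and y maximizes: each player's step is the Nash equilibrium of the locally regularized bilinear game, available in closed form for this toy problem. The problem size, step size, and random matrix A are illustrative assumptions.

```python
# CGD on a zero-sum bilinear game; the iterates contract toward the equilibrium (0, 0).
import numpy as np

rng = np.random.default_rng(0)
n, eta, steps = 4, 0.2, 200
A = rng.normal(size=(n, n))
x, y = rng.normal(size=n), rng.normal(size=n)
I = np.eye(n)

for _ in range(steps):
    gx, gy = A @ y, A.T @ x                    # gradients of f with respect to x and y
    dx = -eta * np.linalg.solve(I + eta**2 * A @ A.T, gx + eta * A @ gy)
    dy = +eta * np.linalg.solve(I + eta**2 * A.T @ A, gy - eta * A.T @ gx)
    x, y = x + dx, y + dy                      # simultaneous update of both players

print("||x||, ||y|| after CGD:", np.linalg.norm(x), np.linalg.norm(y))
# plain simultaneous gradient descent-ascent spirals outward on the same game,
# whereas accounting for the interaction term keeps the iterates bounded.
```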

 

In the second part, we show that the conditional near-independence properties of smooth Gaussian processes imply the near-sparsity of Cholesky factors of their dense covariance matrices. We use this insight to derive simple, fast solvers with state-of-the-art complexity vs. accuracy guarantees for general elliptic differential and integral equations. Our methods come with rigorous error estimates, are easy to parallelize, and show good performance in practice.
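
The following hedged sketch (an illustration, not the authors' near-linear-complexity solver) orders sample points by a greedy maximin / farthest-point rule, forms an exponential (Matern-1/2) kernel matrix, and counts how many entries of its dense Cholesky factor exceed a small threshold; under such coarse-to-fine orderings the factor is observed to be close to sparse. The point set, kernel, and threshold are assumptions.

```python
# Near-sparsity of the Cholesky factor of a kernel matrix under a maximin ordering.
import numpy as np

rng = np.random.default_rng(0)
n, length_scale = 400, 0.5
pts = rng.uniform(0, 1, size=(n, 2))

# greedy maximin ordering: each new point maximizes its distance to those already chosen
order = [0]
dist = np.linalg.norm(pts - pts[0], axis=1)
for _ in range(n - 1):
    nxt = int(np.argmax(dist))
    order.append(nxt)
    dist = np.minimum(dist, np.linalg.norm(pts - pts[nxt], axis=1))
pts = pts[order]

D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
K = np.exp(-D / length_scale) + 1e-10 * np.eye(n)      # Matern-1/2 (exponential) kernel
L = np.linalg.cholesky(K)

thresh = 1e-6 * np.abs(L).max()
frac = np.count_nonzero(np.abs(L) > thresh) / (n * (n + 1) / 2)
print("fraction of Cholesky entries above threshold:", frac)
```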

Incorporating Symmetry for Improved Deep Dynamics Learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, September 13, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Prof. Rose Yu, UCSD

While deep learning has been used for dynamics learning, limited physical accuracy and an inability to generalize under distributional shift limit its applicability to the real world. In this talk, I will demonstrate how to incorporate symmetries into deep neural networks and significantly improve the physical consistency, sample efficiency, and generalization in learning dynamics. I will showcase the applications of these models to challenging problems such as turbulence forecasting and trajectory prediction for autonomous vehicles.

Some problems in point-set registration

Series
Applied and Computational Mathematics Seminar
Time
Monday, April 26, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/884917410
Speaker
Yuehaw Khoo, University of Chicago

In this talk, we discuss variants of the rigid registration problem, i.e., aligning objects via rigid transformations. In the simplest scenario of point-set registration where the correspondence between points is known, we investigate the robustness of registration to outliers. We also study a convex programming formulation of point-set registration with exact recovery, in the situation where both the correspondence and alignment are unknown. This talk is based on joint works with Ankur Kapoor, Cindy Orozco, and Lexing Ying.
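
For the simplest scenario mentioned above, rigid registration with known correspondences, a hedged sketch of the closed-form orthogonal Procrustes / Kabsch solution is given below (SVD of the centered cross-covariance); outlier robustness and the convex formulation for unknown correspondences are beyond this snippet. The synthetic point sets and noise level are assumptions.

```python
# Rigid registration with known correspondences via the Kabsch / Procrustes construction.
import numpy as np

rng = np.random.default_rng(0)
n = 50
P = rng.normal(size=(n, 3))                               # source point set

# ground-truth rigid motion (rotation R_true, translation t_true) plus small noise
R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R_true) < 0:
    R_true[:, 0] *= -1                                    # enforce a proper rotation
t_true = np.array([0.3, -1.0, 2.0])
Q = P @ R_true.T + t_true + 0.01 * rng.normal(size=(n, 3))

# Kabsch: center both sets, take the SVD of the cross-covariance, fix the determinant sign
Pc, Qc = P - P.mean(axis=0), Q - Q.mean(axis=0)
U, _, Vt = np.linalg.svd(Pc.T @ Qc)
S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
R_hat = Vt.T @ S @ U.T
t_hat = Q.mean(axis=0) - R_hat @ P.mean(axis=0)

print("rotation error:", np.linalg.norm(R_hat - R_true))
print("translation error:", np.linalg.norm(t_hat - t_true))
```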
 

Symmetrically processed splitting integrators for enhanced Hamiltonian Monte Carlo sampling

Series
Applied and Computational Mathematics Seminar
Time
Monday, April 19, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
ONLINE https://bluejeans.com/884917410
Speaker
Prof. Sergio Blanes, Universidad Politécnica de Valencia

We construct integrators to be used in Hamiltonian (or Hybrid) Monte Carlo sampling. The new integrators are easily implementable and, for a given computational budget, may deliver five times as many accepted proposals as standard leapfrog/Verlet without impairing in any way the quality of the samples. They are based on a suitable modification of the processing technique first introduced by J.C. Butcher. The idea of modified processing may also be useful for other purposes, like the construction of high-order splitting integrators with positive coefficients.

Joint work with Mari Paz Calvo, Fernando Casas, and Jesús M. Sanz-Serna
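
As a point of reference for the baseline mentioned in the abstract, here is a hedged sketch of Hamiltonian Monte Carlo with the standard leapfrog/Verlet integrator; the talk's processed splitting integrators would replace the leapfrog routine below. The Gaussian target, step size, and trajectory length are illustrative assumptions.

```python
# Baseline HMC with a leapfrog/Verlet integrator on a standard Gaussian target.
import numpy as np

rng = np.random.default_rng(0)

def U(q):                                  # potential energy of the target
    return 0.5 * np.dot(q, q)

def grad_U(q):
    return q

def leapfrog(q, p, eps, n_steps):
    """Velocity-Verlet / leapfrog integration of Hamilton's equations."""
    p = p - 0.5 * eps * grad_U(q)          # initial half kick
    for _ in range(n_steps - 1):
        q = q + eps * p                    # drift
        p = p - eps * grad_U(q)            # full kick
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)          # final half kick
    return q, p

def hmc_step(q, eps=0.2, n_steps=20):
    p = rng.normal(size=q.shape)
    H0 = U(q) + 0.5 * np.dot(p, p)
    q_new, p_new = leapfrog(q, p, eps, n_steps)
    H1 = U(q_new) + 0.5 * np.dot(p_new, p_new)
    return q_new if rng.random() < np.exp(H0 - H1) else q   # Metropolis accept/reject

q, samples = np.zeros(5), []
for _ in range(2000):
    q = hmc_step(q)
    samples.append(q)
samples = np.array(samples)
print("sample mean:", samples.mean(axis=0))
print("average sample variance (target is 1):", samples.var(axis=0).mean())
```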

Optimization in the space of probabilities with MCMC: Uncertainty quantification and sequential decision making

Series
Applied and Computational Mathematics Seminar
Time
Monday, April 5, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
ONLINE https://bluejeans.com/884917410
Speaker
Prof. Yian Ma, UCSD

I will present MCMC algorithms as optimization over the KL-divergence in the space of probabilities. By incorporating a momentum variable, I will discuss an algorithm which performs accelerated gradient descent over the KL-divergence. Using optimization-like ideas, a suitable Lyapunov function is constructed to prove that an accelerated convergence rate is obtained. I will then discuss how MCMC algorithms compare against variational inference methods in parameterizing the gradient flows in the space of probabilities and how this applies to sequential decision making problems.
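
A hedged sketch of the viewpoint described above: the overdamped Langevin (ULA) update discretizes the Wasserstein gradient flow of the KL-divergence to the target, and adding a momentum variable with friction yields an underdamped (kinetic) Langevin sampler, a rough stand-in for the accelerated scheme in the talk. The Gaussian target, step sizes, and friction coefficient are assumptions, and the discretizations below are the simplest Euler-type ones.

```python
# Overdamped vs. momentum-augmented Langevin sampling of a Gaussian target.
import numpy as np

rng = np.random.default_rng(0)
d, n_particles, n_steps, h = 2, 2000, 500, 0.05
Sigma_inv = np.diag([1.0, 4.0])                 # target pi ~ N(0, diag(1, 0.25))

def grad_log_pi(x):
    return -x @ Sigma_inv

# overdamped Langevin (ULA): x <- x + h*grad log pi(x) + sqrt(2h)*noise
x = rng.normal(size=(n_particles, d))
for _ in range(n_steps):
    x = x + h * grad_log_pi(x) + np.sqrt(2 * h) * rng.normal(size=x.shape)

# underdamped Langevin: add a momentum variable v with friction gamma
gamma = 2.0
y, v = rng.normal(size=(n_particles, d)), np.zeros((n_particles, d))
for _ in range(n_steps):
    v = v + h * (grad_log_pi(y) - gamma * v) + np.sqrt(2 * gamma * h) * rng.normal(size=y.shape)
    y = y + h * v

print("ULA sample variances:       ", np.var(x, axis=0))
print("underdamped sample variances:", np.var(y, axis=0))
print("target variances:           ", [1.0, 0.25])
```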

Incorporating Invariance to Reduce the Complexity of Parametric Models

Series
Applied and Computational Mathematics Seminar
Time
Monday, March 29, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/884917410
Speaker
Alex Cloninger, University of California, San Diego

Many scientific problems involve invariant structures, and learning functions that rely on a much lower dimensional set of features than the data itself. Incorporating these invariances into a parametric model can significantly reduce the model complexity, and lead to a vast reduction in the number of labeled examples required to estimate the parameters. We display this benefit in two settings. The first setting concerns ReLU networks, and the size of networks and number of points required to learn certain functions and classification regions. Here, we assume that the target function has built in invariances, namely that it only depends on the projection onto a very low-dimensional, function-defined manifold (with dimension possibly significantly smaller than even the intrinsic dimension of the data). We use this manifold variant of a single- or multi-index model to establish network complexity and ERM rates that beat even the intrinsic dimension of the data. We should note that a corollary of this result is developing intrinsic rates for a manifold-plus-noise data model without needing to assume the distribution of the noise decays exponentially, and we also discuss implications in two-sample testing and statistical distances. The second setting for building invariances concerns linearized optimal transport (LOT), and using it to build supervised classifiers on distributions. Here, we construct invariances to families of group actions (e.g., shifts and scalings of a fixed distribution), and show that LOT can learn a classifier on group orbits using a simple linear separator. We demonstrate the benefit of this on MNIST by constructing robust classifiers with only a small number of labeled examples. This talk covers joint work with Timo Klock, Xiuyuan Cheng, and Caroline Moosmueller.
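
A one-dimensional, hedged toy version of the LOT construction described above (assumptions throughout, not the authors' setup): with a uniform reference measure, the LOT embedding of a distribution is just its quantile function on a fixed grid; shifts and scalings act affinely on that embedding, so a plain linear classifier separates the two group orbits from only a handful of labeled examples.

```python
# 1-d LOT embeddings (empirical quantile functions) plus a linear classifier on group orbits.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
quantile_grid = np.linspace(0.01, 0.99, 50)

def lot_embedding(samples):
    """Empirical quantile function on a fixed grid = 1-d LOT coordinates."""
    return np.quantile(samples, quantile_grid)

def random_orbit_sample(template_sampler, n=300):
    shift, scale = rng.uniform(-2, 2), rng.uniform(0.5, 2.0)
    return scale * template_sampler(n) + shift            # random shift-and-scale action

templates = [lambda n: rng.normal(size=n),                # class 0: Gaussian template
             lambda n: rng.exponential(size=n) - 1.0]     # class 1: skewed template

X, y = [], []
for label, sampler in enumerate(templates):
    for _ in range(20):                                   # only a few labeled examples
        X.append(lot_embedding(random_orbit_sample(sampler)))
        y.append(label)

clf = LogisticRegression(max_iter=5000).fit(np.array(X), np.array(y))
X_test = [lot_embedding(random_orbit_sample(s)) for s in templates for _ in range(100)]
y_test = [0] * 100 + [1] * 100
print("linear classifier accuracy on new orbit samples:", clf.score(np.array(X_test), y_test))
```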
 
