Seminars and Colloquia by Series

How Differential Equation Insights Benefit Deep Learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, March 28, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://gatech.zoom.us/j/96551543941 (note: Zoom, not Bluejeans)
Speaker
Prof. Bao Wang, University of Utah

We will present a new class of continuous-depth deep neural networks motivated by the ODE limit of the classical momentum method, named heavy-ball neural ODEs (HBNODEs). HBNODEs enjoy two properties that imply practical advantages over NODEs: (i) the adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers, thus significantly accelerating learning and improving the utility of the trained models; (ii) the spectrum of HBNODEs is well structured, enabling effective learning of long-term dependencies from complex sequential data.
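For context, here is a sketch of the ODE limit the abstract refers to; the notation is illustrative rather than taken from the talk. The classical momentum (heavy-ball) iteration and its continuous-time limit are often written as

\[
x_{k+1} = x_k + \beta\,(x_k - x_{k-1}) - s\,\nabla f(x_k)
\quad\longrightarrow\quad
\ddot{x}(t) + \gamma\,\dot{x}(t) = -\nabla f\big(x(t)\big),
\]

and an HBNODE replaces the gradient field by a learned vector field, e.g.

\[
\varepsilon\,\ddot{h}(t) + \dot{h}(t) = f_\theta\big(h(t), t\big),
\]

a second-order analogue of the first-order neural ODE \(\dot{h}(t) = f_\theta(h(t), t)\).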

Second, we will extend HBNODEs to graph learning by leveraging diffusion on graphs, resulting in new algorithms for deep graph learning. The new algorithms are more accurate than existing deep graph learning algorithms, scale better to deep architectures, and are well suited to low-labeling-rate regimes. Moreover, we will present an efficient attention mechanism, based on the fast multipole method, for modeling interactions between graph nodes.
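As background on "diffusion on graphs" (a standard construction, not necessarily the exact model in the talk), continuous-depth graph models are often built on the graph heat equation

\[
\dot{X}(t) = -L\,X(t), \qquad L = I - D^{-1/2} A D^{-1/2},
\]

where \(A\) is the adjacency matrix, \(D\) the degree matrix, and \(X(t)\) the matrix of node features. Replacing the linear right-hand side by a learned, attention-weighted diffusion term, and the first-order derivative by heavy-ball dynamics, gives graph analogues of (HB)NODEs.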

Third, if time permits, we will discuss proximal algorithms for accelerating the learning of continuous-depth neural networks.

Low-dimensional Modeling for Deep Learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, March 14, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://gatech.zoom.us/j/96551543941
Speaker
Zhihui Zhu, University of Denver

In the past decade, the revival of deep neural networks has led to dramatic success in numerous applications ranging from computer vision to natural language processing to scientific discovery and beyond. Nevertheless, the practice of deep networks has been shrouded with mystery as our theoretical understanding of the success of deep learning remains elusive.

In this talk, we will exploit low-dimensional modeling to help understand and improve deep learning performance. We will first provide a geometric analysis for understanding neural collapse, an intriguing empirical phenomenon that persists across different neural network architectures and a variety of standard datasets. We will utilize our understanding of neural collapse to improve training efficiency. We will then exploit principled methods for dealing with sparsity and sparse corruptions to address the challenges of overfitting for modern deep networks in the presence of training data corruptions. We will introduce a principled approach for robustly training deep networks with noisy labels and robustly recovering natural images by deep image prior.
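As a point of reference, "neural collapse" usually refers to the following empirical pattern in the last-layer features at the end of training (a hedged summary of the phenomenon, not of the speaker's results): within-class features collapse to their class means, and the centered class means spread out symmetrically,

\[
h_{i,c} \to \mu_c, \qquad
\frac{\langle \mu_c - \mu_G,\ \mu_{c'} - \mu_G \rangle}{\|\mu_c - \mu_G\|\,\|\mu_{c'} - \mu_G\|} \to -\frac{1}{K-1} \quad (c \neq c'),
\]

where \(h_{i,c}\) is the feature of the \(i\)-th training example in class \(c\), \(\mu_c\) and \(\mu_G\) are the class and global means, and \(K\) is the number of classes; the centered class means thus form a simplex equiangular tight frame, with the classifier weights aligned to them.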

Symmetry-preserving machine learning for computer vision, scientific computing, and distribution learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, March 7, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://gatech.zoom.us/j/96551543941 (note: Zoom, not Bluejeans)
Speaker
Prof. Wei Zhu, UMass Amherst

Please Note: The talk will be hosted on Zoom, not Bluejeans, from now on.

Symmetry is ubiquitous in machine learning and scientific computing. Robust incorporation of symmetry priors into the learning process has been shown to achieve significant model improvement for various learning tasks, especially in the small-data regime.

In the first part of the talk, I will explain a principled framework of deformation-robust symmetry-preserving machine learning. The key idea is the spectral regularization of the (group) convolutional filters, which ensures that symmetry is robustly preserved in the model even if the symmetry transformation is “contaminated” by nuisance data deformation.
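For readers unfamiliar with the terminology, "symmetry-preserving" here means (group) equivariance of the learned map; a generic statement of the property, not specific to the talk, is

\[
f\big(T_g\,x\big) = T'_g\, f(x) \qquad \text{for all } g \in G,
\]

where \(G\) is the symmetry group (e.g. translations or rotations) and \(T_g, T'_g\) are its actions on the input and output spaces. The deformation-robust version asks that this relation degrade gracefully when \(T_g\) is composed with a small, non-group deformation of the data.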
 
In the second part of the talk, I will demonstrate how to incorporate additional structural information (such as group symmetry) into generative adversarial networks (GANs) for data-efficient distribution learning. This is accomplished by developing new variational representations for divergences between probability measures with embedded structures. We study, both theoretically and empirically, the effect of structural priors in the two GAN players. The resulting structure-preserving GAN is able to achieve significantly improved sample fidelity and diversity—almost an order of magnitude measured in Fréchet Inception Distance—especially in the limited data regime. 
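One standard variational representation of the kind alluded to (written here for f-divergences; the embedded-structure versions in the talk refine this) is

\[
D_f(P\,\|\,Q) \;=\; \sup_{g} \ \mathbb{E}_{P}\big[g(X)\big] - \mathbb{E}_{Q}\big[f^{*}\big(g(X)\big)\big],
\]

where \(f^{*}\) is the convex conjugate of \(f\). A GAN discriminator plays the role of \(g\), and restricting the supremum to, say, \(G\)-invariant or \(G\)-equivariant functions is one way structural priors enter the two players.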
 

Neural Networks with Inputs Based on Domain of Dependence and A Converging Sequence for Solving Conservation Laws

Series
Applied and Computational Mathematics Seminar
Time
Monday, February 28, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Haoxiang Huang, GT

Recent research on solving partial differential equations with deep neural networks (DNNs) has demonstrated that spatiotemporal function approximators defined by auto-differentiation are effective for approximating nonlinear problems. However, it remains a challenge to resolve discontinuities in nonlinear conservation laws using forward methods with DNNs without beginning with part of the solution. In this study, we incorporate first-order numerical schemes into DNNs to set up the loss-function approximator, instead of using auto-differentiation from traditional deep learning frameworks such as TensorFlow, thereby improving the ability to capture discontinuities in Riemann problems.

We introduce a novel neural network method. A local low-cost solution is first used as the input of a neural network to predict the high-fidelity solution at a space-time location. The challenge lies in the fact that there is no way to distinguish a smeared discontinuity from a steep smooth solution in the input, which results in "multiple predictions" of the neural network. To overcome this difficulty, two solutions of the conservation laws from a converging sequence, computed by low-cost numerical schemes on a local domain of dependence of the space-time location, serve as the input. Despite smeared input solutions, the output provides sharp approximations to solutions containing shocks and contact surfaces, and the method is efficient to use once trained. It works not only for discontinuities but also for smooth areas of the solution, implying broader applications to other differential equations.
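Below is a minimal sketch of the input-output mapping described above, with placeholder layer sizes and stencil width; PyTorch is assumed, and this is illustrative rather than the authors' code.

```python
import torch
import torch.nn as nn

class DoDNet(nn.Module):
    """Predict a high-fidelity solution value at one space-time point from
    two low-cost solutions (a converging sequence) sampled on a local
    domain of dependence. Sizes below are placeholders."""

    def __init__(self, stencil_size: int = 7, hidden: int = 64):
        super().__init__()
        # Input: coarse and refined low-cost solutions on the same local stencil.
        self.net = nn.Sequential(
            nn.Linear(2 * stencil_size, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),  # high-fidelity value at the target point
        )

    def forward(self, coarse: torch.Tensor, refined: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([coarse, refined], dim=-1))

# Usage sketch: a batch of 32 local stencils from the two low-cost solutions.
model = DoDNet()
u_coarse, u_refined = torch.randn(32, 7), torch.randn(32, 7)
u_pred = model(u_coarse, u_refined)  # shape (32, 1)
```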

Coarse-Graining of Stochastic Systems

Series
Applied and Computational Mathematics Seminar
Time
Monday, February 7, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Prof. Xingjie "Helen" Li, UNC Charlotte


Efficient simulation of SDEs is essential in many applications, particularly for ergodic systems that demand efficient simulation of both short-time dynamics and large-time statistics. To achieve this efficiency, dimension reduction is often required in both space and time. In this talk, I will discuss our recent work on both spatial and temporal reductions.

For spatial dimension reduction, the Mori-Zwanzig formalism is applied to derive equations for the evolution of linear observables of the Langevin dynamics, in both the overdamped and general cases.

For temporal dimension reduction, we introduce a framework to construct inference-based schemes adaptive to large time steps (ISALT) from data, achieving a reduction in time by several orders of magnitude.

This is joint work with Dr. Thomas Hudson from the University of Warwick, UK; Dr. Fei Lu from Johns Hopkins University; and Dr. Xiaofeng Felix Ye from SUNY Albany.
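For reference, the Langevin models mentioned in the spatial-reduction part are typically written as follows (a standard formulation, included only as background):

\[
\text{underdamped:}\quad dq_t = p_t\,dt,\qquad dp_t = -\nabla V(q_t)\,dt - \gamma\,p_t\,dt + \sqrt{2\gamma\beta^{-1}}\,dW_t,
\]
\[
\text{overdamped:}\quad dq_t = -\nabla V(q_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t,
\]

with potential \(V\), friction \(\gamma\), and inverse temperature \(\beta\). The Mori-Zwanzig projection then yields closed, but memory-dependent, equations for linear observables of \((q_t, p_t)\).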

How to Break the Curse of Dimensionality

Series
Applied and Computational Mathematics Seminar
Time
Monday, January 31, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Ming-Jun Lai, University of Georgia

We first review the problem of the curse of dimensionality in approximating multi-dimensional functions. Several approximation results, due to Barron, Petrushev, Bach, and others, will be explained.

Then we present two approaches to break the curse of dimensionality: one based on the probabilistic approach explained in Barron (1993), and the other based on a deterministic approach using the Kolmogorov superposition theorem. As the Kolmogorov superposition theorem has been used to explain the approximation power of neural network computation, I will use it to explain why deep learning algorithms work for image classification.
In addition, I will introduce a neural network approximation based on higher-order ReLU functions to explain the powerful approximation of multivariate functions by deep learning algorithms with multiple layers.
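Two of the results alluded to can be stated concisely (standard formulations, included as background rather than as statements from the talk). Barron's 1993 theorem: if \(C_f = \int_{\mathbb{R}^d} |\omega|\,|\hat f(\omega)|\,d\omega < \infty\), then there is a one-hidden-layer sigmoidal network \(f_n\) with \(n\) units satisfying

\[
\|f - f_n\|_{L^2(\mu,\,B_r)}^2 \;\le\; \frac{(2 r\, C_f)^2}{n},
\]

a rate independent of the dimension \(d\). The Kolmogorov superposition theorem: every continuous \(f : [0,1]^d \to \mathbb{R}\) can be written as

\[
f(x_1,\dots,x_d) \;=\; \sum_{q=0}^{2d} \Phi_q\!\Big(\sum_{p=1}^{d} \phi_{q,p}(x_p)\Big)
\]

with continuous univariate outer functions \(\Phi_q\) and inner functions \(\phi_{q,p}\), i.e. a two-layer structure built entirely from one-dimensional functions.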

Non-Parametric Estimation of Manifolds from Noisy Data

Series
Applied and Computational Mathematics Seminar
Time
Monday, December 6, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Yariv Aizenbud, Yale University

A common task in many data-driven applications is to find a low-dimensional manifold that describes the data accurately. Estimating a manifold from noisy samples has proven to be a challenging task. Indeed, even after decades of research, there is no (computationally tractable) algorithm that accurately estimates a manifold from noisy samples with a constant level of noise.
 
In this talk, we will present a method that estimates a manifold and its tangent in the ambient space. Moreover, we establish rigorous convergence rates, which are essentially as good as existing convergence rates for function estimation.
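For comparison, the benchmark referred to is the classical rate for nonparametric function estimation: for a \(C^k\) function of \(d\) variables observed with noise at \(n\) points, the minimax error scales as

\[
n^{-\frac{k}{2k + d}},
\]

and the point of the talk is that manifold and tangent-space estimation can essentially match this rate, with \(d\) the intrinsic dimension of the manifold rather than the ambient dimension. (This is a paraphrase of the claim above, not a precise statement of the speaker's theorem.)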

Model-free Feature Screening and FDR Control with Knockoff Features

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 29, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Yuan Ke, University of Georgia

This paper proposes a model-free and data-adaptive feature screening method for ultra-high dimensional data. The proposed method is based on the projection correlation, which measures the dependence between two random vectors. This projection correlation based method does not require specifying a regression model, and applies to data in the presence of heavy tails and multivariate responses. It enjoys both sure screening and rank consistency properties under weak assumptions. A two-step approach, with the help of knockoff features, is advocated to specify the threshold for feature screening such that the false discovery rate (FDR) is controlled under a pre-specified level. The proposed two-step approach enjoys both sure screening and FDR control simultaneously if the pre-specified FDR level is greater than or equal to 1/s, where s is the number of active features. The superior empirical performance of the proposed method is illustrated by simulation examples and real data applications. This is a joint work with Wanjun Liu, Jingyuan Liu and Runze Li.
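For completeness, the false discovery rate being controlled is the usual one (a standard definition, not specific to this paper):

\[
\mathrm{FDR} \;=\; \mathbb{E}\!\left[\frac{\#\{\text{selected features that are inactive}\}}{\max\big(\#\{\text{selected features}\},\,1\big)}\right],
\]

and the two-step procedure chooses the screening threshold, with the help of knockoff copies of the features, so that this quantity stays below the pre-specified level.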

Local and Optimal Transport Perspectives on Uncertainty Quantification

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 22, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Dr. Amir Sagiv, Columbia

Please Note: remote

In many scientific areas, deterministic models (e.g., differential equations) use numerical parameters. In real-world settings, however, such parameters might be uncertain or noisy. A more comprehensive model should therefore provide a statistical description of the quantity of interest. Underlying this computational problem is a fundamental question: if two "similar" functions push forward the same measure, are the resulting measures close, and if so, in what sense? We will first show how the probability density function (PDF) of the quantity of interest can be approximated, using spectral and local methods. We will then discuss the limitations of PDF approximation and present an alternative viewpoint: through optimal transport theory, a Wasserstein-distance formulation of our problem yields a much simpler and widely applicable theory.
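One clean instance of the stability question posed above can be stated via the Wasserstein distance (a standard coupling argument, included here only to illustrate why the transport viewpoint simplifies matters): if \(f\) and \(g\) push forward the same measure \(\mu\), then

\[
W_p\big(f_{\#}\mu,\; g_{\#}\mu\big) \;\le\; \|f - g\|_{L^p(\mu)},
\]

because \((f,g)_{\#}\mu\) is itself a coupling of the two push-forward measures. Densities, by contrast, can change substantially under small perturbations of the map, which is one of the limitations of PDF approximation discussed in the talk.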
 

Data Compression in Distributed Learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, November 15, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Ming Yan, Michigan State University

Large-scale machine learning models are trained by parallel (stochastic) gradient descent algorithms on distributed systems. The communications for gradient aggregation and model synchronization become the major obstacles for efficient learning as the number of nodes and the model's dimension scale up. In this talk, I will introduce several ways to compress the transferred data and reduce the overall communication such that the obstacles can be immensely mitigated. More specifically, I will introduce methods to reduce or eliminate the compression error without additional communication.
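As one concrete and standard example of gradient compression, below is a minimal sketch of top-k sparsification with error feedback; it illustrates the general idea of reducing communication while compensating the compression error locally, and is not necessarily one of the speaker's methods.

```python
import numpy as np

def topk_compress(v: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

class ErrorFeedbackWorker:
    """Each worker sends only a compressed gradient and keeps the residual
    (compression error) to add back into the next round's gradient."""

    def __init__(self, dim: int, k: int):
        self.residual = np.zeros(dim)
        self.k = k

    def message(self, grad: np.ndarray) -> np.ndarray:
        corrected = grad + self.residual        # error feedback
        msg = topk_compress(corrected, self.k)  # what actually gets sent
        self.residual = corrected - msg         # keep what was dropped
        return msg

# Usage sketch: one round with 4 workers, model dimension 10, k = 2.
dim, k = 10, 2
workers = [ErrorFeedbackWorker(dim, k) for _ in range(4)]
grads = [np.random.randn(dim) for _ in workers]
aggregate = np.mean([w.message(g) for w, g in zip(workers, grads)], axis=0)
```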
