Seminars and Colloquia by Series

On the Low-Complexity Critical Points of Two-Layer Neural Networks

Series
SIAM Student Seminar
Time
Friday, March 14, 2025 - 11:00
Location
Skiles 006
Speaker
Leyang Zhang, Georgia Tech

Abstract: Critical points significantly affect the behavior of gradient-based dynamics. Numerous works have studied the global minima of neural networks; this recent work instead characterizes non-global critical points. Guided by the idea that gradient-based training of neural networks favors "simple models", the work focuses on the set of low-complexity critical points, i.e., those representing underparameterized network models. Specifically, we investigate: i) the existence and ii) the geometry of such sets, iii) the output functions they represent, and iv) the saddles among them. The talk will discuss these results through a simple example; the general theorems will also be presented. No specific knowledge of neural networks is required.
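To make "low-complexity" concrete, here is a minimal numpy sketch (my own illustration, not from the talk): an approximate critical point of a one-neuron two-layer tanh network, embedded into a wider three-neuron network by appending neurons with zero input and output weights, remains a critical point of the wider loss while representing an underparameterized model.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                 # toy inputs
y = np.tanh(X @ np.array([1.0, -0.5]))       # toy targets

def loss_and_grad(a, W):
    """Squared loss of f(x) = sum_k a_k * tanh(w_k . x) and its gradients."""
    H = np.tanh(X @ W.T)                     # (n, m) hidden activations
    r = H @ a - y                            # residuals
    L = 0.5 * np.mean(r ** 2)
    ga = H.T @ r / len(y)                    # dL/da
    gW = ((r[:, None] * (1 - H ** 2)) * a).T @ X / len(y)   # dL/dW
    return L, ga, gW

# 1) Find an (approximate) critical point of the 1-neuron network by gradient descent.
a, W = np.array([0.3]), rng.normal(size=(1, 2))
for _ in range(30000):
    _, ga, gW = loss_and_grad(a, W)
    a, W = a - 0.2 * ga, W - 0.2 * gW

# 2) Embed it into a 3-neuron network: extra neurons get a_k = 0 and w_k = 0,
#    so they contribute nothing to the output and nothing to the gradient.
a_big = np.concatenate([a, np.zeros(2)])
W_big = np.vstack([W, np.zeros((2, 2))])

_, ga1, gW1 = loss_and_grad(a, W)
_, ga3, gW3 = loss_and_grad(a_big, W_big)
print("small-net grad norm:", np.hypot(np.linalg.norm(ga1), np.linalg.norm(gW1)))
print("wide-net  grad norm:", np.hypot(np.linalg.norm(ga3), np.linalg.norm(gW3)))
# Both norms are equally tiny: the embedded point is a critical point of the
# wider network, yet it represents a "simpler", underparameterized model.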

Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups

Series
SIAM Student Seminar
Time
Friday, February 14, 2025 - 11:00
Location
Skiles 006
Speaker
Yuchen Zhu, Georgia Tech

The generative modeling of data on manifolds is an important task, for which diffusion models in flat spaces typically need nontrivial adaptations. This article demonstrates how a technique called `trivialization' can transfer the effectiveness of diffusion models in Euclidean spaces to Lie groups. In particular, an auxiliary momentum variable is algorithmically introduced to help transport the position variable between the data distribution and a fixed, easy-to-sample distribution. Normally, this would incur further difficulty for manifold data because the momentum lives in a space that changes with the position. Our trivialization technique, however, creates a new momentum variable that stays in a simple fixed vector space. This design, together with a manifold-preserving integrator, simplifies implementation and avoids inaccuracies created by approximations such as projections to the tangent space and the manifold, which were typically used in prior work, hence facilitating generation with high fidelity and efficiency. The resulting method achieves state-of-the-art performance on protein and RNA torsion-angle generation and on sophisticated torus datasets. We also, arguably for the first time, tackle the generation of data on high-dimensional Special Orthogonal and Unitary groups, the latter essential for quantum problems.
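A hedged sketch of the trivialization idea on SO(3) (my reading of the abstract, not the authors' code): the momentum xi lives in the fixed vector space R^3, identified with the Lie algebra so(3), while the position R stays on the group and is updated by a manifold-preserving step R <- R expm(h hat(xi)), with no projections needed.

import numpy as np
from scipy.linalg import expm

def hat(xi):
    """Map a vector in R^3 to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0., -xi[2], xi[1]],
                     [xi[2], 0., -xi[0]],
                     [-xi[1], xi[0], 0.]])

rng = np.random.default_rng(1)
R = np.eye(3)                      # position: an element of SO(3)
xi = rng.normal(size=3)            # trivialized momentum: an ordinary vector in R^3
h, gamma = 0.01, 1.0               # step size, friction

for _ in range(1000):
    # Langevin-type update on the momentum; it never leaves the flat space R^3.
    xi = xi - h * gamma * xi + np.sqrt(2 * gamma * h) * rng.normal(size=3)
    # Manifold-preserving position update: multiplication by a group element.
    R = R @ expm(h * hat(xi))

# R remains orthogonal with determinant 1 up to machine precision.
print("||R^T R - I|| =", np.linalg.norm(R.T @ R - np.eye(3)), " det R =", np.linalg.det(R))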

Paper Reading: Unsupervised Solution Operator Learning for Mean-Field Games via Sampling-Invariant Parametrizations

Series
SIAM Student Seminar
Time
Friday, January 31, 2025 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Sebas Gut

Paper abstract: Recent advances in deep learning have produced many innovative frameworks that solve high-dimensional mean-field games (MFG) accurately and efficiently. These methods, however, are restricted to solving single-instance MFGs and demand extensive computational time per instance, limiting practicality. To overcome this, we develop a novel framework to learn the MFG solution operator. Our model takes MFG instances as input and outputs their solutions with one forward pass. To ensure the proposed parametrization is well-suited for operator learning, we introduce and prove the notion of sampling invariance for our model, establishing its convergence to a continuous operator in the sampling limit. Our method features two key advantages. First, it is discretization-free, making it particularly suitable for learning operators of high-dimensional MFGs. Second, it can be trained without access to supervised labels, significantly reducing the computational overhead associated with creating training datasets in existing operator-learning methods. We test our framework on synthetic and realistic datasets of varying complexity and dimensionality to substantiate its robustness.

Link: https://arxiv.org/abs/2401.15482
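One natural way to realize sampling invariance (an assumption about the construction, not necessarily the paper's exact architecture) is to represent the input measure by sample points and aggregate per-sample features with an empirical mean, so that the output converges to an integral functional of the measure as the number of samples grows. A minimal numpy sketch:

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(16, 2))   # hypothetical encoder weights
b = rng.normal(size=16)

def encode(points):
    """Encode an empirical measure (n, 2) into a fixed-size vector, independent of n."""
    feats = np.tanh(points @ A.T + b)      # per-sample features
    return feats.mean(axis=0)              # mean aggregation => sampling invariance

# Two different samplings of the same underlying density give nearly the same code,
# and the gap shrinks like 1/sqrt(n) in the sampling limit.
mu_small = rng.normal(loc=[1.0, -1.0], size=(200, 2))
mu_large = rng.normal(loc=[1.0, -1.0], size=(20000, 2))
print(np.linalg.norm(encode(mu_small) - encode(mu_large)))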

Finding structure hidden inside chaotic negative feedback delay systems (with and without noise): Existence of invariant measures

Series
SIAM Student Seminar
Time
Friday, October 25, 2024 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 308
Speaker
Mark van den Bosch, Leiden University

In this talk, we present recent results on the existence of invariant probability measures for delay equations with (stochastic) negative feedback. No prior knowledge of invariant measures is assumed. Applications include Nicholson's blowflies equation and the Mackey-Glass equation. Just like the dynamics of prime numbers, these systems exhibit "randomness" combined with deep structure. We will demonstrate this both analytically and numerically, focusing mainly on intuition. Additive noise typically destroys all dynamical properties of the underlying dynamical system; we are therefore motivated to study a class of stochastic perturbations that preserve some of the dynamical properties of the negative feedback systems we consider.
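For readers unfamiliar with the class of systems, here is a minimal sketch (not from the talk) of one of the examples: the Mackey-Glass delay equation with negative feedback, perturbed by additive noise and simulated with an Euler-Maruyama scheme that uses a history buffer for the delayed state. The parameter values are the classical chaotic regime; the noise level sigma is my own choice.

import numpy as np

beta, gamma, n, tau = 0.2, 0.1, 10.0, 17.0   # classical Mackey-Glass parameters
dt, sigma, T = 0.05, 0.01, 500.0             # step size, noise level, horizon
lag = int(tau / dt)

rng = np.random.default_rng(0)
x = np.empty(int(T / dt) + lag)
x[:lag] = 0.5                                 # constant initial history on [-tau, 0]

for k in range(lag, len(x) - 1):
    drift = beta * x[k - lag] / (1.0 + x[k - lag] ** n) - gamma * x[k]
    x[k + 1] = x[k] + dt * drift + sigma * np.sqrt(dt) * rng.normal()

# Time averages along the (apparently chaotic) trajectory approximate integrals
# against an invariant measure, when such a measure exists.
print("long-run mean:", x[lag:].mean(), " long-run variance:", x[lag:].var())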

Spectral Representation for Control and Reinforcement Learning

Series
SIAM Student Seminar
Time
Friday, September 13, 2024 - 11:15 for 1 hour (actually 50 minutes)
Location
Skiles 249
Speaker
Bo Dai, Georgia Tech

Achieving optimal control for general stochastic nonlinear systems is notoriously difficult, and it becomes even harder when learning and exploration for unknown dynamics are involved, as in the reinforcement learning setting. In this talk, I will present our recent work on exploiting the power of representation in RL to bypass these difficulties. Specifically, we design practical algorithms for extracting useful representations, with the goal of improving statistical and computational efficiency in the exploration-versus-exploitation tradeoff as well as empirical performance in RL. We provide a rigorous theoretical analysis of our algorithm and demonstrate superior practical performance over existing state-of-the-art empirical algorithms on several benchmarks.
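A hedged tabular illustration of the "spectral representation" idea (my own sketch, not the speaker's algorithm): if the transition kernel factorizes as P(s'|s,a) = phi(s,a)^T mu(s'), then Q(s,a) - r(s,a) is exactly linear in the features phi(s,a), so value estimation reduces to linear regression in feature space once a good representation is extracted.

import numpy as np

rng = np.random.default_rng(0)
S, A, gamma = 6, 3, 0.9
P = rng.random((S * A, S)); P /= P.sum(axis=1, keepdims=True)   # transition kernel
r = rng.random(S * A)                                           # rewards

# "Spectral" features from an (exact, full-rank) SVD of the kernel: P = phi @ mu.
U, s, Vt = np.linalg.svd(P, full_matrices=False)
phi, mu = U * s, Vt

# Evaluate a fixed (uniform) policy: V = (I - gamma * P_pi)^{-1} r_pi.
P_pi = P.reshape(S, A, S).mean(axis=1)
r_pi = r.reshape(S, A).mean(axis=1)
V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

Q = r + gamma * P @ V                  # exact Q-values for the policy
w = mu @ V                             # linear weights in feature space
print(np.max(np.abs(Q - (r + gamma * phi @ w))))   # ~ machine precision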

Paper Reading: Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models

Series
SIAM Student Seminar
Time
Thursday, August 29, 2024 - 10:00 for 1 hour (actually 50 minutes)
Location
Skiles 254
Speaker
Kevin Rojas, Georgia Tech

Paper link: https://arxiv.org/abs/2405.03549

Abstract: Generative modeling via stochastic processes has led to remarkable empirical results as well as to recent advances in their theoretical understanding. In principle, both the space and the time of the processes can be discrete or continuous. In this work, we study time-continuous Markov jump processes on discrete state spaces and investigate their correspondence to state-continuous diffusion processes given by SDEs. In particular, we revisit the Ehrenfest process, which converges to an Ornstein-Uhlenbeck process in the infinite-state-space limit. Likewise, we show that the time-reversal of the Ehrenfest process converges to the time-reversed Ornstein-Uhlenbeck process. This observation bridges discrete and continuous state spaces and allows methods to be carried over from one setting to the other. Additionally, we suggest an algorithm for training the time-reversal of Markov jump processes which relies on conditional expectations and can thus be directly related to denoising score matching. We demonstrate our methods in multiple convincing numerical experiments.
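A minimal toy sketch of the bridge described above (mine, not the paper's code): N particles flip between two urns independently at rate 1, so the count X_t in urn A is a Markov jump process, and the rescaled state Y_t = (2 X_t - N) / sqrt(N) converges (as N grows, with this rate convention) to an Ornstein-Uhlenbeck process dY = -2Y dt + 2 dW, whose stationary law is N(0, 1).

import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 50.0
x, t, samples = N // 2, 0.0, []

while t < T:
    up, down = (N - x), x                     # jump rates: i -> i+1 and i -> i-1
    rate = up + down                          # total rate is constant (= N)
    t += rng.exponential(1.0 / rate)          # Gillespie: exponential holding time
    x += 1 if rng.random() < up / rate else -1
    samples.append((2 * x - N) / np.sqrt(N))  # diffusively rescaled state

# The empirical law of Y matches the OU stationary distribution N(0, 1).
y = np.array(samples)
print("mean ~ 0:", y.mean(), "  variance ~ 1:", y.var())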


Efficient hybrid spatial-temporal operator learning

Series
SIAM Student Seminar
Time
Friday, March 29, 2024 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Francesco Brarda, Emory University

Recent advancements in operator-type neural networks, such as the Fourier Neural Operator (FNO) and the Deep Operator Network (DeepONet), have shown promising results in approximating the solutions of spatial-temporal partial differential equations (PDEs). However, these neural networks often entail considerable training expenses and may not always achieve the accuracy required in many scientific and engineering disciplines. In this paper, we propose a new operator-learning framework to address these issues. The proposed paradigm leverages traditional wisdom from numerical PDE theory and techniques to refine the pipeline of existing operator neural networks. Specifically, the proposed architecture trains the operator-type neural network under consideration for a single epoch or a few epochs and then freezes the model parameters. The frozen model is then fed into an error-correction scheme: a single parametrized linear spectral layer trained with a convex loss function defined through a reliable functional-type a posteriori error estimator. This design allows the operator neural network to effectively tackle low-frequency errors, while the added linear layer addresses high-frequency errors. Numerical experiments on a commonly used benchmark of 2D Navier-Stokes equations demonstrate improvements in both computational time and accuracy compared to existing FNO variants and traditional numerical approaches.
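A hedged toy version of the error-correction idea in 1D: keep the trained operator frozen and fit a single linear spectral layer, one complex multiplier per Fourier mode, on top of its output. The actual method trains this layer against a functional a posteriori error estimator; for a self-contained sketch the layer is fit against a reference solution instead, which keeps the loss convex (least squares). The data below are synthetic and purely illustrative.

import numpy as np

rng = np.random.default_rng(0)
n, n_train = 128, 32
k = np.arange(n // 2 + 1)

# Hypothetical data: random band-limited "true" solutions, and a frozen operator
# whose error is a systematic damping of high frequencies.
C_true = rng.normal(size=(n_train, n // 2 + 1)) + 1j * rng.normal(size=(n_train, n // 2 + 1))
C_true[:, 20:] = 0.0                                # band-limited truth
H = 1.0 / (1.0 + 0.2 * k)                           # frequency-dependent damping
u_true = np.fft.irfft(C_true, n=n, axis=1)
u_pred = np.fft.irfft(H * C_true, n=n, axis=1)      # frozen network's output

# One linear spectral layer: a complex multiplier per mode, fit by (convex) least squares.
U_pred, U_true = np.fft.rfft(u_pred, axis=1), np.fft.rfft(u_true, axis=1)
m = (np.conj(U_pred) * U_true).sum(axis=0) / ((np.abs(U_pred) ** 2).sum(axis=0) + 1e-12)
u_corrected = np.fft.irfft(m * U_pred, n=n, axis=1)

print("max error before:", np.abs(u_pred - u_true).max(),
      " after:", np.abs(u_corrected - u_true).max())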

Optimization in Data Science: Enhancing Autoencoders and Accelerating Federated Learning

Series
SIAM Student Seminar
Time
Monday, January 22, 2024 - 14:00 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Xue Feng, UC Davis

In this presentation, I will discuss my research in the field of data science, specifically in two areas: improving autoencoder interpolations and accelerating federated learning algorithms. My work combines advanced mathematical concepts with practical machine learning applications, contributing to both the theoretical and applied aspects of data science. The first part of my talk focuses on image-sequence interpolation using autoencoders, which are essential tools in generative modeling; the focus is on the regime where only limited training data are available. By introducing a novel regularization term based on dynamic optimal transport into the autoencoder's loss function, my method generates more robust and semantically coherent interpolation results. Additionally, the trained autoencoder can be used to generate barycenters. However, computational efficiency is a bottleneck of our method, and we are working on improving it. The second part of my presentation focuses on accelerating federated learning (FL) through the application of Anderson acceleration. Our method achieves the same level of convergence performance as state-of-the-art second-order methods such as GIANT by reweighting the local points and their gradients. However, our method requires only first-order information, making it a more practical and efficient choice for large-scale and complex training problems. Furthermore, our method is theoretically guaranteed to converge to the global minimizer at a linear rate.
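For context, here is a minimal sketch of classical Anderson acceleration for a generic fixed-point map g (my own generic implementation, not the federated-learning method from the talk, which reweights the local clients' points and gradients in this spirit). The toy problem and step size are hypothetical.

import numpy as np

def anderson(g, x0, m=5, iters=100):
    """Accelerate the fixed-point iteration x <- g(x) using the last m residuals."""
    x = x0.copy()
    G, F = [], []                              # histories of g(x_k) and residuals
    for _ in range(iters):
        gx = g(x)
        G.append(gx); F.append(gx - x)
        if len(F) > m + 1:
            G.pop(0); F.pop(0)
        if len(F) == 1:
            x = gx                             # plain fixed-point step
        else:
            dF = np.column_stack([F[j + 1] - F[j] for j in range(len(F) - 1)])
            dG = np.column_stack([G[j + 1] - G[j] for j in range(len(G) - 1)])
            gamma, *_ = np.linalg.lstsq(dF, F[-1], rcond=None)
            x = gx - dG @ gamma                # Anderson-mixed update
    return x

# Toy problem: solve Ax = b via the contractive map g(x) = x - 0.1 * (A x - b).
rng = np.random.default_rng(0)
A = np.eye(20) + 0.1 * rng.normal(size=(20, 20)); A = A @ A.T
b = rng.normal(size=20)
g = lambda x: x - 0.1 * (A @ x - b)

x_fp = np.zeros(20)
for _ in range(100):
    x_fp = g(x_fp)                             # plain iteration, for comparison
x_aa = anderson(g, np.zeros(20))
print("plain fixed-point residual   :", np.linalg.norm(A @ x_fp - b))
print("Anderson-accelerated residual:", np.linalg.norm(A @ x_aa - b))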

Controlled SPDEs: Peng’s Maximum Principle and Numerical Methods

Series
SIAM Student Seminar
Time
Friday, November 17, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Lukas Wessels, Georgia Tech

In this talk, we consider a finite-horizon optimal control problem for stochastic reaction-diffusion equations. First, we apply the spike variation method, which relies on introducing the first- and second-order adjoint states. We give a novel characterization of the second-order adjoint state as the solution to a backward SPDE. Using this representation, we prove the maximum principle for controlled SPDEs.

In the second part, we present a numerical algorithm that allows the efficient approximation of optimal controls in the case of stochastic reaction-diffusion equations with additive noise by first reducing the problem to controls of feedback form and then approximating the feedback function using finitely based approximations. Numerical experiments using artificial neural networks as well as radial basis function networks illustrate the performance of our algorithm.
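A hedged sketch of the last approximation step (my illustration, not the authors' code): once the control is reduced to feedback form u = K(y) of the state, the feedback function K can be approximated by a radial basis function network, which is linear in its output weights and can therefore be fit by least squares at finitely many sampled states. The target feedback law below is hypothetical.

import numpy as np

rng = np.random.default_rng(0)
centers = rng.uniform(-1, 1, size=(30, 1))            # RBF centers in the state variable
width = 0.2

def rbf_features(y):
    """Gaussian RBF features of a batch of scalar states y, shape (n,) -> (n, 30)."""
    return np.exp(-((y[:, None] - centers[None, :, 0]) / width) ** 2)

K_target = lambda y: -np.tanh(3 * y)                   # hypothetical feedback law to mimic
y_samples = rng.uniform(-1, 1, size=400)               # finitely many sampled states

Phi = rbf_features(y_samples)
w, *_ = np.linalg.lstsq(Phi, K_target(y_samples), rcond=None)   # linear least squares

y_test = np.linspace(-1, 1, 5)
print("RBF feedback:", rbf_features(y_test) @ w)
print("target      :", K_target(y_test))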

This talk is based on joint work with Wilhelm Stannat and Alexander Vogler. Talk will also be streamed: https://gatech.zoom.us/j/93808617657?pwd=ME44NWUxbk1NRkhUMzRsK3c0ZGtvQT09

Neural-ODE for PDE Solution Operators

Series
SIAM Student Seminar
Time
Friday, September 29, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Nathan Gaby, Georgia State University

We consider a numerical method to approximate the solution operator for evolutionary partial differential equations (PDEs). By employing a general reduced-order model, such as a deep neural network, we connect the evolution of the model's parameters with trajectories in a corresponding function space. Using the Neural Ordinary Differential Equations (NODE) technique, we learn a vector field over the parameter space such that, from any initial starting point, the resulting trajectory solves the evolutionary PDE. Numerical results are presented for a number of high-dimensional problems where traditional methods fail due to the curse of dimensionality.
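A hedged toy example of the parameter-trajectory idea (mine, not the speaker's code): for the 1D heat equation u_t = u_xx with the Gaussian reduced-order model u(x; theta) = exp(-x^2 / (2 theta)) / sqrt(2 pi theta), the PDE dynamics correspond exactly to the parameter-space ODE d theta / dt = 2. The talk's method learns such a vector field over the parameters of a general model (e.g. a neural network) with the NODE technique; here the vector field is known in closed form, so a forward Euler integration of the parameter ODE reproduces the PDE solution.

import numpy as np

def u_model(x, theta):
    """Gaussian reduced-order model of the solution, parametrized by its variance theta."""
    return np.exp(-x ** 2 / (2 * theta)) / np.sqrt(2 * np.pi * theta)

x = np.linspace(-5, 5, 401)
theta, dt, T = 0.5, 1e-3, 1.0

# Integrate the parameter ODE d theta / dt = V(theta) = 2 with forward Euler.
for _ in range(int(T / dt)):
    theta += dt * 2.0

# Exact heat-equation evolution of the initial Gaussian: variance grows by 2 T.
u_param = u_model(x, theta)
u_exact = u_model(x, 0.5 + 2.0 * T)
print("max discrepancy between parameter trajectory and exact PDE solution:",
      np.abs(u_param - u_exact).max())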
