Quantitative convergence analysis of dynamical processes in machine learning

Dissertation Defense
Tuesday, June 25, 2024 - 10:30am for 2 hours
Skiles 006 and online
Yuqing Wang – Georgia Tech – ywang3398@gatech.edu – https://ywang3398.math.gatech.edu

Zoom link: https://gatech.zoom.us/j/6681416875?pwd=eEc2WEpxeUpCRUFiWXJUM2tPN1MvUT09

This talk analyzes the quantitative convergence of selected important machine learning processes from a dynamical perspective, in order to understand and guide machine learning practice. It consists of four parts: 1) I will illustrate the effect of large learning rates on optimization dynamics in a specific setup; large learning rates often correlate with improved generalization. 2) The theory from part 1 will be extended to a unified mechanism underlying several implicit biases in optimization, including edge of stability, balancing, and catapult. 3) I will concentrate on diffusion models, a concrete and important real-world application, and theoretically demonstrate how to choose their hyperparameters for good performance through a convergence analysis of the full generation process, including optimization and sampling. 4) I will discuss the generalization performance of different architectures, namely deep residual networks (ResNets) and deep feedforward networks (FFNets).