Quantitative convergence analysis of dynamical processes in machine learning
- Series: Dissertation Defense
- Time: Tuesday, June 25, 2024, 10:30 (2 hours)
- Location: Skiles 006 and online
- Speaker: Yuqing Wang – Georgia Tech – ywang3398@gatech.edu
Zoom link: https://gatech.zoom.us/j/6681416875?pwd=eEc2WEpxeUpCRUFiWXJUM2tPN1MvUT09
This talk focuses on the quantitative convergence analysis of several important machine learning processes from a dynamical perspective, with the goal of understanding and guiding machine learning practice. More precisely, it consists of four parts: 1) I will illustrate the effect of large learning rates on optimization dynamics in a specific setup, an effect that often correlates with improved generalization. 2) The theory from part 1 will be extended to a unified mechanism underlying several implicit biases in optimization, including the edge of stability, balancing, and catapult phenomena. 3) I will focus on diffusion models, a concrete and important real-world application, and theoretically demonstrate how to choose their hyperparameters for good performance via a convergence analysis of the full generation process, including both optimization and sampling. 4) The generalization performance of different architectures, namely deep residual networks (ResNets) and deep feedforward networks (FFNets), will be discussed.
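
As a rough illustration of the stability threshold behind the large-learning-rate and edge-of-stability discussion in parts 1 and 2 (a standard textbook sketch, not material from the talk itself): for gradient descent on a quadratic loss,

$$
f(x) = \tfrac{L}{2}x^2, \qquad x_{k+1} = x_k - \eta f'(x_k) = (1 - \eta L)\, x_k,
$$

the iterates converge if and only if $|1 - \eta L| < 1$, i.e. $\eta < 2/L$. The "edge of stability" regime refers to training runs on non-quadratic losses in which the sharpness (the largest Hessian eigenvalue) rises to and then hovers around the corresponding threshold $2/\eta$ instead of remaining safely below it.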