## Seminars and Colloquia by Series

### Fast and optimal algorithm for online portfolios, and beyond

Series
Job Candidate Talk
Time
Thursday, March 9, 2023 - 15:30 for 1 hour (actually 50 minutes)
Location
Skiles 006 and Online via https://gatech.zoom.us/j/98280978183
Speaker
Dmitrii Ostrovskii, USC

In his seminal 1991 paper, Thomas M. Cover introduced a simple and elegant mathematical model for trading on the stock market. This model, which later came to be known as online portfolio selection (OPS), is specified by only two integer parameters: the number of assets $d$ and the time horizon $T$. In each round $t \in \{1, ..., T\}$, the trader selects a portfolio, i.e., a distribution $p_t \in \mathbb{R}^d_+$ of the current capital over the set of $d$ assets; after this, the adversary generates a nonnegative vector $r_t \in \mathbb{R}^d_+$ of returns (relative prices of the assets), and the trader's capital is multiplied by the "aggregated return" $\langle p_{t}, r_{t} \rangle$. Despite its apparent simplicity, this model captures two key properties of the stock market: (i) it "plays against" the trader; (ii) money accumulates multiplicatively. In the 30 years that followed, the OPS model has received a great deal of attention from the learning theory, information theory, and quantitative finance communities.
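The multiplicative capital dynamics described above can be sketched in a few lines. This is an illustration only: the uniform rebalancing portfolio and the return distribution below are placeholder choices, not the strategy or model from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 5  # number of assets and time horizon (illustrative values)

capital = 1.0
for t in range(T):
    # Trader picks a portfolio: a distribution of capital over the d assets.
    # Here a uniform "rebalancing" portfolio stands in for an actual OPS strategy.
    p_t = np.full(d, 1.0 / d)
    # Adversary reveals a nonnegative vector of returns (relative prices).
    r_t = rng.uniform(0.5, 1.5, size=d)
    # Capital is multiplied by the aggregated return <p_t, r_t>.
    capital *= p_t @ r_t

print(capital)
```

Note that capital compounds multiplicatively, which is why regret is measured against the log-wealth of the best fixed portfolio in hindsight.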

In the same paper, Cover also proposed an algorithm, termed Universal Portfolios, that admitted a strong performance guarantee: regret of $O(d \log (T))$ against the best portfolio in hindsight, without any restrictions on returns or portfolios. This guarantee was later shown to be worst-case optimal, and no other algorithm attaining it has been found to date. Unfortunately, exact computation of a universal portfolio amounts to averaging over a log-concave distribution, which is a challenging task. Addressing this, Kalai and Vempala (2002) achieved a running time of $O(d^4 T^{14})$ per round via log-concave sampling techniques. However, with such a running time essentially prohibiting all but "toy" problems, yet remaining state-of-the-art, the problem of finding an optimal and practical OPS algorithm was left open.

In this talk, after discussing some of the arising challenges, I shall present a fast and optimal OPS algorithm proposed in recent work with R. Jezequel and P. Gaillard (arXiv:2209.13932). Our algorithm combines regret optimality with a runtime of $O(d^2 T)$, thus dramatically improving the state of the art. As we shall see, the motivation and analysis of the proposed algorithm are closely related to establishing a sharp bound on the accuracy of the Laplace approximation for a log-concave distribution with polyhedral support, which is a result of independent interest.

### Implicit bias of optimization algorithms and generalization of over-parameterized neural networks

Series
Job Candidate Talk
Time
Monday, February 6, 2023 - 14:00 for 1 hour (actually 50 minutes)
Location
Skiles 005, and https://gatech.zoom.us/j/98355006347
Speaker
Chao Ma, Stanford University

Please Note: The speaker will present in person; the talk will also be livestreamed (but not recorded) at https://gatech.zoom.us/j/98355006347

Modern neural networks are usually over-parameterized: the number of parameters exceeds the number of training data. In this case the loss function tends to have many (or even infinitely many) global minima, which, beyond convergence, poses the challenge of minima selection for optimization algorithms. Specifically, when training a neural network, the algorithm not only has to find a global minimum, but also needs to select, among many others, minima that generalize well. We study the mechanisms that facilitate global minima selection by optimization algorithms, as well as their connection with good generalization performance. First, with a linear stability theory, we show that stochastic gradient descent (SGD) favors global minima with a flat and uniform landscape. Then, we build a theoretical connection between flatness and generalization performance based on a special multiplicative structure of neural networks. Connecting the two results, we develop generalization bounds for neural networks trained by SGD; our bounds take the optimization process into consideration. Furthermore, we study the behavior of optimization algorithms around the manifold of minima and reveal how algorithms explore from one minimum to another.

### Continuous combinatorics and natural quasirandomness

Series
Job Candidate Talk
Time
Wednesday, February 1, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker

The theory of graph quasirandomness studies graphs that "look like" samples of the Erdős--Rényi
random graph $G_{n,p}$. The upshot of the theory is that several ways of comparing a sequence with
the random graph turn out to be equivalent. For example, two equivalent characterizations of
quasirandom graph sequences are as those that are uniquely colorable or uniquely orderable, that is,
those in which all colorings (orderings, respectively) of the graphs "look approximately the same".
Since then, generalizations of the theory of quasirandomness have been obtained in an ad hoc way for
several different combinatorial objects, such as digraphs, tournaments, hypergraphs, and permutations.

The theory of graph quasirandomness was one of the main motivations for the development of the
theory of limits of graph sequences, graphons. Similarly to quasirandomness, generalizations of
graphons were obtained in an ad hoc way for several combinatorial objects. However, unlike
quasirandomness, the theory of limits of combinatorial objects (continuous combinatorics) has been
developed in a uniform and general framework: the theories of flag algebras and theons provide
limits of arbitrary combinatorial objects.

In this talk, I will present the theory of natural quasirandomness, which provides a uniform and
general treatment of quasirandomness in the same setting as continuous combinatorics. The talk will
focus on the first main result of natural quasirandomness: the equivalence of unique colorability
and unique orderability for arbitrary combinatorial objects. Although the theory heavily uses the
language and techniques of continuous combinatorics from both flag algebras and theons, no
familiarity with the topic is required as I will also briefly cover all definitions and theorems
necessary.

This talk is based on joint work with Alexander A. Razborov.

### Towards a theory of complexity of sampling, inspired by optimization

Series
Job Candidate Talk
Time
Monday, January 30, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006 and https://gatech.zoom.us/j/91578357941?pwd=QS9malIvMVJqaWhpT0xqdWtxMCs1QT09
Speaker
Sinho Chewi, MIT

Sampling is a fundamental and widespread algorithmic primitive that lies at the heart of Bayesian inference and scientific computing, among other disciplines. Recent years have seen a flood of works aimed at laying down the theoretical underpinnings of sampling, in analogy to the fruitful and widely used theory of convex optimization. In this talk, I will discuss some of my work in this area, focusing on new convergence guarantees obtained via a proximal algorithm for sampling, as well as a new framework for studying the complexity of non-log-concave sampling.

### Lipschitz mass transport

Series
Job Candidate Talk
Time
Thursday, January 26, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Dan Mikulincer, Department of Mathematics, MIT

A central question in the field of optimal transport studies optimization problems involving two measures on a common metric space, a source and a target. The goal is to find a mapping from the source to the target in a way that minimizes distances. A remarkable fact discovered by Caffarelli is that, in some specific cases of interest, the optimal transport maps on a Euclidean metric space are Lipschitz. Lipschitz regularity is a desirable property because it allows for the transfer of analytic properties between measures. This perspective has proven to be widely influential, with applications extending beyond the field of optimal transport.

In this talk, we will further explore the Lipschitz properties of transport maps. Our main observation is that, when one seeks Lipschitz mappings, the optimality conditions mentioned above do not play a major role. Instead of minimizing distances, we will consider a general construction of transport maps based on interpolation of measures, and introduce a set of techniques to analyze the Lipschitz constant of this construction. In particular, we will go beyond the Euclidean setting and consider Riemannian manifolds as well as infinite-dimensional spaces.

Some applications, such as functional inequalities, normal approximations, and generative diffusion models will also be discussed.

### Effective equations for large systems of particles or waves

Series
Job Candidate Talk
Time
Monday, January 23, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles Room 006
Speaker
Ioakeim Ampatzoglou, NYU Courant Institute

Understanding the behavior of large physical systems is a problem of fundamental importance in mathematical physics. Analysis of systems of many interacting particles is key for understanding various phenomena from the physical sciences (e.g. gases in non-equilibrium, galactic dynamics) to the social sciences (e.g. modeling social networks). Similarly, the description of systems of weakly nonlinear interacting waves, referred to as wave turbulence theory, finds a wide range of applications from solid state physics and water waves to plasma theory and oceanography. However, with the size of the system of interest being extremely large, deterministic prediction of its behavior is practically impossible, and one resorts to an averaged description. In this talk, we will discuss kinetic theory, which is a mesoscopic framework to study the qualitative properties of large systems. As we will see, the main idea behind kinetic theory is that, in order to identify averaged quantities of large systems, one studies their asymptotic behavior as the size of the system tends to infinity, with the hope that the limiting effective equation will reveal properties observed in a system of large, but finite, size. We will focus on the Boltzmann equation, which is the effective equation for systems of interacting particles, and its higher order extensions, as well as the kinetic wave equation, which describes systems of many nonlinearly interacting waves.

### Complexity and asymptotics in Algebraic Combinatorics

Series
Job Candidate Talk
Time
Thursday, January 19, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006 and Zoom
Speaker
Greta Panova, University of Southern California

Please Note: Refreshments available from 10:30 in Skiles Atrium. Talk will be streamed via https://gatech.zoom.us/j/94839708119?pwd=bmE1WXFTTzdFVDBtYzlvWUc3clFlZz09 but not recorded.

Algebraic Combinatorics originated in Algebra and Representation Theory, yet its objects and methods have turned out to be applicable to other fields, from Probability to Computer Science. Its flagship hook-length formula for the number of Standard Young Tableaux and the beautiful Littlewood-Richardson rule have inspired large areas of study and development. We will see what lies beyond the reach of such nice product formulas and combinatorial interpretations and enter the realm of Computational Complexity and Asymptotics. We will also show how an 80-year-old open problem on Kronecker coefficients of the Symmetric group led to the disproof of a wishful approach towards the resolution of the algebraic P vs NP Millennium problem.
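The hook-length formula mentioned above is concrete enough to compute directly: the number of Standard Young Tableaux of a partition $\lambda$ of $n$ is $n!$ divided by the product of the hook lengths. A minimal sketch (the function name and the example partitions are mine, for illustration):

```python
from math import factorial

def num_syt(shape):
    """Number of standard Young tableaux of a partition, via the
    Frame-Robinson-Thrall hook-length formula: n! / prod(hook lengths)."""
    n = sum(shape)
    hook_product = 1
    for i, row_len in enumerate(shape):
        for j in range(row_len):
            arm = row_len - j - 1                          # cells to the right
            leg = sum(1 for r in shape[i + 1:] if r > j)   # cells below
            hook_product *= arm + leg + 1
    return factorial(n) // hook_product

print(num_syt((2, 1)))  # the two tableaux of shape (2,1)
print(num_syt((3, 2)))
```

For shape $(2,1)$ the hook lengths are $3, 1, 1$, giving $3!/3 = 2$ tableaux.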

### Bias in cubic Gauss sums: Patterson's conjecture

Series
Job Candidate Talk
Time
Wednesday, January 18, 2023 - 11:05 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Alexander Dunn, Caltech

Large sieve inequalities are a fundamental tool used to investigate prime numbers and exponential sums. I will explain my work that resolves a 1978 conjecture of S. Patterson (conditional on the Generalized Riemann Hypothesis) concerning the bias of cubic Gauss sums. This explains a well-known numerical bias first observed by Kummer in 1846. One important byproduct of my work is a proof that Heath-Brown's famous cubic large sieve is sharp, contrary to popular belief. This sheds light on some of the mysteries surrounding large sieve inequalities for certain families of arithmetic harmonics and gives strong clues on where to look next for further progress. This is based on joint work with Maksym Radziwill.

### Randomness in Ramsey theory and coding theory

Series
Job Candidate Talk
Time
Tuesday, January 17, 2023 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Xiaoyu He, Princeton University

Two of the most influential theorems in discrete mathematics state, respectively, that diagonal Ramsey numbers grow exponentially and that error-correcting codes for noisy channels exist up to the information limit. The former, proved by Erdős in 1947 using random graphs, led to the development of the probabilistic method in combinatorics. The latter, proved by Shannon in 1948 using random codes, is one of the founding results of coding theory. Since then, the probabilistic method has been a cornerstone in the development of both Ramsey theory and coding theory. In this talk, we highlight a few important applications of the probabilistic method in these two parallel but interconnected worlds. We then present new results on Ramsey numbers of graphs and hypergraphs and codes correcting deletion errors, all based on probabilistic ideas.
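Erdős's 1947 argument mentioned above is short enough to illustrate: if $\binom{n}{k}\,2^{1-\binom{k}{2}} < 1$, then a uniformly random 2-coloring of $K_n$ avoids monochromatic $K_k$ with positive probability, so $R(k,k) > n$. The sketch below (function name and the choice $k=4$ are mine) computes the largest $n$ certified by this union bound.

```python
from math import comb

def erdos_lower_bound(k):
    """Largest n certified by Erdos's 1947 union bound: while
    C(n+1, k) * 2^(1 - C(k, 2)) < 1, a random 2-coloring of K_{n+1}
    has no monochromatic K_k with positive probability, so R(k,k) > n+1."""
    n = k  # R(k,k) > k holds trivially
    while comb(n + 1, k) * 2 < 2 ** comb(k, 2):
        n += 1
    return n

print(erdos_lower_bound(4))  # certifies R(4,4) > 6
```

The bound grows roughly like $2^{k/2}$, exponential in $k$, which is the content of Erdős's theorem; the true value of $R(4,4)$ is 18, so the union bound is far from tight for small $k$.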

### Stochastic partial differential equations in supercritical, subcritical, and critical dimensions

Series
Job Candidate Talk
Time
Friday, January 13, 2023 - 11:05 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Alexander Dunlap, Courant Institute, NYU

A pervading question in the study of stochastic PDE is how small-scale random forcing in an equation combines to create nontrivial statistical behavior on large spatial and temporal scales. I will discuss recent progress on this topic for several related stochastic PDEs (the stochastic heat, KPZ, and Burgers equations) and some of their generalizations. These equations are (conjecturally) universal models of physical processes such as a polymer in a random environment, the growth of a random interface, branching Brownian motion, and the voter model. The behavior of solutions on large scales is complex, and in particular depends qualitatively on the dimension of the space. I will describe the phenomenology, and then describe several results and challenging problems on invariant measures, growth exponents, and limiting distributions.