## Seminars and Colloquia by Series

### Approximate Kernel Principal Component Analysis: Computational vs. Statistical Trade-off

Series
Stochastics Seminar
Time
Thursday, October 8, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://gatech.webex.com/gatech/j.php?MTID=mdd4512d3d11623149a0bd46d9fc086c8
Speaker
Bharath SriperumbudurPennsylvania State University

Kernel principal component analysis (KPCA) is a popular non-linear dimensionality reduction technique, which generalizes classical linear PCA by finding functions in a reproducing kernel Hilbert space (RKHS) such that the function evaluation at a random variable $X$ has a maximum variance. Despite its popularity, kernel PCA suffers from poor scalability in big data scenarios as it involves solving a $n \times n$ eigensystem leading to the computational complexity of $O(n^3)$ with $n$ being the number of samples. To address this issue, in this work, we consider a random feature approximation to kernel PCA which requires solving an $m \times m$ eigenvalue problem and therefore has a computational complexity of $O(m^3+nm^2)$, implying that the approximate method is computationally efficient if $m$ < $n$ with $m$ being the number of random features. The goal of this work is to investigate the trade-off between computational and statistical behaviors of approximate KPCA, i.e., whether the computational gain is achieved at the cost of statistical efficiency. We show that the approximate KPCA is both computationally and statistically efficient compared to KPCA in terms of the error associated with reconstructing a kernel function based on its projection onto the corresponding eigenspaces.

Link to Cisco Webex meeting: https://gatech.webex.com/gatech/j.php?MTID=mdd4512d3d11623149a0bd46d9fc086c8

### Topology of cable knots

Series
Geometry Topology Student Seminar
Time
Wednesday, October 7, 2020 - 14:00 for 1 hour (actually 50 minutes)
Location
Speaker
Hyunki MinGeorgia Tech

Cabling is one of important knot operations. We study various properties of cable knots and how to characterize the cable knots by its complement.

### Inducibility of graphs and tournaments

Series
Graph Theory Seminar
Time
Tuesday, October 6, 2020 - 15:45 for 1 hour (actually 50 minutes)
Location
Speaker

A classical question in extremal graph theory asks to maximize the number of induced copies of a given graph or tournament in a large host graph, often expressed as a density. A simple averaging argument shows that the limit of this density exists as the host graph is allowed to grow. Razborov's flag algebra method is well suited to generate bounds on these quantities with the help of semidefinite programming. We will explore this method for a few small examples, and see how to modify it to fit our questions. The extremal graphs show some beautiful structures, sometimes fractal like, sometimes quasi random and sometimes even a combination of both.

### An enhanced uncertainty principle

Series
Analysis Seminar
Time
Tuesday, October 6, 2020 - 14:00 for 1 hour (actually 50 minutes)
Location
https://us02web.zoom.us/j/71579248210?pwd=d2VPck1CbjltZStURWRWUUgwTFVLZz09
Speaker
Joaquim Ortega-CerdaUniversity of Barcelona

We improve on some recent results of Sagiv and Steinerberger that quantify the following uncertainty principle: for a function f with mean zero, then either the size of the zero set of the function or the cost of transporting the mass of the positive part of f to its negative part must be big. We also provide a sharp upper estimate of the transport cost of the positive part of an eigenfunction of the Laplacian.

This proves a conjecture of Steinerberger and provides a lower bound of the size of a nodal set of the eigenfunction. Finally, we use a similar technique to provide a measure of how well the points in a design in a manifold are equidistributed. This is a joint work with Tom Carroll and Xavier Massaneda.

### Regression of functions on a low-dimensional set by neural networks

Series
Time
Monday, October 5, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
Bluejeans meeting https://bluejeans.com/759112674
Speaker
Dr. Wenjing LiaoGeorgia Tech

Many data set in image analysis and signal processing is in a high-dimensional space but exhibit low-dimensional structures. For example, data can be modeled as point clouds in a high-dimensional space but are concentrated on a low-dimensional set (or manifold in particular). Our goal is to estimate functions on the low-dimensional manifold from finite samples of data, for statistical inference and prediction. This talk introduces approximation theories of neural networks for functions supported on a low-dimensional manifold. When the function is estimated from finite samples, we give an estimate of the mean squared error for the approximation of these functions. The convergence rate depends on the intrinsic dimension of the manifold instead of the ambient dimension of the data. These results demonstrate that neural networks are adaptive to low-dimensional geometric structures of data.

### Algorithms for minimum norm combinatorial optimization

Series
ACO Alumni Lecture
Time
Friday, October 2, 2020 - 13:05 for 1 hour (actually 50 minutes)
Location
Speaker
Deeparnab ChakrabartyDartmouth College, NH

In many optimization problems, a feasible solution induces a multi-dimensional cost vector. For example, in load-balancing a schedule induces a load vector across the machines. In k-clustering, opening k facilities induces an assignment cost vector across the clients. In this paper we consider the following minimum norm optimization problem : given an arbitrary monotone, symmetric norm, find a solution which minimizes the norm of the induced cost-vector. This generalizes many fundamental NP-hard problems. We give a general framework to tackle the minimum norm problem, and illustrate its efficacy in load balancing and, time permitting, in the clustering setting.

(The speaker is an ACO alum; after the lecture, the speaker will engage with the ACO students for 30-45 minutes.)

### ACO Alumni Lecture series: Algorithms for minimum norm combinatorial optimization

Series
ACO Student Seminar
Time
Friday, October 2, 2020 - 13:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/264244877/0166
Speaker
Dr. Deepernab ChakrabartyCS, Dartmouth College

In many optimization problems, a feasible solution induces a multi-dimensional cost vector. For example, in load-balancing a schedule induces a load vector across the machines. In k-clustering, opening k facilities induces an assignment cost vector across the clients. In this paper we consider the following minimum norm optimization problem : given an arbitrary monotone, symmetric norm, find a solution which minimizes the norm of the induced cost-vector. This generalizes many fundamental NP-hard problems. We give a general framework to tackle the minimum norm problem, and illustrate its efficacy in load balancing and, time permitting, in the clustering setting.

### Learning latent combinatorial structures in noisy data

Series
Research Horizons Seminar
Time
Friday, October 2, 2020 - 12:30 for 1 hour (actually 50 minutes)
Location
Microsoft Teams
Speaker
Cheng MaoGeorgia Tech

Learning latent structures in noisy data has been a central task in statistical and computational sciences. For applications such as ranking, matching and clustering, the structure of interest is non-convex and, furthermore, of combinatorial nature. This talk will be a gentle introduction to selected models and methods for statistical inference of such combinatorial structures. I will particularly discuss some of my recent research interests.

### Introduction to Kajiwara-Payne Tropicalization I

Series
Student Algebraic Geometry Seminar
Time
Friday, October 2, 2020 - 09:00 for 1 hour (actually 50 minutes)
Location
Speaker
Trevor GunnGeorgia Tech

The goal of this talk is to present a summary of Sam Payne's 2009 paper "Analytification is the limit of all tropicalizations" (Math. Res. Lett. 16, no. 3 543–556). We will introduce Berkovich analytic spaces, tropicalization of projective varieties, and tropicalization of closed subvarieties of toric varieties, as well as the connections between these concepts. We will try to present many examples.

Note: Part I will focus on tropicalization of affine varieties and Berkovich analytic spaces, Part II will focus on tropicalization of toric varieties and discuss Sam Payne's theorem.

### A precise high-dimensional theory for Boosting

Series
Stochastics Seminar
Time
Thursday, October 1, 2020 - 15:30 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/276389634
Speaker
Pragya SurHarvard University

This talk will introduce a precise high-dimensional asymptotic theory for Boosting (AdaBoost) on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the generalization error of AdaBoost, when the algorithm interpolates the training data and maximizes an empirical L1 margin. On the computational front, we will provide a sharp analysis of the stopping time when boosting approximately maximizes the empirical L1 margin. Our theory provides several insights into properties of Boosting; for instance, the larger the dimensionality ratio p/n, the faster the optimization reaches interpolation. At the heart of our theory lies an in-depth study of the maximum L1-margin, which can be accurately described by a new system of non-linear equations; we analyze this margin and the properties of this system, using Gaussian comparison techniques and a novel uniform deviation argument. Time permitting, I will present analogous results for a new class of boosting algorithms that correspond to Lq geometry, for q>1. This is based on joint work with Tengyuan Liang.