## This Week's Seminars and Colloquia

Monday, April 22, 2019 - 12:50 , Location: Skiles 005 , Joe Kileel , Princeton University , Organizer: Justin Chen

This talk will be about polynomial decompositions that are relevant in machine learning.  I will start with the well-known low-rank symmetric tensor decomposition, and present a simple new algorithm with local convergence guarantees, which seems to handily outperform the state-of-the-art in experiments.  Next I will consider a particular generalization of symmetric tensor decomposition, and apply this to estimate subspace arrangements from very many, very noisy samples (a regime in which current subspace clustering algorithms break down).  Finally I will switch gears and discuss representability of polynomials by deep neural networks with polynomial activations.  The various polynomial decompositions in this talk motivate questions in commutative algebra, computational algebraic geometry and optimization.  The first part of this talk is joint with Emmanuel Abbe, Tamir Bendory, Joao Pereira and Amit Singer, while the latter part is joint with Matthew Trager.

Monday, April 22, 2019 - 13:55 , Location: Skiles 005 , , University of South Carolina , , Organizer: Wenjing Liao

The talk presents an extension to high dimensions of an idea from a recent result concerning near-optimal adaptive finite element methods (AFEM). The usual adaptive strategy for finding conforming partitions in AFEM is "mark → subdivide → complete". In this strategy any element can be marked for subdivision, but since the resulting partition often contains hanging nodes, additional elements have to be subdivided in the completion step to get a conforming partition. This process is very well understood for triangulations obtained via the newest vertex bisection procedure. In particular, it is proven that the number of elements in the final partition is bounded by a constant times the number of marked cells. This motivated us [B., Fierro, Veeser, in preparation] to design a marking procedure that is restricted to cells of the partition whose subdivision will result in a conforming partition, so that no completion step is necessary. We also proved that this procedure is near best in terms of both error of approximation and complexity. This result is formulated in terms of tree approximations and opens the possibility of designing similar algorithms in high dimensions using sparse occupancy trees introduced in [B., Dahmen, Lamby, 2011]. The talk describes the framework of approximating high-dimensional data using conforming sparse occupancy trees.

Monday, April 22, 2019 - 14:00 , Location: Skiles 006 , Adam Levine , Duke University , Organizer: Caitlin Leverson

Given an m-dimensional manifold M that is homotopy equivalent to an n-dimensional manifold N (where n<m), a spine of M is a piecewise-linear embedding of N into M (not necessarily locally flat) realizing the homotopy equivalence. When m-n=2 and m>4, Cappell and Shaneson showed that if M is simply-connected or if m is odd, then it contains a spine. In contrast, I will show that there exist smooth, compact, simply-connected 4-manifolds which are homotopy equivalent to the 2-sphere but do not contain a spine (joint work with Tye Lidman). I will also discuss some related results about PL concordance of knots in homology spheres (joint with Lidman and Jen Hom).

Monday, April 22, 2019 - 15:30 , Location: Skiles 006 , Eli Grigsby , Boston College , Organizer: Caitlin Leverson

One can regard a (trained) feedforward neural network as a particular type of function $\mathbb{R}^d \rightarrow (0,1)$, where $\mathbb{R}^d$ is a (typically high-dimensional) Euclidean space parameterizing some data set, and the value $N(x) \in (0,1)$ of the function on a data point $x$ is the probability that the answer to a particular yes/no question is "yes." It is a classical result in the subject that a sufficiently complex neural network can approximate any function on a bounded set. Last year, J. Johnson proved that universality results of this kind depend on the architecture of the neural network (the number and dimensions of its hidden layers). His argument was novel in that it provided an explicit topological obstruction to representability of a function by a neural network, subject to certain simple constraints on its architecture. I will tell you just enough about neural networks to understand how Johnson's result follows from some very simple ideas in piecewise linear geometry. Time permitting, I will also describe some joint work in progress with K. Lindsey aimed at developing a general theory of how the architecture of a neural network constrains its topological expressiveness.
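To make the notion of a network as a function $\mathbb{R}^d \rightarrow (0,1)$ concrete, here is a minimal sketch in plain Python. It assumes ReLU hidden layers and a logistic (sigmoid) output, which the abstract does not specify; the function name `network` and all weights in the usage note are hypothetical and chosen only for illustration.

```python
import math


def relu(t):
    """Piecewise linear activation (an assumption; the abstract names none)."""
    return max(0.0, t)


def sigmoid(z):
    """Numerically stable logistic function with values in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def network(x, hidden_layers, output_weights, output_bias):
    """Evaluate N(x) for a feedforward network.

    hidden_layers is a list of (weights, biases) pairs, where weights is a
    list of rows (one row per neuron in that layer). The list of layer
    widths is the "architecture" of the network.
    """
    h = list(x)
    for weights, biases in hidden_layers:
        h = [relu(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    z = sum(w * v for w, v in zip(output_weights, h)) + output_bias
    return sigmoid(z)
```

For instance, with one hidden layer of width 2 on $\mathbb{R}^2$, `network([2.0, 1.0], [([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0])], [1.0, 1.0], -1.0)` applies the sigmoid to a piecewise linear function of the input, so the decision region $\{x : N(x) > 1/2\}$ is cut out by finitely many linear pieces.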

Series: Other Talks
Tuesday, April 23, 2019 - 15:00 , Location: Skiles 006 , , Georgia Institute of Technology , , Organizer: Jaemin Park

We study whether all stationary solutions of the 2D Euler equation must be radially symmetric if the vorticity is compactly supported or has some decay at infinity. Our main results are the following:

(1) On the one hand, we are able to show that any non-negative smooth stationary vorticity that is compactly supported (or has certain decay as $|x| \to \infty$) must be radially symmetric up to a translation.

(2) On the other hand, if we allow vorticity to change sign, then by applying bifurcation arguments to sign-changing radial patches, we are able to show that there exists a compactly-supported, sign-changing smooth stationary vorticity that is non-radial.

We have also obtained some symmetry results for uniformly rotating solutions of the 2D Euler equation, as well as stationary/rotating solutions of the SQG equation. The symmetry results are mainly obtained by calculus of variations and elliptic equation techniques. This is joint work with Javier Gomez-Serrano, Jia Shi and Yao Yao.

Wednesday, April 24, 2019 - 00:05 , Location: Skiles 006 , Lutz Warnke , Georgia Tech

During the last 30 years there has been much interest in random graph processes, i.e., random graphs which grow by adding edges (or vertices) step-by-step in some random way. Part of the motivation stems from more realistic modeling, since many real-world networks such as Facebook evolve over time. Further motivation stems from extremal combinatorics, where these processes lead to some of the best known bounds in Ramsey and Turán theory (that go beyond textbook applications of the probabilistic method). I will review several random graph processes of interest, and (if time permits) illustrate one of the main proof techniques using a simple toy example.

Friday, April 26, 2019 - 12:00 , Location: Skiles 006 , Jaewoo Jung , Georgia Institute of Technology , Organizer: Trevor Gunn

It is known, by a result of Hilbert, that non-negative homogeneous polynomials (forms) over $\mathbb{R}$ are the same as sums of squares if they are bivariate forms, quadratic forms, or ternary quartics. Once we know a form is a sum of squares, the next natural question is how many squares are needed to represent it. We call the minimal number of summands in such a representation the rank (of the sum of squares). Ranks of some classes of forms are known. For example, any bivariate form (allowing all monomials) can be written as a sum of $2$ squares (i.e., its rank is $2$), and every non-negative ternary quartic can be written as a sum of $3$ squares (i.e., its rank is $3$). Our question is: "If we do not allow some monomials in a bivariate form, what will its rank be?" In the talk, we will introduce this problem with an algebraic geometry flavor and provide some notions and tools for dealing with it.
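As a small illustration of a sum-of-squares representation with two summands (an example chosen here, not taken from the talk), the non-negative bivariate quartic $x^4 + x^2y^2 + y^4$ can be written as

$$
x^4 + x^2 y^2 + y^4 = \left(x^2 + \tfrac{1}{2} y^2\right)^2 + \left(\tfrac{\sqrt{3}}{2}\, y^2\right)^2 ,
$$

which can be checked by expanding the first square to $x^4 + x^2y^2 + \tfrac{1}{4}y^4$ and adding $\tfrac{3}{4}y^4$ from the second.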

Sunday, April 28, 2019 - 15:05 , Location: 006 , Liza Rebrova , UCLA

I will talk about the structure of large square random matrices with centered i.i.d. heavy-tailed entries (only two finite moments are assumed). In our previous work with R. Vershynin we showed that the operator norm of such a matrix $A$ can be reduced to the optimal $\sqrt{n}$ order with high probability by zeroing out a small submatrix of $A$, but we did not describe the structure of this "bad" submatrix, nor provide a constructive way to find it. Now we can give a very simple description of this small "bad" subset: it is enough to zero out a small fraction of the rows and columns of $A$ with the largest $L_2$ norms to bring its operator norm to the almost optimal $\sqrt{n \log\log n}$ order, under the additional assumption that the entries of $A$ are symmetrically distributed. As a corollary, one can also obtain a constructive procedure to find a small submatrix of $A$ that one can zero out to achieve the same regularization.

I am planning to discuss some details of the proof, the main component of which is the development of techniques that extend constructive regularization approaches known for the Bernoulli matrices (from the works of Feige and Ofek, and Le, Levina and Vershynin) to the considerably broader class of heavy-tailed random matrices.
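The regularization step described in this abstract (zeroing out the rows and columns with the largest $L_2$ norms) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the speaker's procedure: the abstract does not say how large the "small fraction" should be, so the `fraction` parameter is hypothetical, as are the function names, and the operator norm is estimated by ordinary power iteration.

```python
import math
import random


def zero_heaviest_rows_and_columns(a, fraction):
    """Return a copy of the square matrix `a` (list of rows) in which the
    ceil(fraction * n) rows and columns of largest L2 norm are set to zero.
    `fraction` is a hypothetical tuning parameter, not a value from the talk."""
    n = len(a)
    k = math.ceil(fraction * n)
    row_norms = [math.sqrt(sum(x * x for x in row)) for row in a]
    col_norms = [math.sqrt(sum(a[i][j] ** 2 for i in range(n)))
                 for j in range(n)]
    bad_rows = set(sorted(range(n), key=lambda i: row_norms[i], reverse=True)[:k])
    bad_cols = set(sorted(range(n), key=lambda j: col_norms[j], reverse=True)[:k])
    return [[0.0 if (i in bad_rows or j in bad_cols) else a[i][j]
             for j in range(n)] for i in range(n)]


def operator_norm(a, iterations=200, seed=0):
    """Estimate the largest singular value of `a` by power iteration on A^T A."""
    n = len(a)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]  # w = A v
        v = [sum(a[i][j] * w[i] for i in range(n)) for j in range(n)]  # v = A^T w
        length = math.sqrt(sum(x * x for x in v))
        if length == 0.0:
            return 0.0  # A v vanished, e.g. for the zero matrix
        v = [x / length for x in v]
    w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(x * x for x in w))


if __name__ == "__main__":
    # Toy matrix: one very large entry plays the role of a heavy-tailed outlier.
    a = [[1.0] * 4 for _ in range(4)]
    a[0][0] = 100.0
    cleaned = zero_heaviest_rows_and_columns(a, fraction=0.25)
    print(operator_norm(a), operator_norm(cleaned))
```

In the toy matrix the single outlier dominates the operator norm, and removing the heaviest row and column leaves a block of ones whose norm is much smaller. The probabilistic guarantees stated in the abstract concern large random matrices and are not demonstrated by this small deterministic example.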