Robustly Clustering a Mixture of Gaussians

Series
High Dimensional Seminar
Time
Wednesday, March 11, 2020 - 3:00pm for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Santosh Vempala – Georgia Tech
Organizer
Galyna Livshyts

Please Note: We give an efficient algorithm for robustly clustering of a mixture of two arbitrary Gaussians, a central open problem in the theory of computationally efficient robust estimation, assuming only that the the means of the component Gaussians are well-separated or their covariances are well-separated. Our algorithm and analysis extend naturally to robustly clustering mixtures of well-separated logconcave distributions. The mean separation required is close to the smallest possible to guarantee that most of the measure of the component Gaussians can be separated by some hyperplane (for covariances, it is the same condition in the second degree polynomial kernel). Our main tools are a new identifiability criterion based on isotropic position, and a corresponding Sum-of-Squares convex programming relaxation. This is joint work with He Jia.