A semidefinite relaxation of k-means clustering

Series
Analysis Seminar
Time
Wednesday, April 20, 2016 - 2:05pm for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Dustin Mixon – Ohio state University
Organizer
Shahaf Nitzan
Recently, Awasthi et al proved that a semidefinite relaxation of the k-means clustering problem is tight under a particular data model called the stochastic ball model. This result exhibits two shortcomings: (1) naive solvers of the semidefinite program are computationally slow, and (2) the stochastic ball model prevents outliers that occur, for example, in the Gaussian mixture model. This talk will cover recent work that tackles each of these shortcomings. First, I will discuss a new type of algorithm (introduced by Bandeira) that combines fast non-convex solvers with the optimality certificates provided by convex relaxations. Second, I will discuss how to analyze the semidefinite relaxation under the Gaussian mixture model. In this case, outliers in the data obstruct tightness in the relaxation, and so fundamentally different techniques are required. Several open problems will be posed throughout.This is joint work with Takayuki Iguchi and Jesse Peterson (AFIT), as well as Soledad Villar and Rachel Ward (UT Austin).