Seminars and Colloquia by Series

A geometric analysis of subspace clustering with outliers

Series
High-Dimensional Phenomena in Statistics and Machine Learning Seminar
Time
Friday, July 6, 2012 - 15:05 for 1.5 hours (actually 80 minutes)
Location
Skiles 005
Speaker
Mahdi Soltanolkotabi, Stanford University
One of the most fundamental steps in data analysis and dimensionality reduction consists of approximating a given dataset by a single low-dimensional subspace, which is classically achieved via Principal Component Analysis (PCA). However, in many applications, the data often lie near a union of low-dimensional subspaces, reflecting the multiple categories or classes a set of observations may belong to. In this talk we discuss the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower-dimensional planes. Simply stated, the task is to assign each data point to a cluster so as to recover all the hidden subspaces. As is common in computer vision and unsupervised learning applications, we do not know in advance how many subspaces there are, nor do we have any information about their dimensions. We present a novel geometric analysis of an algorithm named sparse subspace clustering (SSC), which significantly broadens the range of problems where it is provably effective. For instance, we show that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension. We also show that SSC can correctly cluster data points even when the subspaces of interest intersect. Further, we develop an extension of SSC that succeeds when the data set is corrupted with possibly overwhelmingly many outliers. Underlying our analysis are clear geometric insights, which may bear on other sparse recovery problems. We will also demonstrate the effectiveness of these methods through various numerical studies.
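As a concrete illustration, here is a minimal Python sketch of the standard SSC pipeline (sparse self-expression followed by spectral clustering). It is not the speaker's exact formulation: the lasso weight lam is an illustrative choice, and the sketch takes the number of clusters as given, whereas the talk treats the number of subspaces as unknown.

```python
# Minimal SSC sketch: each point is written as a sparse combination of the
# other points; the coefficients define an affinity graph that is then
# clustered spectrally.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def ssc(X, n_clusters, lam=0.01):
    """X: (n_points, ambient_dim) matrix with one data point per row."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        # Sparse self-expression: x_i ~ sum_{j != i} c_j x_j. Points lying
        # on the same subspace tend to receive the nonzero weights.
        fit = Lasso(alpha=lam, fit_intercept=False).fit(X[others].T, X[i])
        C[i, others] = fit.coef_
    W = np.abs(C) + np.abs(C).T          # symmetric affinity matrix
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(W)
```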

Minimax Rates of Estimation for Sparse PCA in High Dimensions

Series
High-Dimensional Phenomena in Statistics and Machine Learning Seminar
Time
Tuesday, February 21, 2012 - 16:00 for 1.5 hours (actually 80 minutes)
Location
Skiles 006
Speaker
Karim Lounici, Georgia Institute of Technology, School of Mathematics
This presentation is based on the papers by D. Paul and I. Johnstone (2007) and V.Q. Vu and J. Lei (2012). Here is the abstract of the second paper. We study sparse principal components analysis in the high-dimensional setting, where $p$ (the number of variables) can be much larger than $n$ (the number of observations). We prove optimal, non-asymptotic lower and upper bounds on the minimax estimation error for the leading eigenvector when it belongs to an $l_q$ ball for $q \in [0,1]$. Our bounds are sharp in $p$ and $n$ for all $q \in [0,1]$ over a wide class of distributions. The upper bound is obtained by analyzing the performance of $l_q$-constrained PCA. In particular, our results provide convergence rates for $l_1$-constrained PCA.
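To make the estimator concrete, the following is a minimal Python sketch of a sparse leading-eigenvector computation via soft-thresholded power iteration, a common practical surrogate for the $l_1$-constrained PCA analyzed in the talk; the threshold tau and the iteration count are illustrative assumptions, not the paper's choices.

```python
# Soft-thresholded power iteration: a sparse surrogate for l_1-constrained
# PCA applied to a sample covariance matrix S.
import numpy as np

def sparse_leading_eigenvector(S, tau=0.1, n_iter=200, seed=0):
    """S: (p, p) sample covariance; returns an approximately sparse unit vector."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(S.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = S @ v                                          # power step
        w = np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)  # soft-threshold
        nrm = np.linalg.norm(w)
        if nrm == 0.0:                                     # threshold too aggressive
            break
        v = w / nrm                                        # back to the unit sphere
    return v
```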

Estimation of Low Rank Kernels on Graphs

Series
High-Dimensional Phenomena in Statistics and Machine Learning Seminar
Time
Tuesday, February 14, 2012 - 16:00 for 1.5 hours (actually 80 minutes)
Location
Skiles 006
Speaker
Vladimir Koltchinskii, Georgia Institute of Technology, School of Mathematics
Let $(V, E)$ be a graph with vertex set $V$ and edge set $E$. Let $(X, X', Y) \in V \times V \times \{-1, 1\}$ be a random triple, where $X, X'$ are independent uniformly distributed vertices and $Y$ is a label indicating whether $X, X'$ are "similar", or not. Our goal is to estimate the regression function $S_*(u, v) = \mathbb{E}(Y \mid X = u, X' = v)$, $u, v \in V$, based on $n$ i.i.d. copies $(X_1, X'_1, Y_1), \ldots, (X_n, X'_n, Y_n)$ of $(X, X', Y)$. We are interested in this problem in the case when $S_*: V \times V \to [-1, 1]$ is a symmetric low rank kernel (or it can be well approximated by low rank kernels). In addition to this, assume that $S_*$ is "smooth" on the graph. We study estimators based on a modified least squares method with complexity penalization involving both the nuclear norm and Sobolev type norms of symmetric kernels on the graph (defined in terms of its Laplacian). We prove error bounds for such estimators that improve the existing bounds in low rank matrix recovery in the cases when the target kernel is not only low rank, but also sufficiently smooth. The talk is based in part on a joint project with Pedro Rangel.
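For intuition, here is a minimal Python sketch of one way to compute such a penalized least squares estimator by proximal gradient: a squared loss over the observed pairs, a Sobolev-type smoothness term tr(S L S) built from the graph Laplacian, and singular value soft-thresholding for the nuclear norm. The step size and penalty weights are illustrative assumptions, not the talk's tuning.

```python
import numpy as np

def estimate_kernel(m, L, pairs, y, lam_nuc=0.1, lam_sob=0.1,
                    step=0.01, n_iter=500):
    """m: number of vertices; L: (m, m) graph Laplacian;
    pairs: list of (u, v) vertex index pairs; y: labels in {-1, 1}."""
    S = np.zeros((m, m))
    for _ in range(n_iter):
        G = np.zeros((m, m))
        for (u, v), yi in zip(pairs, y):     # gradient of the squared loss
            r = S[u, v] - yi
            G[u, v] += r
            G[v, u] += r                     # keep the kernel symmetric
        G /= len(y)
        G += lam_sob * (L @ S + S @ L)       # gradient of tr(S L S)
        S -= step * G
        U, d, Vt = np.linalg.svd(S)          # prox step for the nuclear norm
        S = U @ np.diag(np.maximum(d - step * lam_nuc, 0.0)) @ Vt
        S = (S + S.T) / 2.0                  # re-symmetrize
    return np.clip(S, -1.0, 1.0)
```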

Randomized Approximation Schemes for Cuts and Flows in Capacitated Graphs

Series
High-Dimensional Phenomena in Statistics and Machine Learning Seminar
Time
Tuesday, November 15, 2011 - 16:00 for 1.5 hours (actually 80 minutes)
Location
Skiles 006
Speaker
Yingyu Liang, School of Computer Science, Georgia Tech
We will talk about how to approximate an arbitrary graph by a sparse graph with respect to cuts and flows, using random sampling techniques. More specifically, we will describe a near-linear-time randomized combinatorial construction that transforms any graph on n vertices into an O(n log n)-edge graph on the same vertices whose cuts have approximately the same values as the original graph's. The new graph can be used to accelerate cut and flow algorithms, leading to approximate solutions on the original graph. The construction of the sparse graph is based on a general theorem analyzing the concentration of cut values near their expectations in random graphs.
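As a toy illustration of the sampling idea, the Python sketch below keeps each edge independently with probability p and rescales kept weights by 1/p, so every cut's expected value matches the original graph's. The construction in the talk uses non-uniform, connectivity-dependent sampling probabilities to guarantee O(n log n) edges and uniformly accurate cuts; a single uniform p is an illustrative simplification.

```python
import random

def sparsify(edges, p, seed=0):
    """edges: list of (u, v, weight); returns a reweighted sampled subgraph."""
    rng = random.Random(seed)
    # Each kept edge is upweighted by 1/p, so cut values are preserved
    # in expectation.
    return [(u, v, w / p) for (u, v, w) in edges if rng.random() < p]

def cut_value(edges, side):
    """side: set of vertices; total weight of edges crossing the cut."""
    return sum(w for (u, v, w) in edges if (u in side) != (v in side))
```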

The Price of Uncertainty in Multiagent Systems with Potentials

Series
High-Dimensional Phenomena in Statistics and Machine Learning Seminar
Time
Tuesday, November 8, 2011 - 16:00 for 1.5 hours (actually 80 minutes)
Location
Skiles 006
Speaker
Steven Ehrlich, School of Computer Science, Georgia Tech
Multi-agent systems have been studied extensively through the lens of game theory. However, most game-theoretic models make strong assumptions about the accuracy of agents' knowledge of their own utilities and of the interactions of other players. We will show some progress toward relaxing this assumption. In particular, we look at adversarial noise in specific potential games and assess the effect of noise on the quality of outcomes. In some cases, very small noise can accumulate over the course of the dynamics and lead to much worse social welfare. We define the Price of Uncertainty to measure this effect, and we compute both upper and lower bounds on this quantity for particular games.
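As a toy model of how observation noise can steer the dynamics, here is a minimal Python sketch of best-response dynamics in a simple load-balancing potential game where each perceived cost is perturbed by noise of magnitude at most eps. The game and the random (rather than worst-case) noise are illustrative assumptions, not examples from the talk.

```python
import random

def noisy_best_response(n_agents, k, eps, n_rounds=1000, seed=0):
    """n_agents choose among k resources; the cost of a resource is its load.
    Returns the final load profile after noisy best-response updates."""
    rng = random.Random(seed)
    choice = [rng.randrange(k) for _ in range(n_agents)]
    for _ in range(n_rounds):
        i = rng.randrange(n_agents)          # one agent updates at a time
        loads = [0] * k
        for c in choice:
            loads[c] += 1
        def perceived(r):
            # Load agent i would experience on resource r, plus noise in
            # [-eps, eps] standing in for adversarial perturbation.
            load = loads[r] + (0 if r == choice[i] else 1)
            return load + rng.uniform(-eps, eps)
        choice[i] = min(range(k), key=perceived)
    return [choice.count(r) for r in range(k)]
```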
