Vectors, Sampling and Massive Data

Series
ACO Distinguished Lecture
Time
Tuesday, November 1, 2011 - 4:30pm for 1 hour (actually 50 minutes)
Location
Klaus 1116
Speaker
Ravi Kannan – Microsoft Research India
Organizer
Robin Thomas

Please Note: There will be a reception in the Atrium of the Klaus building at 4:00pm.

Modeling data as high-dimensional (feature) vectors is a staple in Computer Science, its use in ranking web pages reminding us again of its effectiveness. Algorithms from Linear Algebra (LA) provide a crucial toolkit. But for modern problems with massive data, these algorithms may take too long. Random sampling to reduce the size of the data suggests itself. I will give a from-first-principles description of the LA connection, then discuss sampling techniques developed over the last decade for vectors, matrices and graphs. Besides saving time, sampling leads to sparsification and compression of data.