High-Dimensional Statistics

Department: Math
Course Number: 7252
Hours - Lecture: 3
Hours - Lab: 0
Hours - Recitation: 0
Hours - Total Credit: 3
Typical Scheduling: Every Fall, starting Fall 2021

The goal of this PhD-level graduate course is to provide a rigorous introduction to the concepts and methods of high-dimensional statistics, which have numerous applications in machine learning, data science, and signal processing.

Prerequisites: 
Course Text: 

At the level of the books R. Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge University Press, 2018, and C. Giraud, Introduction to High-Dimensional Statistics, CRC Press, Taylor & Francis Group, 2015. Some of the topics are based on recent research papers.

Topic Outline: 

Empirical processes: symmetrization and multiplier inequalities, concentration inequalities, comparison inequalities for Rademacher processes; VC-dimension, metric entropy, and other measures of complexity; entropy and generic chaining bounds for empirical processes.

 

Empirical risk minimization and risk bounds in learning theory: classification and regression problems; margin-type bounds in classification; oracle inequalities in empirical risk minimization; generalization bounds for learning machines (kernel machines, boosting, deep neural networks, etc.).

 

Sparsity in high-dimensional problems: sparse recovery problems (compressed sensing); sparse linear regression models, l_1-norm penalization (LASSO), and sparsity oracle inequalities.

 

Low-rank matrix recovery: trace regression models, matrix completion and quantum state tomography; nuclear norm penalization and low-rank oracle inequalities.

 

High-dimensional covariance estimation: complexity of covariance estimation problems; optimal error bounds for sample covariance; high-dimensional and sparse principal component analysis (in particular, methods related to random matrix theory).