High-Dimensional Statistics

Course Number: 
Hours - Lecture: 
Hours - Lab: 
Hours - Recitation: 
Hours - Total Credit: 
Typical Scheduling: 
Every Fall, beginning Fall 2021

The goal of this PhD-level graduate course is to provide a rigorous introduction to the concepts and methods of high-dimensional statistics, which have numerous applications in machine learning, data science, and signal processing.

Course Text: 

At the level of the books R. Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge University Press, 2018, and C. Giraud, Introduction to High-Dimensional Statistics, CRC Press, Taylor & Francis Group, 2015. Some of the topics are based on recent research papers.

Topic Outline: 

Empirical processes: symmetrization and multipliers inequalities, concentration inequalities, comparison inequalities for Rademacher process; VC-dimensions, metric entropies and other measures of complexity; entropy and generic chaining bounds for empirical processes.
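The symmetrization inequality named above is the basic tool relating empirical processes to Rademacher processes. In the standard form, for i.i.d. observations X_1, ..., X_n and a class of functions F:

```latex
\mathbb{E}\sup_{f\in\mathcal{F}}\left|\frac{1}{n}\sum_{i=1}^{n} f(X_i) - \mathbb{E}f(X)\right|
\;\le\;
2\,\mathbb{E}\sup_{f\in\mathcal{F}}\left|\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i f(X_i)\right|,
```

where ε_1, ..., ε_n are i.i.d. Rademacher signs independent of the data. Entropy and chaining bounds then control the right-hand side through the complexity of F.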


Empirical risk minimization and risk bounds in learning theory: classification and regression problems; margin-type bounds in classification; oracle inequalities in empirical risk minimization; generalization bounds for learning machines (kernel machines, boosting, deep neural networks, etc.).
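A representative generalization bound of this type, for a class F of [0,1]-valued losses, states that with probability at least 1 - δ over the sample, simultaneously for all f in F:

```latex
R(f) \;\le\; \widehat{R}_n(f) \,+\, 2\,\mathfrak{R}_n(\mathcal{F}) \,+\, \sqrt{\frac{\log(1/\delta)}{2n}},
```

where R(f) is the true risk, \widehat{R}_n(f) the empirical risk, and \mathfrak{R}_n(\mathcal{F}) the Rademacher complexity of the class; margin-type and oracle inequalities refine this template.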


Sparsity in high-dimensional problems: sparse recovery problems (compressed sensing); sparse linear regression models, l_1-norm penalization (LASSO), and sparsity oracle inequalities.
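A minimal numerical sketch of l_1-norm penalization (LASSO), solved here by proximal gradient descent (ISTA); the problem sizes, noise level, and choice of the regularization parameter lam below are illustrative, not prescriptive:

```python
import numpy as np

# Illustrative sparse linear regression: recover an s-sparse vector from
# n < p noisy linear measurements via l_1-penalized least squares.
rng = np.random.default_rng(0)
n, p, s = 100, 200, 5                 # samples, features, sparsity (illustrative)
X = rng.standard_normal((n, p)) / np.sqrt(n)   # columns of roughly unit norm
beta_true = np.zeros(p)
beta_true[:s] = 3.0
y = X @ beta_true + 0.05 * rng.standard_normal(n)

lam = 0.1 * np.max(np.abs(X.T @ y))   # heuristic regularization level
L = np.linalg.norm(X, 2) ** 2         # Lipschitz constant of the smooth gradient

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (coordinatewise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

beta = np.zeros(p)
for _ in range(500):                  # ISTA iterations
    grad = X.T @ (X @ beta - y)       # gradient of 0.5 * ||y - X beta||_2^2
    beta = soft_threshold(beta - grad / L, lam / L)

support = np.flatnonzero(np.abs(beta) > 1e-3)
```

Sparsity oracle inequalities quantify when such an estimator attains (up to logarithmic factors) the error of an oracle that knows the true support.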


Low-rank matrix recovery: trace regression models, matrix completion and quantum state tomography; nuclear norm penalization and low-rank oracle inequalities.
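A minimal sketch of nuclear norm penalization for matrix completion, via singular value thresholding iterations; the matrix size, rank, observation probability, and penalty level tau below are illustrative assumptions:

```python
import numpy as np

# Illustrative matrix completion: recover a low-rank matrix from a random
# subset of its entries by proximal gradient on a nuclear-norm-penalized
# least-squares objective.
rng = np.random.default_rng(1)
d, r = 30, 2                              # d x d matrix of rank r (illustrative)
M = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
mask = rng.random((d, d)) < 0.5           # observe about half of the entries

def svt(A, tau):
    """Proximal operator of tau * nuclear norm: shrink singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

Z = np.zeros((d, d))
tau = 0.5                                 # nuclear-norm penalty level (illustrative)
for _ in range(300):
    grad = mask * (Z - M)                 # gradient of 0.5 * ||P_Omega(Z - M)||_F^2
    Z = svt(Z - grad, tau)                # step size 1 (P_Omega is a projection)

rel_err = np.linalg.norm(Z - M) / np.linalg.norm(M)
```

Low-rank oracle inequalities bound the error of such estimators in terms of the rank of the target matrix and the number of observed entries.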


High-dimensional covariance estimation: complexity of covariance estimation problems; optimal error bounds for sample covariance; high-dimensional and sparse principal component analysis (in particular, methods related to random matrix theory).
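A small numerical sketch of the sample covariance error bounds in this topic: for i.i.d. mean-zero Gaussian data, the operator-norm error of the sample covariance is of order (sqrt(d/n) + d/n) times the operator norm of Sigma. The dimensions and the diagonal choice of Sigma below are illustrative:

```python
import numpy as np

# Illustrative check of the operator-norm error of the sample covariance.
rng = np.random.default_rng(2)
n, d = 2000, 50                                # sample size, dimension (illustrative)
Sigma = np.diag(np.linspace(1.0, 2.0, d))      # true covariance (diagonal here)
X = rng.standard_normal((n, d)) * np.sqrt(np.diag(Sigma))  # i.i.d. N(0, Sigma) rows
S = X.T @ X / n                                # sample covariance (known zero mean)
err = np.linalg.norm(S - Sigma, 2)             # operator (spectral) norm error
rate = np.sqrt(d / n) + d / n                  # predicted order of the error
```

Here the observed error is of the same order as rate times the operator norm of Sigma, in line with the optimal bounds covered in this unit.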