Department:
Math
Course Number:
7252
Hours - Lecture:
3
Hours - Lab:
0
Hours - Recitation:
0
Hours - Total Credit:
3
Typical Scheduling:
Every Fall, starting Fall 2021
The goal of this PhD-level graduate course is to provide a rigorous introduction to the concepts and methods of high-dimensional statistics, with numerous applications in machine learning, data science, and signal processing.
Course Text:
At the level of the books by R. Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge University Press, 2018, and C. Giraud, Introduction to High-Dimensional Statistics, CRC Press, Taylor & Francis Group, 2015. Some of the topics are based on recent research papers.
Topic Outline:
Empirical processes: symmetrization and multiplier inequalities, concentration inequalities, comparison inequalities for Rademacher processes; VC dimension, metric entropies, and other measures of complexity; entropy and generic chaining bounds for empirical processes.
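For illustration, a typical starting point for the empirical process bounds in this topic is the symmetrization inequality (stated here for i.i.d. observations X_1, ..., X_n with common distribution P, empirical measure P_n, and independent Rademacher signs epsilon_1, ..., epsilon_n):
\[
\mathbb{E}\sup_{f\in\mathcal{F}}\bigl|P_n f - Pf\bigr|
\;\le\;
2\,\mathbb{E}\sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i f(X_i)\Bigr|.
\]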
Empirical risk minimization and risk bounds in learning theory: classification and regression problems; margin-type bounds in classification; oracle inequalities in empirical risk minimization; generalization bounds for learning machines (kernel machines, boosting, deep neural networks, etc.).
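For illustration (the constant C and the conditions on the loss are schematic and vary with the setting), the basic object in this topic is the empirical risk minimizer over a class F, whose excess risk is controlled by the Rademacher complexity of the associated loss class:
\[
\hat f \;=\; \operatorname*{arg\,min}_{f\in\mathcal{F}}\;\frac{1}{n}\sum_{i=1}^{n}\ell\bigl(f(X_i),Y_i\bigr),
\qquad
\mathbb{E}\,R(\hat f)\;-\;\inf_{f\in\mathcal{F}}R(f)
\;\le\;
C\,\mathbb{E}\sup_{f\in\mathcal{F}}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i\,\ell\bigl(f(X_i),Y_i\bigr)\Bigr|.
\]
Margin and oracle versions of such bounds refine this basic inequality.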
Sparsity in high-dimensional problems: sparse recovery problems (compressed sensing); sparse linear regression models, l_1-norm penalization (LASSO), and sparsity oracle inequalities.
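To fix notation (an illustrative sketch; the choice of the regularization parameter lambda is itself part of the theory), the l_1-penalized least squares (LASSO) estimator in the sparse linear regression model Y = X\theta_0 + \xi is
\[
\hat\theta \;=\; \operatorname*{arg\,min}_{\theta\in\mathbb{R}^p}
\Bigl\{\frac{1}{n}\|Y - X\theta\|_2^2 \;+\; \lambda\|\theta\|_1\Bigr\},
\]
and sparsity oracle inequalities bound its prediction error in terms of the sparsity s of \theta_0, typically at a rate of order s\log(p)/n under restricted-eigenvalue-type conditions on the design.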
Low-rank matrix recovery: trace regression models, matrix completion and quantum state tomography; nuclear norm penalization and low-rank oracle inequalities.
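Similarly (again a sketch, with a generic regularization parameter lambda), in the trace regression model Y_i = \langle A_0, X_i\rangle + \xi_i the nuclear norm penalized estimator is
\[
\hat A \;=\; \operatorname*{arg\,min}_{A}
\Bigl\{\frac{1}{n}\sum_{i=1}^{n}\bigl(Y_i - \langle A, X_i\rangle\bigr)^2 \;+\; \lambda\|A\|_{*}\Bigr\},
\]
where \|A\|_{*} denotes the nuclear (trace) norm; low-rank oracle inequalities control its error in terms of the rank of A_0.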
High-dimensional covariance estimation: complexity of covariance estimation problems; optimal error bounds for sample covariance; high-dimensional and sparse principal component analysis (in particular, methods related to random matrix theory).
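As an example of the optimal error bounds mentioned above (stated up to absolute constants, for n i.i.d. centered Gaussian observations with covariance \Sigma), the sample covariance \hat\Sigma satisfies
\[
\mathbb{E}\,\|\hat\Sigma - \Sigma\|
\;\asymp\;
\|\Sigma\|\Bigl(\sqrt{\frac{\mathbf{r}(\Sigma)}{n}} \;+\; \frac{\mathbf{r}(\Sigma)}{n}\Bigr),
\qquad
\mathbf{r}(\Sigma) := \frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|},
\]
where \|\cdot\| is the operator norm and \mathbf{r}(\Sigma) is the effective rank of \Sigma.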