Optimal prediction in the linearly transformed spiked model

Stochastics Seminar
Thursday, September 21, 2017 - 15:05
1 hour (actually 50 minutes)
Skiles 006
University of Pennsylvania, Wharton School
We consider the $\textit{linearly transformed spiked model}$, where observations $Y_i$ are noisy linear transforms of unobserved signals of interest $X_i$: $$Y_i = A_i X_i + \varepsilon_i,$$ for $i=1,\ldots,n$. The transform matrices $A_i$ are also observed. We model $X_i$ as random vectors lying in an unknown low-dimensional space. How should we predict the unobserved signals (regression coefficients) $X_i$? The naive approach of performing regression for each observation separately is inaccurate due to the large noise. Instead, we develop optimal linear empirical Bayes methods for predicting $X_i$ by "borrowing strength" across the different samples. Our methods are applicable to large datasets and rely on weak moment assumptions. The analysis is based on random matrix theory. We discuss applications to signal processing, deconvolution, cryo-electron microscopy, and missing data in the high-noise regime. For missing data, we show in simulations that our methods are faster and more robust to noise and to unequal sampling than well-known matrix completion methods. This is joint work with William Leeb and Amit Singer from Princeton, available as a preprint at arxiv.org/abs/1709.03393.
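
To make the model in the abstract concrete, here is a minimal simulation sketch of the missing-data case. It is not the method from the talk. Everything in it is an illustrative assumption: the dimensions, the rank-one signal, the 0/1 diagonal masks $A_i$, the unit-variance Gaussian noise, and above all the "oracle" linear predictor, which uses the true signal covariance (written `Sigma_X` here) that would be unknown in practice. It only illustrates why separate per-observation regression loses to a linear predictor that exploits the shared low-dimensional structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) sizes: dimension p, samples n, spike strength ell,
# and probability delta that a coordinate is observed.
p, n, ell, delta = 50, 200, 10.0, 0.5

# Rank-one signals X_i = sqrt(ell) * z_i * u with a unit vector u.
u = rng.standard_normal(p)
u /= np.linalg.norm(u)
z = rng.standard_normal(n)
X = np.sqrt(ell) * z[:, None] * u[None, :]          # shape (n, p)

# Missing-data transforms: A_i = diag(mask_i) with 0/1 entries.
masks = (rng.random((n, p)) < delta).astype(float)   # shape (n, p)

# Observations Y_i = A_i X_i + eps_i with unit-variance Gaussian noise.
Y = masks * X + rng.standard_normal((n, p))

# Naive per-observation least squares: for a 0/1 diagonal A_i the
# pseudo-inverse is A_i itself, so the estimate is A_i Y_i.
X_naive = masks * Y

# Oracle best linear predictor, using the TRUE covariance of X_i
# (an assumption for illustration; it is unknown in practice):
#   X_hat_i = Sigma_X A_i^T (A_i Sigma_X A_i^T + I)^{-1} Y_i
Sigma_X = ell * np.outer(u, u)
X_blp = np.empty_like(X)
for i in range(n):
    A = np.diag(masks[i])
    G = A @ Sigma_X @ A.T + np.eye(p)
    X_blp[i] = Sigma_X @ A.T @ np.linalg.solve(G, Y[i])

# Average squared prediction error per sample.
mse_naive = np.mean(np.sum((X_naive - X) ** 2, axis=1))
mse_blp = np.mean(np.sum((X_blp - X) ** 2, axis=1))
print(f"naive MSE: {mse_naive:.2f}, oracle linear predictor MSE: {mse_blp:.2f}")
```

In this sketch the oracle predictor "borrows strength" only through the known covariance. The empirical Bayes methods described in the abstract instead work from the observed $Y_i$ and $A_i$, without access to that covariance.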