The Price of Fair PCA: One Extra Dimension

ACO Student Seminar
Friday, October 19, 2018 - 1:05pm for 1 hour (actually 50 minutes)
Skiles 005
Samira Samadi – CS, Georgia Tech – ssamadi6@gatech.edu
He Guo
We investigate whether standard dimensionality reduction techniques inadvertently produce data representations with different fidelity for two different populations. We show that on several real-world datasets, PCA has higher reconstruction error on population A than on population B (for example, women versus men, or lower- versus higher-educated individuals). This can happen even when the dataset has a similar number of samples from A and B. This motivates our study of dimensionality reduction techniques that maintain similar fidelity for A and B. We give an efficient algorithm for finding a projection that is nearly optimal with respect to this measure, and evaluate it on several datasets. This is joint work with Uthaipon Tantipongpipat, Jamie Morgenstern, Mohit Singh, and Santosh Vempala.
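To make the phenomenon in the abstract concrete, here is a minimal sketch of how per-population reconstruction error under ordinary PCA can be measured. It is not the algorithm from the talk. The data are synthetic, and the helper names (`pca_projection`, `avg_reconstruction_error`), the group scalings, and the choice of average squared residual per sample as the error measure are all assumptions made for illustration.

```python
import numpy as np

def pca_projection(X, k):
    """Orthogonal projector (d x d) onto the top-k principal directions
    of X, whose rows are samples and are assumed to be centered."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    V = vt[:k].T                      # d x k matrix of top-k directions
    return V @ V.T

def avg_reconstruction_error(X, P):
    """Average squared distance between each row of X and its projection."""
    residual = X - X @ P
    return float(np.sum(residual ** 2) / X.shape[0])

# Synthetic populations (hypothetical): equal sample sizes, but most of the
# variance in B lies along coordinate 0 and most of A's along coordinate 1.
rng = np.random.default_rng(0)
n, d, k = 200, 5, 1
A = rng.standard_normal((n, d)) * np.array([1.0, 3.0, 1.0, 1.0, 1.0])
B = rng.standard_normal((n, d)) * np.array([5.0, 1.0, 1.0, 1.0, 1.0])

# Ordinary PCA on the pooled, centered data.
X = np.vstack([A, B])
mean = X.mean(axis=0)
P = pca_projection(X - mean, k)

# Evaluate the same projection separately on each population.
err_a = avg_reconstruction_error(A - mean, P)
err_b = avg_reconstruction_error(B - mean, P)
print(f"average reconstruction error on A: {err_a:.2f}")
print(f"average reconstruction error on B: {err_b:.2f}")
```

In this construction the pooled top direction is dominated by B's high-variance coordinate, so A ends up with the larger error even though both groups contribute the same number of samples.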