Stochastic and Convex Geometry for the Analysis of Complex Data

Job Candidate Talk
Thursday, February 10, 2022 - 11:00am for 1 hour (actually 50 minutes)
Eliza O’Reilly – California Institute of Technology – eoreilly@caltech.edu
Galyna Livshyts

Many modern problems in data science aim to efficiently and accurately extract important features and make predictions from large, high-dimensional data sets. While there are many empirically successful methods to achieve these goals, large gaps between theory and practice remain. A geometric viewpoint is often useful in addressing these challenges, as it provides a unifying perspective on structure in data, complexity of statistical models, and tractability of computational methods. As a consequence, an understanding of problem geometry leads both to new insights into existing methods and to new models and algorithms that address drawbacks in existing methodology.

In this talk, I will present recent progress on two problems where the relevant model can be viewed as the projection of a lifted formulation with a simple stochastic or convex geometric description. In particular, I will first describe how the theory of stationary random tessellations in stochastic geometry can address computational and theoretical challenges of random decision forests with non-axis-aligned splits. Second, I will present a new approach to convex regression that returns non-polyhedral convex estimators compatible with semidefinite programming. These works open a number of future research directions at the intersection of stochastic and convex geometry, statistical learning theory, and optimization.