Excess Risk Bounds in Binary Classification

Series
Stochastics Seminar
Time
Thursday, April 16, 2009 - 3:00pm for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Vladimir I. Koltchinskii – School of Mathematics, Georgia Tech
Organizer
Heinrich Matzinger
In binary classification problems, the goal is to estimate a function g^*: S -> {-1, 1} minimizing the generalization error (or the risk) L(g) := P{(x, y) : y \neq g(x)}, where P is a probability distribution on S x {-1, 1}. The distribution P is unknown, and estimators \hat g of g^* are based on a finite number of independent random couples (X_j, Y_j) sampled from P. It is of interest to have upper bounds on the excess risk {\cal E}(\hat g) := L(\hat g) - L(g^*) of such estimators that hold with high probability and that take into account reasonable measures of the complexity of the classification problem (for instance, the VC-dimension). We will discuss several approaches (both old and new) to excess risk bounds in classification, including some recent results on excess risk in so-called active learning.
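As a concrete illustration of the quantities defined above (not part of the talk), the following minimal Python sketch estimates the excess risk of an empirical risk minimizer over a toy class of threshold classifiers on [0, 1], a class of VC-dimension 1. The distribution, the regression function eta, and all names here are illustrative assumptions chosen so that the Bayes classifier g^* is known in closed form and L(\hat g) - L(g^*) can be approximated by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy distribution (an assumption for illustration):
# X ~ Uniform[0, 1], P(Y = 1 | X = x) = eta(x) with a jump at 1/2,
# so the Bayes classifier is g^*(x) = sign(x - 1/2).
def eta(x):
    return 0.2 + 0.6 * (x > 0.5)

def bayes(x):
    return np.where(eta(x) > 0.5, 1, -1)

def risk(g, n_mc=200_000):
    """Monte Carlo estimate of L(g) = P{(x, y) : y != g(x)}."""
    x = rng.uniform(size=n_mc)
    y = np.where(rng.uniform(size=n_mc) < eta(x), 1, -1)
    return np.mean(g(x) != y)

def erm_threshold(x_train, y_train):
    """Empirical risk minimization over thresholds t in a grid:
    the class {g_t(x) = sign(x - t)} has VC-dimension 1."""
    thresholds = np.linspace(0.0, 1.0, 201)
    emp_risk = [np.mean(np.where(x_train > t, 1, -1) != y_train)
                for t in thresholds]
    t_hat = thresholds[int(np.argmin(emp_risk))]
    return lambda x: np.where(x > t_hat, 1, -1)

# Draw n i.i.d. couples (X_j, Y_j), fit \hat g by ERM,
# and estimate the excess risk E(\hat g) = L(\hat g) - L(g^*).
n = 500
x_train = rng.uniform(size=n)
y_train = np.where(rng.uniform(size=n) < eta(x_train), 1, -1)
g_hat = erm_threshold(x_train, y_train)

print(f"estimated excess risk: {risk(g_hat) - risk(bayes):.4f}")
```

In this toy setting the Bayes risk is E[min(eta(X), 1 - eta(X))] = 0.2, and the printed excess risk should be close to zero (up to Monte Carlo noise), consistent with high-probability bounds for ERM over a class of small VC-dimension.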