Seminars and Colloquia by Series

Sparsity pattern aggregation in generalized linear models

Series
Stochastics Seminar
Time
Thursday, September 3, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Philippe Rigollet, Princeton University
The goal of this talk is to present a new method for sparse estimation which does not use standard techniques such as $\ell_1$ penalization. First, we introduce a new setup for aggregation which bears strong links with generalized linear models and thus encompasses various response models such as Gaussian regression and binary classification. Second, by combining maximum likelihood estimators using exponential weights, we derive a new procedure for sparse estimation which satisfies exact oracle inequalities with the desired remainder term. Even though the procedure is simple, its implementation is not straightforward, but it can be approximated using the Metropolis algorithm, resulting in a stochastic greedy algorithm that performs surprisingly well in a simulated sparse recovery problem.
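As a rough illustration of the exponential-weights idea mentioned above (not the speaker's procedure), the sketch below aggregates least-squares fits over small coordinate subsets, standing in for sparsity patterns, with weights decreasing in empirical risk; the Gaussian setup, temperature, and pattern sizes are assumptions.

```python
# Hypothetical sketch: exponential-weights aggregation of least-squares
# fits on small coordinate subsets (a stand-in for sparsity patterns).
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p, beta = 50, 6, 0.5          # sample size, dimension, temperature (assumed)
X = rng.normal(size=(n, p))
theta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ theta_true + rng.normal(size=n)

patterns = [s for k in range(3) for s in itertools.combinations(range(p), k + 1)]
fits, risks = [], []
for s in patterns:
    cols = list(s)
    coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    theta = np.zeros(p)
    theta[cols] = coef
    fits.append(theta)
    risks.append(np.mean((y - X @ theta) ** 2))

# Exponential weights: heavier weight on patterns with small empirical risk.
w = np.exp(-n * np.array(risks) / (2 * beta))
w /= w.sum()
theta_agg = np.sum(w[:, None] * np.array(fits), axis=0)
print("aggregated estimate:", np.round(theta_agg, 2))
```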

Penalized orthogonal-components regression for large p small n data

Series
Stochastics Seminar
Time
Thursday, August 27, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Dabao Zhang, Purdue University
We propose a penalized orthogonal-components regression (POCRE) for large p small n data. Orthogonal components are sequentially constructed to maximize, upon standardization, their correlation to the response residuals. A new penalization framework, implemented via empirical Bayes thresholding, is presented to effectively identify sparse predictors of each component. POCRE is computationally efficient owing to its sequential construction of leading sparse principal components. In addition, such construction offers other properties such as grouping highly correlated predictors and allowing for collinear or nearly collinear predictors. With multivariate responses, POCRE can construct common components and thus build up latent-variable models for large p small n data. This is joint work with Yanzhu Lin and Min Zhang.
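The following rough sketch (not the authors' POCRE algorithm) illustrates the general idea of sequentially building sparse components correlated with the response residuals; soft thresholding stands in for the empirical Bayes thresholding of the abstract, and all parameter choices are assumptions.

```python
# Hypothetical sketch: sequentially extract sparse components whose scores
# track the current response residuals (soft thresholding stands in for the
# empirical Bayes thresholding used by POCRE).
import numpy as np

def sparse_components(X, y, n_comp=2, frac=0.5):
    X = (X - X.mean(0)) / X.std(0)                  # standardize predictors
    resid, loadings = y - y.mean(), []
    for _ in range(n_comp):
        w = X.T @ resid / len(y)                    # covariance with residuals
        lam = frac * np.max(np.abs(w))              # assumed threshold rule
        w = np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)
        if np.allclose(w, 0.0):
            break
        w /= np.linalg.norm(w)
        t = X @ w                                   # component scores
        resid = resid - t * (t @ resid) / (t @ t)   # deflate the residuals
        loadings.append(w)
    return np.array(loadings)

rng = np.random.default_rng(1)
n, p = 40, 200                                      # "large p small n"
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=n)
for i, w in enumerate(sparse_components(X, y)):
    print(f"component {i}: {int((w != 0).sum())} nonzero loadings")
```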

Omnibus Tests for Comparison of Competing Risks under the Additive Risk Model

Series
Stochastics Seminar
Time
Thursday, April 23, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Yichuan Zhao, Department of Mathematics, Georgia State University
Researchers often study competing risks in which subjects may fail from any one of k causes. Comparing any two competing risks with covariate effects is very important in medical studies. In this talk, we develop omnibus tests for comparing cause-specific hazard rates and cumulative incidence functions at specified covariate levels. The omnibus tests are derived under the additive risk model by a weighted difference of estimates of cumulative cause-specific hazard rates. Simultaneous confidence bands for the difference of two conditional cumulative incidence functions are also constructed. A simulation procedure is used to sample from the null distribution of the test process, in which graphical and numerical techniques are used to detect significant differences in the risks. In addition, we conduct a simulation study, and the results show that the proposed procedure has good finite-sample performance. A melanoma data set from a clinical trial is used for the purpose of illustration.
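As a toy illustration of contrasting two cause-specific cumulative hazards (without covariates, the additive risk model, or the simultaneous bands developed in the talk), one can compare Nelson-Aalen type estimates for the two causes; the data-generating choices below are assumptions.

```python
# Hypothetical sketch: Nelson-Aalen estimates of two cause-specific cumulative
# hazards and their difference (no covariates, no additive risk model).
import numpy as np

rng = np.random.default_rng(2)
n = 300
t1 = rng.exponential(1.0, n)          # latent time to failure from cause 1
t2 = rng.exponential(1.5, n)          # latent time to failure from cause 2
c = rng.exponential(2.0, n)           # censoring time
time = np.minimum(np.minimum(t1, t2), c)
cause = np.where(c < np.minimum(t1, t2), 0, np.where(t1 < t2, 1, 2))

order = np.argsort(time)
time, cause = time[order], cause[order]
at_risk = n - np.arange(n)            # number still at risk just before each time

def cum_hazard(k):
    # Nelson-Aalen: sum of 1/(number at risk) over failures from cause k.
    return np.cumsum(np.where(cause == k, 1.0 / at_risk, 0.0))

diff = cum_hazard(1) - cum_hazard(2)
print("max |Lambda_1(t) - Lambda_2(t)| over observed times:",
      round(np.abs(diff).max(), 3))
```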

Excess Risk Bounds in Binary Classification

Series
Stochastics Seminar
Time
Thursday, April 16, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Vladimir I. Koltchinskii, School of Mathematics, Georgia Tech
In binary classification problems, the goal is to estimate a function $g^*: S \to \{-1,1\}$ minimizing the generalization error (or the risk) $L(g) := P\{(x,y) : y \neq g(x)\}$, where $P$ is a probability distribution on $S \times \{-1,1\}$. The distribution $P$ is unknown, and estimators $\hat g$ of $g^*$ are based on a finite number of independent random couples $(X_j, Y_j)$ sampled from $P$. It is of interest to have upper bounds on the excess risk $\mathcal{E}(\hat g) := L(\hat g) - L(g^*)$ of such estimators that hold with high probability and take into account reasonable measures of the complexity of classification problems (such as, for instance, the VC-dimension). We will discuss several approaches (both old and new) to excess risk bounds in classification, including some recent results on the excess risk in so-called active learning.
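As a small numerical illustration of the excess risk itself (not of the bounds discussed in the talk), the sketch below compares an empirically chosen threshold classifier with the Bayes classifier in a one-dimensional model where $P$ is known; the model and the class of classifiers are assumptions.

```python
# Hypothetical sketch: excess risk of a threshold classifier chosen by
# empirical risk minimization, in a 1-D model where the Bayes rule is known.
import numpy as np

rng = np.random.default_rng(3)

def sample(n):
    y = rng.choice([-1, 1], size=n)
    x = rng.normal(loc=y.astype(float), scale=1.0)   # X | Y = y ~ N(y, 1)
    return x, y

def risk(threshold, n_mc=200_000):
    # Monte Carlo estimate of L(g) = P(Y != sign(X - threshold)).
    x, y = sample(n_mc)
    return np.mean(np.where(x > threshold, 1, -1) != y)

# ERM over threshold classifiers g_t(x) = sign(x - t), fit on a small sample.
x_train, y_train = sample(100)
grid = np.linspace(-2, 2, 401)
emp = [np.mean(np.where(x_train > t, 1, -1) != y_train) for t in grid]
t_hat = grid[int(np.argmin(emp))]

# The Bayes classifier here is sign(x), i.e. threshold 0; the excess risk
# estimate is Monte Carlo noisy and can come out slightly negative.
excess = risk(t_hat) - risk(0.0)
print(f"chosen threshold {t_hat:.2f}, estimated excess risk {excess:.4f}")
```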

Cameron-Martin theorem for Complete Noncompact Riemannian Manifold

Series
Stochastics Seminar
Time
Thursday, April 9, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Elton Hsu, Department of Mathematics, Northwestern University
The Cameron-Martin theorem is one of the cornerstones of stochastic analysis. It asserts that shifts of the Wiener measure along certain flows produce equivalent measures. Driver and others have shown that this theorem, after an appropriate reformulation, can be extended to the Wiener measure on the path space over a compact Riemannian manifold. In this talk we will discuss this and other extensions of the Cameron-Martin theorem and show that it in fact holds for an arbitrary complete Riemannian manifold.
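For reference, a standard informal statement of the classical (flat) Cameron-Martin theorem, which the talk extends to path spaces over Riemannian manifolds:

```latex
% Classical Cameron-Martin theorem on flat Wiener space (informal statement).
Let $\mu$ be the Wiener measure on $C_0([0,1];\mathbb{R}^d)$ and let $h$ lie in
the Cameron--Martin space $H$ of absolutely continuous paths with $h(0)=0$ and
$\dot h \in L^2([0,1];\mathbb{R}^d)$. Then the shifted measure
$\mu_h(A) := \mu(A - h)$ is equivalent to $\mu$, with density
\[
  \frac{d\mu_h}{d\mu}(\omega)
  = \exp\!\left( \int_0^1 \langle \dot h(t), d\omega(t)\rangle
  - \frac{1}{2}\int_0^1 |\dot h(t)|^2\, dt \right).
\]
```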

Dynamic Server Allocation for Tandem Queues with Flexible Servers

Series
Stochastics Seminar
Time
Thursday, March 12, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Hayriye Ayhan, ISyE, Georgia Tech
We consider Markovian tandem queues with finite intermediate buffers and flexible servers and study how the servers should be assigned dynamically to stations in order to obtain optimal long-run average throughput. We assume that each server can work on only one job at a time, that several servers can work together on a single job, and that the travel times between stations are negligible. Under various server collaboration schemes, we characterize the optimal server assignment policy for these systems.
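A minimal simulation sketch of this setting (not the optimal policies characterized in the talk): a two-station line with an intermediate buffer, an unlimited supply of jobs in front of station 1, and two flexible servers following a simple heuristic assignment rule. The service rates, buffer size, policy, and additive collaboration rate below are all assumptions.

```python
# Hypothetical sketch: simulate a two-station Markovian tandem line with an
# intermediate buffer and two flexible servers under a simple heuristic
# assignment policy, and estimate the long-run throughput.
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([[2.0, 1.0],      # mu[i, j]: rate of server i at station j
               [1.0, 2.0]])
B = 3                           # capacity of the intermediate buffer

def policy(b):
    """Return the station (0 or 1) assigned to each server, given buffer level b."""
    if b == 0:                  # station 2 starved: everyone works upstream
        return [0, 0]
    if b == B:                  # station 1 blocked: everyone works downstream
        return [1, 1]
    return [0, 1]               # otherwise a fixed dedicated assignment

def simulate(horizon=200_000):
    b, t, completed = 0, 0.0, 0
    while t < horizon:
        assign = policy(b)
        # Collaboration is assumed additive: station rate = sum of server rates.
        rate1 = sum(mu[i, 0] for i in range(2) if assign[i] == 0)
        rate2 = sum(mu[i, 1] for i in range(2) if assign[i] == 1)
        total = rate1 + rate2
        t += rng.exponential(1.0 / total)
        if rng.random() < rate1 / total:   # a station-1 completion occurs
            b += 1
        else:                              # a station-2 completion occurs
            b -= 1
            completed += 1
    return completed / t

print(f"estimated throughput: {simulate():.3f} jobs per unit time")
```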

Shot Noise Process

Series
Stochastics Seminar
Time
Thursday, March 5, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Yuanhui Xiao, Department of Mathematics and Statistics, Georgia State University
A shot noise process is essentially a compound Poisson process whereby the arriving shots are allowed to accumulate or decay after their arrival via some preset shot (impulse response) function. Shot noise models see applications in diverse areas such as insurance, finance, hydrology, textile engineering, and electronics. This talk studies several statistical inference issues for shot noise processes. Under mild conditions, ergodicity is proven in that process sample paths satisfy a strong law of large numbers and a central limit theorem. These results have applications in storage modeling. Shot function parameter estimation from a data history observed on a discrete-time lattice is then explored. Optimal estimating functions are tractable when the shot function satisfies a so-called interval similar condition. Moment methods of estimation are easily applicable if the shot function is compactly supported and show good performance. In all cases, asymptotic normality of the proposed estimators is established.
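A minimal simulation of a shot noise process with an exponentially decaying shot function; the arrival rate, shot-size distribution, and decay parameter below are arbitrary choices for illustration.

```python
# Hypothetical sketch: simulate X(t) = sum over shots arriving before t of
# A_i * h(t - tau_i), with Poisson arrival times tau_i, i.i.d. shot sizes A_i,
# and an exponentially decaying shot (impulse response) function h.
import numpy as np

rng = np.random.default_rng(5)
rate, decay, horizon = 2.0, 1.5, 50.0          # assumed parameters

def shot(u):
    return np.exp(-decay * u) * (u >= 0)       # impulse response function h

n_shots = rng.poisson(rate * horizon)
arrivals = np.sort(rng.uniform(0.0, horizon, n_shots))   # Poisson arrival times
sizes = rng.exponential(1.0, n_shots)                    # i.i.d. shot sizes

grid = np.linspace(0.0, horizon, 1000)
X = np.array([np.sum(sizes * shot(t - arrivals)) for t in grid])

# For this h, the stationary mean is rate * E[A] / decay.
print(f"sample mean {X[200:].mean():.3f} vs theoretical {rate * 1.0 / decay:.3f}")
```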

Scenery reconstruction part II

Series
Stochastics Seminar
Time
Thursday, February 26, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Heinrich Matzinger, School of Mathematics, Georgia Tech
Last week we saw combinatorial reconstruction. This time we are going to explain a new approach to Scenery Reconstruction. This new approach could allow us to prove that being able to distinguish sceneries implies reconstructability.

Optimal alignments and sceneries

Series
Stochastics Seminar
Time
Thursday, February 19, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Heinrich Matzinger, School of Mathematics, Georgia Tech
We explore the connection between Scenery Reconstruction and Optimal Alignments. We present some new algorithms for solving the Scenery Reconstruction problem which work in practice and not just in theory.

On creating a model assessment tool independent of data size and estimating the U statistic variance

Series
Stochastics Seminar
Time
Thursday, February 12, 2009 - 15:00 for 1 hour (actually 50 minutes)
Location
Skiles 269
Speaker
Jiawei Liu, Department of Mathematics & Statistics, Georgia State University
If viewed realistically, models under consideration are always false. A consequence of model falseness is that for every data generating mechanism, there exists a sample size at which the model failure will become obvious. There are occasions when one will still want to use a false model, provided that it gives a parsimonious and powerful description of the generating mechanism. We introduce a model credibility index from the point of view that the model is false. The model credibility index is defined as the maximum sample size at which samples from the model and those from the true data generating mechanism are nearly indistinguishable. The index is estimated within a subsampling framework, in which a large data set is treated as the population and subsamples generated from it are compared with the model at various sample sizes. Exploring the asymptotic properties of the model credibility index leads to the problem of estimating the variance of U statistics. An unbiased estimator and a simple fix-up are proposed to estimate the U statistic variance.
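A crude sketch of the subsampling idea (not the authors' procedure): treat a large data set as the population, fit a working model, and report the largest subsample size at which a goodness-of-fit test still has trouble distinguishing subsamples from the model. The normal working model, the t-distributed generating mechanism, and the Kolmogorov-Smirnov test are assumptions for illustration.

```python
# Hypothetical sketch: estimate a "model credibility index" as the largest
# subsample size at which samples from the data and from the fitted working
# model are hard to tell apart (here via a Kolmogorov-Smirnov test).
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
population = rng.standard_t(df=3, size=100_000)      # true mechanism: t_3
mu, sigma = population.mean(), population.std()      # fitted normal working model

def rejection_rate(n, n_rep=200, alpha=0.05):
    rejects = 0
    for _ in range(n_rep):
        sub = rng.choice(population, size=n, replace=False)
        _, pval = stats.kstest(sub, "norm", args=(mu, sigma))
        rejects += pval < alpha
    return rejects / n_rep

credibility_index = None
for n in [50, 100, 200, 500, 1000, 2000, 5000]:
    if rejection_rate(n) > 0.5:      # model failure has become "obvious"
        break
    credibility_index = n
print("estimated credibility index:", credibility_index)
```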
