Series: Stochastics Seminar

Robustness of several nonparametric multivariate "threshold type" outlier identification procedures is studied, employing a masking breakdown point criterion subject to a fixed false positive rate. The procedures are based on four different outlyingness functions: the widely-used "Mahalanobis distance" version, a new one based on a "Mahalanobis quantile" function that we introduce, one based on the well-known "halfspace" depth, and one based on the well-known "projection" depth. In this treatment, multivariate location outlyingness functions are formulated as extensions of univariate versions using either "substitution" or "projection pursuit," and an equivalence paradigm relating multivariate depth, outlyingness, quantile, and centered rank functions is applied. Of independent interest, the new "Mahalanobis quantile" outlyingness function is not restricted to have elliptical contours, has a transformation-retransformation representation in terms of the well-known spatial outlyingness function, and corrects to full affine invariance the orthogonal invariance of that function. Here two special tools, also of independent interest, are introduced and applied: a notion of weak covariance functional, and a very general and flexible formulation of affine equivariance for multivariate quantile functions. The new Mahalanobis quantile function inherits attractive features of the spatial version, such as computational ease and a Bahadur-Kiefer representation. For the particular outlyingness functions under consideration, masking breakdown points are evaluated and compared within a contamination model. It is seen that for threshold type outlier identification the Mahalanobis distance and projection procedures are superior to the others, although all four procedures are quite suitable for robust ranking of points with respect to outlyingness. Reasons behind these differences are discussed, and directions for further study are indicated.
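As a rough illustration of a "threshold type" procedure built on the Mahalanobis distance outlyingness function, the sketch below flags bivariate points whose squared Mahalanobis distance exceeds a cutoff chosen for a nominal false positive rate. It rests on several assumptions that are not taken from the abstract: the classical sample mean and covariance are used as plug-in estimates (these are not robust, so masking can occur), the cutoff uses the chi-square law with 2 degrees of freedom that holds for bivariate normal data with known parameters, and the names `mahalanobis_sq_2d`, `threshold_2d` and `flag_outliers` are hypothetical.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean.

    Assumption: plug-in sample mean and sample covariance (not robust).
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the explicit inverse of the 2x2 covariance.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def threshold_2d(alpha):
    """Cutoff t with P(chi-square with 2 d.f. > t) = alpha, i.e. t = -2 ln(alpha)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    return -2.0 * math.log(alpha)


def flag_outliers(points, alpha=0.05):
    """Flag points whose squared distance exceeds the nominal cutoff."""
    cutoff = threshold_2d(alpha)
    return [d2 > cutoff for d2 in mahalanobis_sq_2d(points)]


if __name__ == "__main__":
    # A 5x5 grid of "regular" points plus one distant point.
    data = [(i % 5, i // 5) for i in range(25)] + [(20, 20)]
    print(flag_outliers(data, alpha=0.05))
```

Because the mean and covariance here are estimated from the same, possibly contaminated, sample, a cluster of outliers can inflate the covariance and hide itself. That is the masking effect the breakdown point criterion is meant to quantify.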

Series: Stochastics Seminar

I will discuss some recent (but modest) results showing the existence and slow mixing of a stationary chain of Hamiltonian oscillators subject to a heat bath. Surprisingly, even these simple results require some delicate stochastic averaging. This is joint work with Martin Hairer.

Series: Stochastics Seminar

Under certain conditions, we obtain exact asymptotic expressions for the stationary distribution \pi of a Markov chain. In this talk, we will consider Markov chains on {0,1,...}^2. We are particularly interested in deriving asymptotic expressions when the fluid limit of the most probable paths from the origin to the rare event is nonlinear. For example, we will derive asymptotic expressions for a large deviation along the x-axis (e.g., \pi(\ell, y) for fixed y) when the most probable paths to (\ell,y) initially climb the y-axis before turning southwest and drifting towards (\ell,y).

Series: Stochastics Seminar

A common subsequence of two sequences X and Y is a sequence that is a subsequence of both X and Y. A Longest Common Subsequence (LCS) of X and Y is a common subsequence of maximal length. Longest common subsequences can be represented as alignments with gaps, where the aligned letter pairs correspond to the letters in the LCS. We consider two independent i.i.d. binary texts X and Y of length n. We show that the behavior of the alignment corresponding to the LCS is very different depending on the number of colors. With two colors, long blocks tend to be aligned with no gaps, whilst for four or more colors the opposite is true. Let Ln denote the length of the LCS of X and Y. In general the order of the variance of Ln is not known. We explain how a biased effect of a finite pattern can influence the order of the fluctuation of Ln.
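To make the quantity Ln concrete, here is a minimal sketch that computes the length of an LCS with the standard dynamic-programming recursion and samples Ln for independent i.i.d. texts drawn uniformly over a given number of colors. The function names (`lcs_length`, `random_text`, `sample_lcs_lengths`), the uniform letter distribution, and the seed handling are assumptions made for illustration; nothing here reflects the methods of the talk.

```python
import random


def lcs_length(x, y):
    """Length of a longest common subsequence of sequences x and y.

    Row-by-row dynamic programming: O(len(x) * len(y)) time, O(len(y)) memory.
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Matching letters extend an LCS of the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last letter of one of the prefixes.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def random_text(n, colors, rng):
    """An i.i.d. text of length n, uniform over `colors` letters (assumption)."""
    return [rng.randrange(colors) for _ in range(n)]


def sample_lcs_lengths(n, colors, trials, seed=0):
    """Sample Ln for `trials` independent pairs of texts of length n."""
    rng = random.Random(seed)
    return [
        lcs_length(random_text(n, colors, rng), random_text(n, colors, rng))
        for _ in range(trials)
    ]


if __name__ == "__main__":
    for colors in (2, 4):
        lengths = sample_lcs_lengths(n=200, colors=colors, trials=50)
        mean = sum(lengths) / len(lengths)
        var = sum((v - mean) ** 2 for v in lengths) / (len(lengths) - 1)
        print(f"colors={colors}: mean Ln={mean:.1f}, sample variance={var:.2f}")
```

Simulations of this kind only give an empirical feel for the fluctuation of Ln at small n; they say nothing rigorous about the order of the variance.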

Series: Stochastics Seminar

We consider a random field X of tensor product type and investigate the quality of approximation (both in the average and in the probabilistic sense) to X by the processes of rank n minimizing the quadratic approximation error. The most interesting results are obtained for the case when the dimension of the parameter set tends to infinity. Call "cardinality" the minimal n providing a given level of approximation accuracy. By applying the Central Limit Theorem to a (deterministic) array of covariance eigenvalues, we show that, for any fixed level of relative error, this cardinality increases exponentially (a phenomenon often called "intractability" or "dimension curse") and find the explosion coefficient. We also show that the behavior of the probabilistic and average cardinalities is essentially the same in a large domain of parameters.