Topics in Sequence Analysis
- Series
- Dissertation Defense
- Time
- Monday, November 5, 2012 - 12:30 for 1.5 hours (actually 80 minutes)
- Location
- Skiles 005
- Speaker
- Jinyong Ma – School of Mathematics, Georgia Tech
This work studies two topics in sequence analysis. In the first part, we
investigate the large deviations of the shape of the random RSK Young
diagrams, associated with a random word of size n whose letters are
independently drawn from an alphabet of size m=m(n). When the letters are
drawn uniformly and when both n and m converge together to infinity, m
not growing too fast with respect to n, the large deviations of the shape
of the Young diagrams are shown to be the same as that of the spectrum of
the traceless GUE. Since the length of the top row of the Young diagrams is
the length of the longest (weakly) increasing subsequence of the random
word, the corresponding large deviations follow. When the letters are drawn
with non-uniform probability, a control of both highest probabilities will
ensure that the length of the top row of the diagrams satisfies a large
deviation principle. In either case, speeds and rate functions are
identified. To complete this first part, non-asymptotic concentration bounds
for the length of the top row of the diagrams are obtained.
In the second part, we investigate the order of the r-th, 1\le r <
+\infty, central moment of the length of the longest common subsequence of
two independent random words of size n whose letters are identically
distributed and independently drawn from a finite alphabet. When all but one
of the letters are drawn with small probabilities, which depend on the size
of the alphabet, the r-th central moment is shown to be of order
n^{r/2}. In particular, when r=2, the order of the variance is linear.