- You are here:
- GT Home
- Home
- News & Events

Series: Dissertation Defense

We are going talk about three topics. First of all, Principal Components Analysis (PCA) as a dimension reduction technique. We investigate how useful it is for real life problems. The problem is that, often times the spectrum of the covariance matrix is wrongly estimated due to the ratio between sample space dimension over feature space dimension not being large enough. We show how to reconstruct the spectrum of the ground truth covariance matrix, given the spectrum of the estimated covariance for multivariate normal vectors. We then present an algorithm for reconstruction the spectrum in the case of sparse matrices related to text classification.

In the second part, we concentrate on schemes of PCA estimators. Consider the problem of finding the least eigenvalue and eigenvector of ground truth covariance matrix, a famous classical estimator are due to Krasulina. We state the convergence proof of Krasulina for the least eigenvalue and corresponding eigenvector, and then find their convergence rate.

In the last part, we consider the application problem, text classification, in the supervised view with traditional Naive-Bayes method. We find out an updated Naive-Bayes method with a new loss function, which loses the unbiased property of traditional Naive-Bayes method, but obtains a smaller variance of the estimator.

Series: Dissertation Defense

The length LC_n of the longest common subsequences of two strings X = (X_1, ... , X_n) and Y = (Y_1, ... , Y_n) is a way to measure the similarity between X and Y. We study the asymptotic behavior of LC_n when the two strings are generated by a hidden Markov model (Z, (X, Y)) and we build upon asymptotic results for LC_n obtained for sequences of i.i.d. random variables. Under some standard assumptions regarding the model we first prove convergence results with rates for E[LC_n]. Then, versions of concentration inequalities for the transversal fluctuations of LC_n are obtained. Finally, we outline a proof for a central limit theorem by building upon previous work and adapting a Stein's method estimate.

Series: Dissertation Defense

The Four Color Theorem states that every planar graph is 4-colorable. Hajos conjectured that for any positive integer k, every graph containing no K_{k+1}-subdivision is k-colorable. However, Catlin disproved Hajos' conjecture for k >= 6. It is not hard to prove that the conjecture is true for k <= 3. Hajos' conjecture remains open for k = 4 and k = 5. We consider a minimal counterexample to Hajos' conjecture for k = 4: a graph G, such that G contains no K_5-subdivision, G is not 4-colorable, and |V (G)| is minimum. We use Hajos graph to denote such counterexample. One important step to understand graphs containing K_5-subdivisions is to solve the following problem: let H represent the tree on six vertices, two of which are adjacent and of degree 3. Let G be a graph and u1, u2, a1, a2, a3, a4 be distinct vertices of G. When does G contain a topological H (i.e. an H-subdivision) in which u1, u2 are of degree 3 and a1, a2, a3, a4 are of degree 1? We characterize graphs with no topological H. This characterization is used by He, Wang, and Yu to show that graph containing no K_5-subdivision is planar or has a 4-cut, establishing conjecture of Kelmans and Seymour. Besides the topological H problem, we also obtained some further structural information of Hajos graphs.

Series: Dissertation Defense

We model and analyze the dynamics of religious group membership and size. A groups

is distinguished by its strictness, which determines how much time group members are

expected to spend contributing to the group. Individuals differ in their rate of return for

time spent outside of their religious group. We construct a utility function that individ-

uals attempt to maximize, then find a Nash Equilibrium for religious group participation

with a heterogeneous population. We then model dynamics of group size by including

birth, death, and switching of individuals between groups. Group switching depends on

the strictness preferences of individuals and their probability of encountering members of

other groups. We show that in the case of only two groups one with finite strictness and

the other with zero there is a clear parameter combination that determines whether the

non-zero strictness group can survive over time, which is more difficult at higher strictness

levels. At the same time, we show that a higher than average birthrate can allow even the

highest strictness groups to survive. We also study the dynamics of several groups, gaining

insight into strategic choices of strictness values and displaying the rich behavior of the

model. We then move to the simultaneous-move two-group game where groups can set up

their strictnesses strategically to optimize the goals of the group. Affiliations are assumed

to have three types and each type of group has its own group utility function. Analysis

on the utility functions and Nash equilibria presents different behaviors of various types

of groups. Finally, we numerically simulated the process of new groups entering the reli-

gious marketplace which can be viewed as a sequence of Stackelberg games. Simulation

results show how the different types of religious groups distinguish themselves with regard

to strictness.

is distinguished by its strictness, which determines how much time group members are

expected to spend contributing to the group. Individuals differ in their rate of return for

time spent outside of their religious group. We construct a utility function that individ-

uals attempt to maximize, then find a Nash Equilibrium for religious group participation

with a heterogeneous population. We then model dynamics of group size by including

birth, death, and switching of individuals between groups. Group switching depends on

the strictness preferences of individuals and their probability of encountering members of

other groups. We show that in the case of only two groups one with finite strictness and

the other with zero there is a clear parameter combination that determines whether the

non-zero strictness group can survive over time, which is more difficult at higher strictness

levels. At the same time, we show that a higher than average birthrate can allow even the

highest strictness groups to survive. We also study the dynamics of several groups, gaining

insight into strategic choices of strictness values and displaying the rich behavior of the

model. We then move to the simultaneous-move two-group game where groups can set up

their strictnesses strategically to optimize the goals of the group. Affiliations are assumed

to have three types and each type of group has its own group utility function. Analysis

on the utility functions and Nash equilibria presents different behaviors of various types

of groups. Finally, we numerically simulated the process of new groups entering the reli-

gious marketplace which can be viewed as a sequence of Stackelberg games. Simulation

results show how the different types of religious groups distinguish themselves with regard

to strictness.

Series: Dissertation Defense

The Jacobian of a graph, also known as the sandpile group or the critical group, is a finite group abelian group associated to the graph; it has been independently discovered and studied by researchers from various areas. By the Matrix-Tree Theorem, the cardinality of the Jacobian is equal to the number of spanning trees of a graph. In this dissertation, we study several topics centered on a new family of bijections, named the geometric bijections, between the Jacobian and the set of spanning trees. An important feature of geometric bijections is that they are closely related to polyhedral geometry and the theory of oriented matroids despite their combinatorial description; in particular, they can be generalized to Jacobians of regular matroids, in which many previous works on Jacobians failed to generalize due to the lack of the notion of vertices.

Series: Dissertation Defense

The study of the longest common subsequences (LCSs) of two random words is a classical problem in computer science and bioinformatics. A problem of particular probabilistic interest is to determine the limiting behavior of the expectation and variance of the length of the LCS as the length of the random words grows without bounds. This dissertation studies the problem using both Monte-Carlo simulation and theoretical analysis. The specific problems studied include estimating the growth order of the variance, LCS based hypothesis testing method for sequences similarity, theoretical upper bounds for the Chv\'atal-Sankoff constant of multiple sequences, and theoretical growth order of the variance when the two random words have asymmetric distributions.

Series: Dissertation Defense

We will present three results in percolation and sequence analysis. In the first part, we will briefly show an exponential concentration inequality for transversal fluctuation of directed last passage site percolation. In the the second part, we will dive into the power lower bounds for all the r-th central moments ($r\ge1$) of the last passage time of directed site perolcation on a thin box. In the last part, we will partially answer a conjecture raised by Bukh and Zhou that the minimal expected length of the longest common subsequences between two i.i.d. random permutations with arbitrary distribution on the symmetric group is obtained when the distribution is uniform and thus lower bounded by $c\sqrt{n}$ by showing that some distribution can be iteratively constructed such that it gives strictly smaller expectation than uniform distribution and a quick cubic root of $n$ lower bound will also be shown.

Series: Dissertation Defense

In this dissertation, we studied the Back and Forth Error Compensation and Correction (BFECC) method for linear hyperbolic PDE systems and nonlinear scalar conservation laws. We extend the BFECC method from scalar hyperbolic PDEs to linear hyperbolic PDE systems, and showed similar stability and accuracy improvement are still valid under modest assumptions on the systems. Motivated by this theoretical result, we propose BFECC schemes for the Maxwell's equations. On uniform orthogonal grids, the BFECC schemes are guaranteed to be second order accurate and have larger CFL numbers than that of the classical Yee scheme. On non-orthogonal and unstructured grids, we propose to use a simple least square local linear approximation scheme as the underlying scheme for the BFECC method. Numerical results showed the proposed schemes are stable and are second order accurate on non-orthogonal grids and for systems with variable coefficients. We also studied a conservative BFECC limiter that reduces spurious oscillations for numerical solutions of nonlinear scalar conservation laws. Numerical examples with the Burgers' equation and KdV equations are studied to demonstrate effectiveness of this limiter.

Series: Dissertation Defense

The curve complex of Harvey allows combinatorial representation of a surface mappingclass group by describing its action on simple closed curves. Similar complexes of spheres,free factors, and free splittings allow combinatorial representation of the automorphisms ofa free group. We consider a Birman exact sequence for combinatorial models of mappingclass groups and free group automorphisms. We apply this and other extension techniquesto compute the automorphism groups of several simplicial complexes associated with map-ping class groups and automorphisms of free groups.

Series: Dissertation Defense

We provide a new definition of a local walk dimension beta that depends only on the metric. Moreover, we study the local Hausdorff dimension and prove that any variable Ahlfors regular measure of variable dimension Q is strongly equivalent to the local Hausdorff measure with Q the local Hausdorff dimension, generalizing the constant dimensional case. Additionally, we provide constructions of several variable dimensional spaces, including a new example of a variable dimensional Sierpinski carpet. We use the local exponent beta in time-scale renormalization of discrete time random walks, that are approximate at a given scale in the sense that the expected jump size is the order of the space scale. We consider the condition that the expected time to leave a ball scales like the radius of the ball to the power beta of the center. We then study the Gamma and Mosco convergence of the resulting continuous time approximate walks as the space scale goes to zero. We prove that a non-trivial Dirichlet form with Dirichlet boundary conditions on a ball exists as a Mosco limit of approximate forms. We also prove tightness of the associated continuous time processes.