Seminars and Colloquia by Series

Forbidden Subgraphs and 3-Colorability

Series
Dissertation Defense
Time
Monday, May 21, 2012 - 14:00 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Tianjun YeSchool of Mathematics, Georgia Tech
Classical vertex coloring problems ask for the minimum number of colors needed to color the vertices of a graph, such that adjacent vertices use different colors. Vertex coloring does have quite a few practical applications in communication theory, industry engineering and computer science. Such examples can be found in the book of Hansen and Marcotte. Deciding whether a graph is 3-colorable or not is a well-known NP-complete problem, even for triangle-free graphs. Intutively, large girth may help reduce the chromatic number. However, in 1959, Erdos used the probabilitic method to prove that for any two positive integers g and k, there exist graphs of girth at least g and chromatic number at least k. Thus, restricting girth alone does not help bound the chromatic number. However, if we forbid certain tree structure in addition to girth restriction, then it is possible to bound the chromatic number. Randerath determined several such tree structures, and conjectured that if a graph is fork-free and triangle-free, then it is 3-colorable, where a fork is a star K1,4 with two branches subdivided once. The main result of this thesis is that Randerath's conjecture is true for graphs with odd girth at least 7. We also give an outline of a proof that Randerath's conjecture holds for graphs with maximum degree 4.

Non-Asymptotic Bounds for Prediction Problems and Density Estimation

Series
Dissertation Defense
Time
Tuesday, May 1, 2012 - 15:00 for 2 hours
Location
Skiles 005
Speaker
Stanislav MinskerSchool of Mathematics, Georgia Tech
This dissertation investigates the statistical learning scenarios where a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active learning. Given a random couple (X,Y)\in R^d\times {\pm 1} with unknown distribution P, the goal of binary classification is to predict a label Y based on the observation X. The prediction rule is constructed based on the observations (X_i,Y_i)_{i=1}^n sampled from P. The concept of active learning can be informally characterized as follows: on every iteration, the algorithm is allowed to request a label Y for any instance X which it considers to be the most informative. The contribution of this work consists of two parts: first, we provide the minimax lower bounds for performance of the active learning methods under certain assumptions. Second, we propose an active learning algorithm which attains nearly optimal rates over a broad class of underlying distributions and is adaptive with respect to the unknown parameters of the problem. The second part of this work is related to sparse recovery in the framework of dictionary learning. Let (X,Y) be a random couple with unknown distribution P, with X taking its values in some metric space S and Y - in a bounded subset of R. Given a collection of functions H={h_t}_{t\in \mb T} mapping S to R, the goal of dictionary learning is to construct a prediction rule for Y given by a linear (or convex) combination of the elements of H. The problem is sparse if there exists a good prediction rule that depends on a small number of functions from H. We propose an estimator of the unknown optimal prediction rule based on penalized empirical risk minimization algorithm. We show that proposed estimator is able to take advantage of the possible sparse structure of the problem by providing probabilistic bounds for its performance. Finally, we provide similar bounds in the density estimation framework.

Perturbation Theory and its Application to Complex Biological Networks --A quantification of systematic features of biological networks

Series
Dissertation Defense
Time
Tuesday, April 24, 2012 - 11:00 for 2 hours
Location
Skiles 006
Speaker
Yao LiSchool of Mathematics, Georgia Tech
The primary objective of this thesis is to make a quantitative study of complex biological networks. Our fundamental motivation is to obtain the statistical dependency between modules by injecting external noise. To accomplish this, a deep study of stochastic dynamical systems would be essential. The first part is about the stochastic dynamical system theory. The classical estimation of invariant measures of Fokker-Planck equations is improved by the level set method. Further, we develop a discrete Fokker-Planck-type equation to study the discrete stochastic dynamical systems. In the second part, we quantify systematic measures including degeneracy, complexity and robustness. We also provide a series of results on their properties and the connection between them. Then we apply our theory to the JAK-STAT signaling pathway network.

Markov Chains at the Interface of Combinatorics, Computing, and Statistical Physics

Series
Dissertation Defense
Time
Wednesday, March 14, 2012 - 13:00 for 2 hours
Location
Skiles 005
Speaker
Amanda Pascoe StreibSchool of Mathematics, Georgia Tech
The fields of statistical physics, discrete probability, combinatorics, and theoretical computer science have converged around efforts to understand random structures and algorithms. Recent activity in the interface of these fields has enabled tremendous breakthroughs in each domain and has supplied a new set of techniques for researchers approaching related problems. This thesis makes progress on several problems in this interface whose solutions all build on insights from multiple disciplinary perspectives. First, we consider a dynamic growth process arising in the context of DNA-based self-assembly. The assembly process can be modeled as a simple Markov chain. We prove that the chain is rapidly mixing for large enough bias in regions of Z^d. The proof uses a geometric distance function and a variant of path coupling in order to handle distances that can be exponentially large. We also provide the first results in the case of fluctuating bias, where the bias can vary depending on the location of the tile, which arises in the nanotechnology application. Moreover, we use intuition from statistical physics to construct a choice of the biases for which the Markov chain M_{mon} requires exponential time to converge. Second, we consider a related problem regarding the convergence rate of biased permutations that arises in the context of self-organizing lists. The Markov chain M_{nn} in this case is a nearest-neighbor chain that allows adjacent transpositions, and the rate of these exchanges is governed by various input parameters. It was conjectured that the chain is always rapidly mixing when the inversion probabilities are positively biased, i.e., we put nearest neighbor pair x < y in order with bias 1/2 <= p_{xy} <= 1 and out of order with bias 1-p_{xy}. The Markov chain M_{mon} was known to have connections to a simplified version of this biased card-shuffling. We provide new connections between M_{nn} and M_{mon} by using simple combinatorial bijections, and we prove that M_{nn} is always rapidly mixing for two general classes of positively biased {p_{xy}}. More significantly, we also prove that the general conjecture is false by exhibiting values for the p_{xy}, with 1/2 <= p_{xy} <= 1 for all x < y, but for which the transposition chain will require exponential time to converge. Finally, we consider a model of colloids, which are binary mixtures of molecules with one type of molecule suspended in another. It is believed that at low density typical configurations will be well-mixed throughout, while at high density they will separate into clusters. This clustering has proved elusive to verify, since all local sampling algorithms are known to be inefficient at high density, and in fact a new nonlocal algorithm was recently shown to require exponential time in some cases. We characterize the high and low density phases for a general family of discrete interfering binary mixtures by showing that they exhibit a "clustering property" at high density and not at low density. The clustering property states that there will be a region that has very high area, very small perimeter, and high density of one type of molecule. Special cases of interfering binary mixtures include the Ising model at fixed magnetization and independent sets.

Pages