## Seminars and Colloquia by Series

### Clustering strings with mutations using an expectation-maximization algorithm

Series
Mathematical Biology Seminar
Time
Wednesday, October 2, 2019 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Afaf Saaidi Georgia Tech

An expectation-maximization (EM) algorithm is a powerful clustering method that was initially developed to fit Gaussian mixture distributions. In the absence of a particular probability density function, an EM algorithm aims to estimate the "best" function that maximizes the likelihood of data being generated by the model. We present an EM algorithm which addresses the problem of clustering "mutated" substrings of similar parent strings such that each substring is correctly assigned to its parent string. This problem is motivated by the process of simultaneously reading similar RNA sequences during which various substrings of the sequence are produced and could be mutated; that is, a substring may have some letters changed during the reading process. Because the original RNA sequences are similar, a substring is likely to be assigned to the wrong original sequence. We describe our EM algorithm and present a test on a simulated benchmark which shows that our method yields a better assignment of the substrings than what has been achieved by previous methods. We conclude by discussing how this assignment problem applies to RNA structure prediction.
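
To make the idea of EM-based assignment concrete, here is a minimal Python sketch. It is not the speaker's algorithm. It assumes a toy model in which the parent strings are known and of equal length, each read comes with a known start offset and fits inside the parent, and each letter mutates independently with an unknown rate `eps` to one of the three other letters of a four-letter alphabet. The function name `em_assign` and all parameters are hypothetical.

```python
import math

def em_assign(reads, parents, n_iter=50, eps=0.1):
    """Toy EM for assigning mutated reads to known parent strings.

    reads   -- list of (offset, string) pairs; each read is assumed to fit
               inside the parent window starting at `offset`
    parents -- list of equal-length strings over a 4-letter alphabet
    Returns (responsibilities, mixing weights, estimated mutation rate).
    """
    k = len(parents)
    weights = [1.0 / k] * k
    # Mismatch counts between each read and the matching window of each parent.
    mism = [[sum(a != b for a, b in zip(seq, p[off:off + len(seq)]))
             for p in parents] for off, seq in reads]
    total_letters = sum(len(seq) for _, seq in reads)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from each parent.
        resp = []
        for (_, seq), row in zip(reads, mism):
            logp = [math.log(weights[j])
                    + row[j] * math.log(eps / 3)
                    + (len(seq) - row[j]) * math.log(1 - eps)
                    for j in range(k)]
            top = max(logp)  # subtract the max for numerical stability
            unnorm = [math.exp(v - top) for v in logp]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])
        # M-step: re-estimate the mixing weights (kept away from zero so the
        # logarithm above stays defined) and the mutation rate.
        weights = [max(sum(r[j] for r in resp) / len(reads), 1e-12)
                   for j in range(k)]
        expected_mism = sum(r[j] * row[j]
                            for r, row in zip(resp, mism) for j in range(k))
        eps = min(max(expected_mism / total_letters, 1e-6), 0.5)
    return resp, weights, eps
```

Each row of the returned responsibilities gives the estimated probability that a read came from each parent; taking the largest entry per row yields a hard assignment. Because the parents are similar, a read that covers no distinguishing position receives nearly equal responsibilities, which is the ambiguity the abstract describes.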

### Insertions on Double Occurrence Words

Series
Mathematical Biology Seminar
Time
Wednesday, September 25, 2019 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Daniel Cruz Georgia Tech

A double occurrence word (DOW) is a word in which every symbol appears exactly twice; two DOWs are equivalent if one is a symbol-to-symbol image of the other. In the context of genomics, DOWs and operations on DOWs have been used in studies of DNA rearrangement. By modeling the DNA rearrangement process using DOWs, it was observed that over 95% of the scrambled genome of the ciliate Oxytricha trifallax could be described by iterative insertions of the "repeat pattern" and the "return pattern". These patterns generalize square and palindromic factors of DOWs, respectively. We introduce a notion of inserting repeat/return words into DOWs and study how two distinct insertions into the same word can produce equivalent DOWs. Given a DOW w, we characterize the structure of w which allows two distinct insertions to yield equivalent DOWs. This characterization depends on the locations of the insertions and on the length of the inserted repeat/return words and implies that when one inserted word is a repeat word and the other is a return word, then both words must be trivial (i.e., have only one symbol). The characterization also introduces a method to generate families of words recursively.
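
The two definitions at the start of this abstract can be illustrated with a short Python sketch (not code from the talk). The function names `is_dow`, `canonical`, and `equivalent` are hypothetical, and equivalence is assumed here to mean equality up to a one-to-one renaming of symbols, tested by relabeling symbols in order of first appearance.

```python
from collections import Counter

def is_dow(word):
    """Return True if every symbol in `word` appears exactly twice."""
    return all(count == 2 for count in Counter(word).values())

def canonical(word):
    """Relabel symbols as 1, 2, 3, ... in order of first appearance."""
    labels = {}
    return tuple(labels.setdefault(s, len(labels) + 1) for s in word)

def equivalent(u, v):
    """Assumed notion: u and v are DOWs that differ only by renaming symbols."""
    return is_dow(u) and is_dow(v) and canonical(u) == canonical(v)
```

For instance, `equivalent("abab", "cdcd")` holds because renaming a to c and b to d maps one word onto the other, while `"abab"` and `"abba"` are both DOWs but are not equivalent under this notion.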

### Species network inference under the coalescent model

Series
Mathematical Biology Seminar
Time
Wednesday, September 18, 2019 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Hector Banos Georgia Tech

When hybridization plays a role in evolution, networks are necessary to describe species-level relationships. In this talk, we show that most topological features of a level-1 species network (networks with no interlocking cycles) are identifiable from gene tree topologies under the network multispecies coalescent model (NMSC). We also present the theory behind NANUQ, a new practical method for the inference of level-1 networks under the NMSC.

### The geometry of phylogenetic tree spaces

Series
Mathematical Biology Seminar
Time
Wednesday, September 11, 2019 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Bo Lin Georgia Tech

Phylogenetic trees are the fundamental mathematical representation of evolutionary processes in biology. As data objects, they are characterized by the challenges associated with "big data," as well as the complication that their discrete geometric structure results in a non-Euclidean phylogenetic tree space, which poses computational and statistical limitations.

In this talk, I will compare the geometric and statistical properties of a well-studied framework, the BHV space, and an alternative framework that we propose, which is based on tropical geometry. Our framework exhibits analytic, geometric, and topological properties that are desirable for theoretical studies in probability and statistics, as well as increased computational efficiency. I also demonstrate our approach on an example of seasonal influenza data.

### Some combinatorics of RNA branching

Series
Mathematical Biology Seminar
Time
Wednesday, September 4, 2019 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Christine Heitsch Georgia Tech

Understanding the folding of RNA sequences into three-dimensional structures is one of the fundamental challenges in molecular biology.  For example, the branching of an RNA secondary structure is an important molecular characteristic yet difficult to predict correctly.  However, recent results in geometric combinatorics (both theoretical and computational) yield new insights into the distribution of optimal branching configurations, and suggest new directions for improving prediction accuracy.

### Local Immunodeficiency: where do we stand?

Series
Mathematical Biology Seminar
Time
Wednesday, August 28, 2019 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Lyonia Bunimovich Georgia Tech

### Organizational meeting

Series
Mathematical Biology Seminar
Time
Wednesday, August 21, 2019 - 11:00 for 30 minutes
Location
Skiles 006
Speaker
Christine Heitsch Georgia Tech

A brief meeting to discuss the plan for the semester, followed by an informal discussion over lunch (most likely at Ferst Place).

### Stochastic models for the transmission and establishment of HIV infection

Series
Mathematical Biology Seminar
Time
Wednesday, March 27, 2019 - 10:00 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Dan Coombs UBC (visiting Emory)

The likelihood of HIV infection following risky contact is believed to be low. This suggests that the infection process is stochastic and governed by rare events. I will present mathematical branching process models of early infection and show how we have used them to gain insights into the duration of the undetectable phase of HIV infection, the likelihood of success of pre- and post-exposure prophylaxis, and the effects of prior infection with HSV-2. Although I will describe quite a bit of theory, I will try to keep giant and incomprehensible formulae to a minimum.
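
As a rough illustration of why a branching process view makes infection a rare event, the Python sketch below computes the extinction probability of a simple Galton-Watson process. This is a textbook model, not the speaker's: it assumes each infected cell produces a Poisson-distributed number of newly infected cells with mean `r0`, so the extinction probability `q` is the smallest solution of `q = exp(r0 * (q - 1))`. The function name and parameters are hypothetical.

```python
import math

def extinction_probability(r0, tol=1e-12, max_iter=10_000):
    """Smallest fixed point of q = exp(r0 * (q - 1)), found by iterating from 0.

    Assumes a Galton-Watson process started from one infected cell with
    Poisson(r0) offspring. Convergence is slow when r0 is very close to 1.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(r0 * (q - 1.0))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q
```

Under these assumptions, an infection with `r0 = 2` still dies out roughly one time in five when started from a single cell, and any `r0` below 1 gives extinction with probability 1.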

### Inference of evolutionary dynamics of heterogeneous cancer and viral populations

Series
Mathematical Biology Seminar
Time
Wednesday, February 27, 2019 - 11:01 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Pavel Skums GSU/CDC

Genetic diversity of cancer cell populations and intra-host viral populations is one of the major factors influencing disease progression and treatment outcome. However, the evolutionary dynamics of such populations remain poorly understood. Quantification of selection is a key step to understanding the evolutionary mechanisms driving cancer and viral diseases. We will introduce a mathematical model and an algorithmic framework for inference of fitness landscapes of heterogeneous populations from genomic data. It is based on a maximum likelihood approach, whose objective is to estimate a vector of clone/strain fitnesses that best fits the observed tumor phylogeny, the observed population structure, and the dynamical system describing evolution of the population as a branching process. We will discuss our approach to solving the problem by transforming the original continuous maximum likelihood problem into a discrete optimization problem, which can be considered a variant of the scheduling problem with precedence constraints and a non-linear cumulative cost function.

### Exploring the impact of inoculum dose on host immunity and morbidity to inform model-based vaccine design

Series
Mathematical Biology Seminar
Time
Wednesday, January 30, 2019 - 11:00 for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Andreas Handel UGA

Vaccination is an effective method to protect against infectious diseases. An important consideration in any vaccine formulation is the inoculum dose, i.e., the amount of antigen or live attenuated pathogen that is used. Higher doses generally lead to better stimulation of the immune response but might cause more severe side effects and allow for less population coverage in the presence of vaccine shortages. Determining the optimal inoculum dose is an important component of rational vaccine design. A combination of mathematical models with experimental data can help determine the impact of the inoculum dose. We designed mathematical models and fit them to data from influenza A virus (IAV) infection of mice and human parainfluenza virus (HPIV) infection of cotton rats at different inoculum doses. We used the models to predict the level of immune protection and morbidity for different inoculum doses and to explore what an optimal inoculum dose might be. We show how a framework that combines mathematical models with experimental data can be used to study the impact of inoculum dose on important outcomes such as immune protection and morbidity. We find that the impact of inoculum dose on immune protection and morbidity depends on the pathogen, and that protection and morbidity do not always increase with increasing inoculum dose. An intermediate inoculum dose can provide the best balance between immune protection and morbidity, though this depends on the specific weighting of protection and morbidity. Once vaccine design goals are specified with required levels of protection and acceptable levels of morbidity, our proposed framework, which combines data and models, can help in the rational design of vaccines and determination of the optimal amount of inoculum.