Non-negative CP tensor decomposition to identify response signatures in omics time-course experiments

Mathematical Biology Seminar
Wednesday, April 27, 2022 - 10:00am for 1 hour (actually 50 minutes)
Anna Konstorum – Yale University
Bo Lin

A central goal of biological experiments that generate omics time-course data is the discovery of patterns, or signatures, of response. A natural representation of such data is a third-order tensor. For example, if the dataset comes from a bulk RNA-seq experiment, which measures tissue-level gene expression at multiple time points, the data can be structured into a gene-by-subject-by-time tensor. We consider the use of a non-negative CANDECOMP/PARAFAC (CP) decomposition (NCPD) of the tensor to derive rank-one components that correspond to biologically meaningful signatures. To assess whether over-factoring has occurred in a model, we develop the maximum internal n-similarity (mINS) score. We use the mINS score, together with other metrics, to choose a model rank for downstream analysis. We show that on time-course data profiling vaccination responses against the influenza and Bordetella pertussis pathogens, our NCPD pipeline yields novel and informative signatures of response. We finish with outstanding research challenges in the application of tensor decomposition to modern biological datasets.
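
For readers unfamiliar with NCPD, the following is a minimal illustrative sketch, not the speaker's pipeline. It fits a non-negative CP model to a synthetic gene-by-subject-by-time tensor using multiplicative updates, a standard choice assumed here because the abstract does not name an algorithm. The function names (`khatri_rao`, `unfold`, `ncpd`), the tensor sizes, and the rank are all hypothetical. The mINS score is not shown, since its definition is not given in the abstract.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, a.shape[1])


def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the other axes in order."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def ncpd(x, rank, n_iter=500, eps=1e-12, seed=0):
    """Non-negative CP decomposition of a third-order tensor (illustrative sketch).

    Uses multiplicative updates, which keep every factor entry non-negative
    as long as the data and the random initialisation are non-negative.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            # Numerator: unfolded data times the Khatri-Rao product of the other factors.
            numer = unfold(x, mode) @ khatri_rao(first, second)
            # Denominator: current factor times the Hadamard product of Gram matrices.
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] *= numer / (factors[mode] @ gram + eps)
    return factors


def reconstruct(factors):
    """Sum of rank-one components: outer products of matching factor columns."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


# Synthetic example (assumed sizes): 30 genes, 8 subjects, 5 time points, 2 signatures.
rng = np.random.default_rng(42)
true_factors = [rng.random((n, 2)) for n in (30, 8, 5)]
tensor = reconstruct(true_factors)

factors = ncpd(tensor, rank=2)
rel_err = np.linalg.norm(tensor - reconstruct(factors)) / np.linalg.norm(tensor)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch, column r of the three factor matrices gives the gene loadings, subject loadings, and temporal profile of component r. Those three vectors together are what the abstract calls a rank-one signature.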