Seminar: Samantha Petti, Harvard University, Probability and combinatorics in the tree of life: How stochastic processes create functional biological sequences
January 10 | 4:15 pm - 5:15 pm EST
The stochastic processes of evolution have generated DNA, RNA, and protein sequences. These sequences determine how these entities chemically interact with themselves and each other, form physical structures, and functionally behave as signals and/or machines within cells. My research involves reconstructing the history of the stochastic processes that led to the sequences we observe today and developing methods to make biological predictions from sequences.
I will describe a new method for arranging related protein or RNA sequences into Multiple Sequence Alignments (MSAs) where each column corresponds to a position in an unobserved ancestral sequence. The patterns of conservation and covariation in an MSA can be leveraged for a variety of downstream tasks, including structure prediction. Generating an MSA is typically treated as a preprocessing step. Instead, we designed a method that learns an MSA jointly with a downstream machine learning task. To do so, we reformulated a classic dynamic programming algorithm so that it outputs a probability distribution over alignments and is differentiable. Using contact prediction as a case study, we show that it is possible to learn MSAs in this manner, and that for many protein and RNA families, doing so improves the accuracy of the predicted contacts. I will also briefly discuss (i) proper benchmarking of methods involving biological sequence data using graph algorithms, and (ii) predicting how the fitness of yeast emerges as a function of its DNA sequence.
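As a rough illustration of what it can mean for an alignment algorithm to output a probability distribution over alignments and be differentiable, the sketch below smooths a global pairwise alignment recursion by replacing each max with a log-sum-exp. This is a generic textbook-style construction and not necessarily the method presented in the talk. The function name `soft_align`, the linear gap penalty, the `temperature` parameter, and the toy scores are all assumptions made for the example.

```python
import numpy as np


def _logsumexp(values):
    """Numerically stable log(sum(exp(values))) for a short list of floats."""
    values = np.asarray(values, dtype=float)
    peak = values.max()
    if np.isneginf(peak):
        return -np.inf
    return peak + np.log(np.exp(values - peak).sum())


def soft_align(scores, gap=-1.0, temperature=1.0):
    """Smoothed global alignment (hypothetical helper, illustration only).

    scores[i, j] is the score for aligning position i of one sequence to
    position j of the other. Returns (soft_score, match_prob), where
    match_prob[i, j] is the probability that i and j are aligned under a
    Boltzmann distribution over all alignment paths.
    """
    s = np.asarray(scores, dtype=float) / temperature
    g = gap / temperature
    n, m = s.shape

    # Forward pass: log of the total weight of all paths from (0, 0) to (i, j).
    fwd = np.full((n + 1, m + 1), -np.inf)
    fwd[0, 0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            terms = []
            if i > 0 and j > 0:
                terms.append(fwd[i - 1, j - 1] + s[i - 1, j - 1])  # match
            if i > 0:
                terms.append(fwd[i - 1, j] + g)  # gap in second sequence
            if j > 0:
                terms.append(fwd[i, j - 1] + g)  # gap in first sequence
            fwd[i, j] = _logsumexp(terms)

    # Backward pass: log of the total weight of all paths from (i, j) to (n, m).
    bwd = np.full((n + 1, m + 1), -np.inf)
    bwd[n, m] = 0.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            terms = []
            if i < n and j < m:
                terms.append(s[i, j] + bwd[i + 1, j + 1])
            if i < n:
                terms.append(g + bwd[i + 1, j])
            if j < m:
                terms.append(g + bwd[i, j + 1])
            bwd[i, j] = _logsumexp(terms)

    log_z = fwd[n, m]
    # Weight of paths that use the match (i, j), divided by the total weight.
    match_prob = np.exp(fwd[:n, :m] + s + bwd[1:, 1:] - log_z)
    return temperature * log_z, match_prob


# Toy example with assumed scores: +2 for identical letters, -1 otherwise.
seq_a, seq_b = "GATTACA", "GATCA"
toy_scores = np.where(
    np.array(list(seq_a))[:, None] == np.array(list(seq_b))[None, :], 2.0, -1.0
)
soft_score, match_prob = soft_align(toy_scores, gap=-1.0, temperature=0.5)
print(match_prob.round(2))
```

In this sketch the gradient of `soft_score` with respect to `scores` equals `match_prob`, which is what would let a downstream loss push information back into the alignment scores. Lowering `temperature` concentrates the distribution on the best-scoring alignment, so the classic hard alignment is recovered in the limit.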