Some projects I'm working on & think are beautiful/cool!
The mathematics of evolutionary trees
All of life makes sense in light of evolution, but how can we use evolutionary theory to inform how we analyze biological data?
The evolutionary tree, or phylogeny, can be a source of confounding correlations between species, but it can also serve as a scaffold to infer when & where important traits arose, thereby organizing how we understand biological systems.
How can we use this tree to simplify our analyses of communities, from tropical bird communities facing deforestation to viral communities infecting mammals to bacterial communities called microbiomes?
Every branch in the tree separates our community into two groups: the species below the branch and the species above it. By constructing variables that contrast these two groups, we can find the edges that separate the most-different species. I developed a machine learning algorithm, phylofactorization, that uses the evolutionary tree to make a change of variables corresponding to cutting the tree's branches. This simplifies large communities by summarizing what's going on in terms of a few lineages that capture the most variance.
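To make the idea concrete, here is a minimal sketch of a single step of this kind of edge search. It is an illustration under stated assumptions, not the actual phylofactorization implementation: the nested-tuple tree format, the function names (clade_tips, edges, balance, best_edge), the log-ratio contrast, the "largest variance" objective and the toy abundances are all choices made for this example, and abundances are assumed to be strictly positive.

```python
import numpy as np

def clade_tips(tree):
    """Return the tip names under a nested-tuple tree (a str is a tip)."""
    if isinstance(tree, str):
        return [tree]
    return [tip for child in tree for tip in clade_tips(child)]

def edges(tree):
    """Yield, for every edge below the root, the set of tips beneath it."""
    if isinstance(tree, str):
        return
    for child in tree:
        yield frozenset(clade_tips(child))
        yield from edges(child)

def balance(log_abund, tips, below):
    """Contrast the species below an edge with all other species.

    Returns one value per sample: the scaled difference between the mean
    log-abundance of the two groups (an isometric-log-ratio-style balance).
    """
    in_below = np.array([t in below for t in tips])
    r, s = in_below.sum(), (~in_below).sum()
    scale = np.sqrt(r * s / (r + s))
    return scale * (log_abund[:, in_below].mean(axis=1)
                    - log_abund[:, ~in_below].mean(axis=1))

def best_edge(tree, tips, abund):
    """Score every edge by the variance of its contrast; return the winner."""
    log_abund = np.log(abund)
    scores = {e: balance(log_abund, tips, e).var() for e in set(edges(tree))}
    return max(scores, key=scores.get), scores

# Hypothetical toy data: rows are samples, columns are species A, B, C, D.
# A and B change together across samples; C and D stay constant.
tree = ((("A", "B"), "C"), "D")
tips = ["A", "B", "C", "D"]
abund = np.array([[1, 1, 5, 5],
                  [2, 2, 5, 5],
                  [4, 4, 5, 5],
                  [8, 8, 5, 5]], dtype=float)

winner, scores = best_edge(tree, tips, abund)
print(sorted(winner))
```

In this toy example the winning edge is the one leading to the (A, B) lineage, since the contrast between that clade and everything else varies the most across samples. A full factorization would then cut the tree at that edge and repeat the search within the resulting pieces; that iteration is left out here.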
These inferences on the tree of life point to lineages that might have traits underlying their role in disease or ecosystem function. Consequently, phylofactorization is an algorithm that turns data into insights, motivating further studies of microbial physiology and genome biology to understand what evolutionary events happened along the edges we found.
All pathogens come from somewhere. From coronaviruses in bats, influenzas in swine and birds, Lyme disease in rodents & ticks, and malaria in primates & mosquitoes to Ebola in reservoirs we're currently unsure of, pathogens that are novel to humans have age-old relationships with other animals.
I'm interested in everything from spillover to human epidemics: percolation models of the risk of spillover, clever tricks that use syndromic surveillance to quantify the size of outbreaks of pathogens like SARS-CoV-2 in the US, the evolution of new strains like B.1.1.7, and tools to compare the world's epidemics on timescales of death. The figure you see here shows the excess of patients visiting the doctor with influenza-like illness during the COVID-19 epidemic in 2020.
My work in epidemiology builds on my work in finance. High-stakes forecasting requires high-quality data streams, fast & robust analyses, and knowing when you know enough to act and when you don't.