I'm a senior PhD candidate in the Department of Computer Science at Stanford, advised by Professor Anshul Kundaje. I develop broadly applicable methods to make deep learning models interpretable, and have applied these methods to study regulatory genomics. In order of my excitement, my PhD projects are:
- TF-MoDISco, a method that leverages input-level importance scores produced by deep learning models or support vector machines (SVMs) to reveal recurring patterns in genomic data.
- DeepLIFT, a computationally efficient method for generating input-level importance scores to explain individual predictions made by a deep learning model.
- Gkmexplain, a computationally efficient method for generating input-level importance scores to explain individual predictions made by nonlinear gapped-kmer support vector machines.
- Bias-Corrected Temperature Scaling, a calibration algorithm that enables domain adaptation to label shift using Expectation Maximization, outperforming recently proposed methods.
- Selective classification via curve optimization, a method for deciding which examples a predictive model should abstain from predicting on in order to optimize rank-based metrics such as auPRC.
- Separable fully-connected layers, a novel architectural component for deep learning models that takes advantage of recurring positional patterns in genomic data.
- Reverse-complement parameter sharing, a novel architectural component for deep learning models that takes advantage of symmetries induced by the double-stranded nature of DNA.
- simDNA, a package for generating simulated regulatory genomic sequences.
I have a bachelor's degree in Computer Science with Molecular Biology from MIT, and I spent a year working as a developer on the Healthcare team at Palantir Technologies before starting my PhD.