Education
Stanford
2014 - 2020 Ph.D. |
Ph.D. in Computer Science, applying deep learning to study regulatory genomics.
Ph.D. Advisor: Anshul Kundaje. |
MIT
2009 - 2013 B.S. |
B.S. Computer Science with Molecular Biology, Minor in Mathematics
Undergraduate GPA: 5.0/5.0 |
Experience
Kundaje Lab
Sep 2014 - Sep 2020 |
Palantir Technologies
June 2013 - Sep 2014 |
Forward Deployed Engineer for the Healthcare Team. |
Selected Talks
ICML 2020 July 2020 |
Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Domain Adaptation Video here. |
ISMB 2019 July 2019 |
GkmExplain: Fast and Accurate Interpretation of Nonlinear Gapped k-mer SVMs Video here. |
Biological Data Science Nov 2018 |
Suggested best practices for interpreting deep learning models of regulatory DNA Slides here. |
NVIDIA GTC March 2018 |
Not Just a Black Box: Interpretable Deep Learning for Genomics and Beyond Video here. |
NIPS MLCB Dec 2017 |
TF-MoDISco: Deep learning non-redundant, predictive sequence motifs of transcription factors Video here. |
ICML 2017 Aug 2017 |
Learning Important Features Through Propagating Activation Differences Video and slides here and here, paper here. |
Deep Learning In Healthcare Summit, Boston May 2017 |
Not Just a Black Box: Interpretable Deep Learning for Genomics Interview here. |
CEHG Symposium March 2016 |
Not Just a Black Box: Interpretable Deep Learning for Genomics Video here. |
Selected Publications
ICML Workshop on Computational Biology July 2020 |
Look at the Loss: Towards Robust Detection of False Positive Feature Interactions Learned by Neural Networks on Genomic Data. Mara Finkelstein*, Avanti Shrikumar*†, Anshul Kundaje† *co-first authors † co-corresponding authors Novel strategy to detect when feature interactions learned by a neural network may be false positives, by looking at the impact that the learned interaction effect has on the model's prediction loss on held-out data. |
ICML 2020 July 2020 |
Maximum Likelihood With Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation. Amr Alexandari*, Anshul Kundaje†, Avanti Shrikumar*† *co-first authors † co-corresponding authors Algorithm that gives state-of-the-art results on a problem called domain adaptation to label shift, which arises when adapting a trained classifier to perform well in a scenario where the class proportions differ from those seen when the classifier was first trained (e.g. adapting a disease predictor to account for a surge in cases due to a pandemic). |
Proceedings of the ISMB July 2019 |
GkmExplain: Fast and Accurate Interpretation of Nonlinear Gapped k-mer SVMs. Avanti Shrikumar*†, Eva Prakash*, Anshul Kundaje† *co-first authors † co-corresponding authors Computationally efficient algorithm for explaining individual predictions made by nonlinear gapped k-mer SVMs trained on genomic sequences. |
Nature Biotechnology May 2019 |
Kipoi: accelerating the community exchange and reuse of predictive models for genomics. Žiga Avsec*†, Roman Kreuzhuber*, Johnny Israeli, Nancy Xu, Jun Cheng, Avanti Shrikumar, Abhimanyu Banerjee, Daniel S. Kim, Lara Urban, Anshul Kundaje†, Oliver Stegle†, Julien Gagneur† *co-first authors † co-corresponding authors Model zoo for genomics. I was involved in designing the API and converted the DeepBind models to Keras using the code here. |
arXiv July 2018 |
Computationally Efficient Measures of Internal Neuron Importance. Avanti Shrikumar*†, Jocelin Su*, Anshul Kundaje†. *co-first authors † co-corresponding authors Showed an equivalence between Total Conductance, a recently proposed method for computing internal neuron importance, and Path Integrated Gradients, thereby providing a computationally efficient way to compute the former. The reformulation of Total Conductance was referred to as Neuron Integrated Gradients. Benchmarked Neuron Integrated Gradients against DeepLIFT. Colab notebook reproducing results here. |
Journal of the Royal Society Interface April 2018 |
Opportunities And Obstacles For Deep Learning In Biology And Medicine. Collaboratively written review on deep learning for biology and medicine. I wrote the section on interpretation. Note that the final published version was stripped down due to word limits. I have linked to the original submission from my end. |
arXiv Feb 2018 |
A Flexible and Adaptive Framework for Abstention Under Class Imbalance. Avanti Shrikumar*†, Amr Alexandari*, Anshul Kundaje†. *co-first authors † co-corresponding authors Proposes a framework for identifying which examples to abstain on in order to optimize for a specific metric of interest. Leverages the insight that because the calibrated probabilities can be used as a proxy for the true label, optimization is possible even when the ground-truth labels are not known. Derived computationally efficient algorithms for optimizing auROC, sensitivity at a target specificity, and weighted Cohen's kappa. Showed that by leveraging strategies for domain adaptation to label shift, the abstention algorithms can apply even in situations where the test-set distribution has a different class imbalance compared to the training-set distribution. |
ICML Workshop on Computational Biology (Spotlight Talk, Best Poster) June 2017 |
Separable Fully-Connected Layers Improve Deep Learning Models For Genomics. Amr Alexandari*, Avanti Shrikumar*, Anshul Kundaje. *co-first authors Adapts deep learning models for genomics by leveraging known patterns in transcription factor binding data. |
ICML April 2017 |
Learning Important Features Through Propagating Activation Differences. Avanti Shrikumar, Peyton Greenside, Anshul Kundaje. Details a computationally efficient algorithm to explain individual predictions of a deep learning model by assigning contribution scores to individual parts of the input. Code here. |
BioRxiv January 2017 |
Reverse-Complement Parameter Sharing Improves Deep Learning Models For Genomics. Avanti Shrikumar*, Peyton Greenside*, Anshul Kundaje. *co-first authors Adapts deep learning models for genomics by leveraging the reverse-complement property of DNA sequence. |
Circulation Research Dec 2014 |
Transcriptional Reversion of Cardiac Myocyte Fate During Mammalian Cardiac Regeneration. O'Meara CC, Wamstad JA, Gladstone RA, Fomovsky GM, Butty VL, Shrikumar A, Gannon JB, Boyer LA, Lee RT. Collaboration between Boyer lab at MIT and Lee lab at Harvard. I analysed RNA-seq data to study transcriptional reversion. |
Cell Sep 2012 |
Dynamic and coordinated epigenetic regulation of developmental transitions in the cardiac lineage. Wamstad JA, Alexander JM, Truty RM, Shrikumar A, Li F, Eilertson KE, Ding H, Wylie JN, Pico AR, Capra JA, Erwin G, Kattman SJ, Keller GM, Srivastava D, Levine SS, Pollard KS, Holloway AK, Boyer LA†, Bruneau BG†. † co-corresponding authors Collaboration between Boyer lab at MIT and Gladstone Institutes. I performed the bulk of bioinformatics analysis at the Boyer lab. 355 citations as of Jan 2018. |
Recognition
HHMI International Student Research Fellowship 2016 |
Awarded to 20 international students. Announcement here. |
Stanford Bio-X Fellowship 2016 |
Bio-X fellowships are awarded to about 25 students annually for interdisciplinary research. Announcement here. |
Microsoft Women's Fellowship 2016 |
Awarded to one woman per participating university who is pursuing or interested in pursuing a PhD. Announcement here. |
Outstanding Research Award Spring 2013 |
Awarded to 3 projects completed as part of MIT's SuperUROP program. My project was done in the Kellis lab. Announcement here. |
Sophomore Academic Excellence Award Fall 2011 |
AIChE Sophomore Academic Excellence Award for the student with the highest GPA among chemical engineers after sophomore year at MIT. Announcement here. |
IGCSE Examinations June 2006 & 2007 |
The IGCSE was administered in roughly 300 schools in India. I had the highest score in India in Extended Mathematics Without Coursework (June 2006; press release), Physics (June 2007) and Geography (June 2007). |
Selected Coursework
Stanford | Probabilistic Graphical Models | CS 228 | Winter 2015 | A+ |
Stanford | Machine Learning | CS 229 | Fall 2015 | A |
MIT | Statistics for Applications | 18.443 | Spring 2013 | A |
MIT | Advanced Computational Biology | 6.878 | Fall 2012 | A+ |
MIT | Design and Analysis of Algorithms | 6.046 | Spring 2012 | A |
MIT | Software Construction | 6.005 | Spring 2012 | A+ |
MIT | Evolutionary Biology | 7.33 | Spring 2012 | A+ |