Sophia Lu
Research Interests
Efficient sampling algorithms for black-box models, generative modeling and synthetic data, distributional regression, probabilistic modeling and Bayesian methods, and approximate Bayesian inference, with applications in areas such as bioengineering, genetics and genomics, astrophysics, and econometrics.
About
I am a fifth-year Ph.D. candidate in the Department of Statistics at Stanford University, where I am fortunate to be advised by Wing H. Wong.
Broadly, my research interests lie at the intersection of efficient Bayesian inference and the development of robust, interpretable, and theoretically grounded statistical machine learning methods. My work aims to bridge the gap between advanced statistical methodologies and real-world applications by delivering reliable guarantees, with the goal of advancing principled statistical foundations for scientific discovery.
Before pursuing my doctoral studies, I graduated from Stanford University with honors in Mathematical and Computational Science and a minor in Mathematics. My research is supported by the Stanford Data Science Graduate Fellowship, the Two Sigma Graduate Fellowship Fund, a Google Cloud Academic Research Grant (Co-PI), and a Stanford HAI-Google Cloud Credits Grant (Co-PI).
Research
Publications and Preprints
Univariate-Guided Sparse Regression for Biobank-Scale High-Dimensional Omics Data
Joshua Richland, Tuomo Kiiskinen, William Wang, Wenhui Sophia Lu, Balasubramanian Narasimhan, Trevor Hastie, Manuel Rivas, Robert Tibshirani
Steering Protein Generative Models at Test-Time for Guided AAV2 Capsid Design
Ben Viggiano*, Wenhui Sophia Lu*, Xiaowei Zhang*, Luis Santiago Mille-Fragoso, Xiaojing J Gao, Euan Ashley, Wing Hung Wong
Accepted to Proceedings & Oral at the Pacific Symposium on Biocomputing (PSB), 2026
ProVADA: Generating Subcellular Protein Variants via Ensemble-Guided Test-Time Steering
Wenhui Sophia Lu*, Xiaowei Zhang*, Luis Santiago Mille-Fragoso, Haoyu Dai, Xiaojing J Gao, Wing Hung Wong
Spotlight & Oral at Generative AI for Biology Workshop, ICML
Likelihood-Free Adaptive Bayesian Inference via Nonparametric Distribution Matching
Wenhui Sophia Lu, Wing Hung Wong
Generative Modeling for Tabular Data via Penalized Optimal Transport Network
Wenhui Sophia Lu*, Chenyang Zhong*, Wing Hung Wong
Package available here
Submitted
Comparison of REML methods for the study of phenome-wide genetic variation
Damian Pavlyshyn, Wenhui Sophia Lu, Iain M. Johnstone, and Jacqueline L. Sztepanacz
Under revision at Genetics
Sc-compReg enables the comparison of gene regulatory networks between conditions using single-cell data
Zhana Duren*, Wenhui Sophia Lu*, Joseph G. Arthur, Preyas Shah, Jingxue Xin, Francesca Meschi, Miranda Lin Li, Corey M. Nemec, Yifeng Yin, and Wing Hung Wong
Nature Communications, 2021
* indicates equal contribution
Invited and Contributed Talks
Likelihood-Free Adaptive Bayesian Inference via Nonparametric Distribution Matching
Efficient Likelihood-Free Adaptive Bayesian Inference
Efficient Generative Modeling via Penalized Optimal Transport Network
ProVADA: Generation of Subcellular Protein Variants via Ensemble-Guided Test-Time Steering
Likelihood-Free Adaptive Bayesian Inference
An Introduction to Approximate Bayesian Computation and Likelihood-Free Inference
Guest lectures (3 sessions), Applied Bayesian Statistics | STATS 371, Stanford University, May 2025.
Towards faithful synthetic data generation via penalized optimal transport network
Modern Bayesian Modeling and Adaptive Bayesian Inference
Guest lecture, Topics in Computing for Data Science | STATS/BIODS 352, Stanford University, Apr 2025.
Towards faithful synthetic data generation via penalized optimal transport network
Towards faithful synthetic data generation via penalized optimal transport network
Modern Bayesian modeling and adaptive Bayesian inference
Selected Presentations
Efficient Likelihood-Free Adaptive Bayesian Inference
ProVADA: Generation of Subcellular Protein Variants via Ensemble-Guided Test-Time Steering
Stanford Bio-X Interdisciplinary Initiatives Seed Grants Poster Session, Aug 2025.
Selected for one of 11 Rank 1 poster awards (among 336 poster submissions).
Efficient Generative Modeling via Penalized Optimal Transport Network
Efficient Generative Modeling via Penalized Optimal Transport Network
The Past, Present & Future of Statistics in the Era of AI, The George Washington University, May 2025.
Gratefully supported by a travel award from NSF.
Efficient Generative Modeling via Penalized Optimal Transport Network
Optimization and Statistical Learning Workshop, Columbia University, Apr 2025.
Gratefully supported by a travel award from NSF.
Efficient Generative Modeling via Penalized Optimal Transport Network
Statistics and Optimal Transport Workshop, Columbia University, Mar 2025.
Gratefully supported by a travel award.
Efficient Generative Modeling via Penalized Optimal Transport Network
Contact
sophialu (at) stanford (dot) edu