
About Me

I am a postdoctoral researcher in the Stanford CS department, working with Chris Ré.

Previously, I was a short-term postdoc in the UCLA CS and EE departments, working with Guy Van den Broeck and Lara Dolecek. I received my Ph.D. in electrical engineering from UCLA in December 2016. Before joining UCLA, I studied electrical engineering at the University of Michigan, Ann Arbor.

My CV and my Google Scholar profile.

My Ph.D. dissertation, Algorithms and Coding Techniques for Reliable Data Management and Storage, received the Outstanding Ph.D. Dissertation in Signals & Systems Award from the UCLA EE Department. My M.S. thesis, Novel Coding Strategies for Multi-Level Non-Volatile Memories, received the Edward K. Rice Outstanding Masters Student Award (UCLA HSSEAS) and the Outstanding M.S. Thesis Award (UCLA EE Department).

What's New

  • 11-5: Our work on multi-task weak supervision via low-rank matrix completion was accepted to AAAI 2019!

  • 9-1: I received a top-200 reviewer award for NeurIPS 2018 (along with free registration!).

  • 8-14: I joined the Hazy Research group in the Stanford CS department as a postdoc.

Research Interests

I am broadly interested in problems related to machine learning, statistical inference, and information and coding theory. I often work on the analysis and design of algorithms that must operate on unreliable (incomplete, noisy, corrupted) data.

My approach to these problems involves tools from many disciplines: information theory, statistics, differential geometry, combinatorics, and optimization. Projects I have worked on include:

  • Geometry and structure of data. Modern ML methods require first embedding data into a continuous space — traditionally Euclidean space. However, Euclidean space is a poor fit for many types of structured data (like hierarchies!). We show that non-Euclidean spaces like hyperbolic space (and other manifolds!) are better suited for such embeddings, and we study the limits and tradeoffs of these techniques in our ICML ’18 and ICLR ’19 papers.

  • ML algorithms and noisy test data. What is the impact of noisy test data on learning algorithms? How can we make these algorithms robust to corrupted or incomplete data? Given a limited error-protection budget, which features should we protect to limit divergence from the ideal, noiseless result? We have results for both classification and linear regression.

  • Efficient data synchronization and reconstruction. What is the least amount of information we must exchange to synchronize between two versions of a file, or to reconstruct a core piece of data from noisy samples? My work studies bounds and algorithms for these techniques in IT ’17 and TCOM ’16.

  • Reliable data storage in next-gen memories. New memories have revolutionized the world of storage with their speed and power efficiency. However, modern memories suffer from specific physical limitations that lead to errors and corruption. Novel reliability and error-correcting techniques are critical to the future of these devices. My work develops new data representations (TCOM ’13) and new coding techniques (Comm. Letters ’14), and makes algorithms more robust (TCOM ’17). I am also interested in theoretical frameworks (best of SELSE ’16) to evaluate broad ranges of error-correction techniques.
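To give a flavor of the hyperbolic-embedding idea from the first bullet: in the Poincaré ball model, distance between points grows rapidly as they approach the boundary of the unit ball, which lets tree-like (hierarchical) data be embedded with low distortion — roots near the origin, leaves near the boundary. Below is a minimal sketch of the Poincaré distance; the function name and the specific points are illustrative, not taken from the papers above.

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball model of hyperbolic space.

    d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    Both u and v must lie strictly inside the unit ball.
    """
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

origin = np.array([0.0, 0.0])
mid = np.array([0.5, 0.0])      # halfway to the boundary
near_edge = np.array([0.9, 0.0])  # close to the boundary

# Equal Euclidean steps cost more hyperbolic distance near the boundary:
d1 = poincare_distance(origin, mid)
d2 = poincare_distance(origin, near_edge)
```

Here `d2` is much larger than `2 * d1` even though `near_edge` is less than twice as far from the origin in Euclidean terms — exactly the "room near the boundary" that makes hyperbolic space a natural host for exponentially branching hierarchies.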

I am also very interested in scientific writing and communication. I strongly believe in the importance of clearly and effectively communicating research ideas to a broad and popular audience. Together with my advisor, I have written a book on channel coding for non-volatile memories, a book chapter on advanced error-correction techniques for 3D flash memories, and an expository article on dealing with flash deficiencies.