I am interested in designing learning algorithms and data collection procedures that can lead to machine learning (ML) systems that are guaranteed to be safe and fair when deployed in the real world.
More concretely, I am currently thinking about the following problems.
- Several state-of-the-art machine learning classifiers fail on adversarial inputs: carefully constructed small perturbations of natural inputs. In high-stakes applications such as self-driving cars, defense, and medicine, reliable defenses are imperative. I am working on principled and provable methods for obtaining classifiers that are robust to such perturbations by design.
- When we use ML to make decisions about loan approval, criminal release, etc., we need to ensure that these decisions do not have unintended discriminatory consequences. I am currently exploring formalizations of what it means for a classifier not to make systematic mistakes, and how this goal interacts with the traditional objective of maximizing accuracy.
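To make the notion of an adversarial input concrete, here is a minimal sketch of the fast gradient sign method (FGSM) attack on a toy logistic-regression classifier. The weights and data are synthetic placeholders, and this is one standard attack from the literature, not a description of my own defense work.

```python
import numpy as np

# Toy "trained" logistic-regression classifier (weights are illustrative).
w = np.array([2.0, -1.0])
b = 0.1

def predict(x):
    # Sigmoid probability that x belongs to class 1.
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

x = np.array([0.5, 0.5])   # a natural input with true label y = 1
y = 1

# Gradient of the logistic loss with respect to the input: (p - y) * w.
grad_x = (predict(x) - y) * w

# FGSM: take a small L_inf-bounded step in the direction that increases the loss.
eps = 0.1
x_adv = x + eps * np.sign(grad_x)

# The perturbed input is visually close to x but the classifier is now
# less confident in the correct label.
print(predict(x), predict(x_adv))
```

The perturbation is bounded coordinate-wise by `eps`, which is what makes the change "small" even though it reliably degrades the classifier's confidence.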
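One common way to audit for the discriminatory consequences mentioned above is to measure the demographic parity gap: the difference in positive-decision rates between protected groups. The sketch below uses entirely synthetic decisions; the 0.3/0.5 base rates are assumptions chosen to simulate a biased classifier.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic audit data: a binary protected attribute and a classifier
# whose positive rate is deliberately higher for group 1.
group = rng.integers(0, 2, size=1000)
yhat = rng.random(1000) < (0.3 + 0.2 * group)

# Positive-decision rate within each group.
rate_0 = yhat[group == 0].mean()
rate_1 = yhat[group == 1].mean()

# Demographic parity gap: zero would mean equal treatment under this metric.
gap = abs(rate_1 - rate_0)
print(rate_0, rate_1, gap)
```

Demographic parity is only one of several competing fairness criteria, and enforcing it can trade off against accuracy, which is exactly the interaction described above.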
On a related note, I have previously worked on extrapolating properties of the unseen portion of a distribution. For example: how much training data must we collect to ensure that, with high probability, every element we encounter at deployment has already been seen during training?
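The quantity behind this question is the "missing mass": the probability that a fresh sample falls outside the training set's support. Here is a small simulation for a uniform distribution, where the answer has a closed form; the support size and sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

k = 100   # support size of a uniform toy distribution
n = 500   # number of training samples collected

# Draw a training set and record which symbols were observed.
train = rng.integers(0, k, size=n)
seen = np.unique(train)

# Under the uniform distribution each unseen symbol carries mass 1/k,
# so the empirical missing mass is the fraction of unseen symbols.
missing_mass = 1.0 - len(seen) / k

# Analytic expectation for the uniform case: each symbol is missed
# with probability (1 - 1/k)^n.
expected = (1.0 - 1.0 / k) ** n
print(missing_mass, expected)
```

With n around k log(k) samples the missing mass becomes negligible, which is the coupon-collector flavor of the sample-complexity question above; for non-uniform distributions, estimators in the spirit of Good-Turing are used instead.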
On the theoretical side, I am interested in non-convex optimization and in understanding the conditions under which local methods can solve the non-convex objectives arising in machine learning. I have previously worked on providing guarantees for learning mixtures of Gaussians from streaming data. I am also excited about new methods for circumventing the computational challenges of non-convexity. As an undergraduate, I worked on using the method of moments to perform computationally efficient estimation under indirect supervision.
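As a minimal illustration of why local methods on non-convex objectives need conditions to succeed, the sketch below runs plain gradient descent on the one-dimensional non-convex function f(x) = x^4 - 3x^2. The objective, step size, and starting points are illustrative choices, not drawn from my work.

```python
# f(x) = x^4 - 3x^2 has two symmetric local (and global) minima at
# x = +/- sqrt(1.5) and a saddle-like stationary point at x = 0.
def grad(x):
    return 4.0 * x**3 - 6.0 * x

def gradient_descent(x, lr=0.01, steps=2000):
    # Plain gradient descent: a purely local method.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Which stationary point the method finds depends entirely on the start:
left = gradient_descent(-0.5)   # converges to -sqrt(1.5)
right = gradient_descent(0.5)   # converges to +sqrt(1.5)
print(left, right)
```

Both runs reach a minimizer here because this objective happens to be benign away from the origin; characterizing which non-convex ML objectives share such benign structure is exactly the kind of condition I am interested in.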