News

  • 2024/03 Our paper A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features is available on arXiv. TL;DR: training a neural network on a 1D dataset is equivalent to solving a Lasso problem (see the sketch below). This extends to deep neural networks with up to 4 layers.
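
    For context, here is a minimal sketch of a generic Lasso problem of the kind the result reduces network training to. The paper constructs the specific reflection-feature dictionary from the 1D data; the matrix A below is only a random stand-in (an assumption), not the paper's dictionary.

        # Hedged sketch: generic Lasso problem, min_z 0.5/n * ||A z - y||_2^2 + lam * ||z||_1.
        # The actual dictionary built from the data and its reflections is defined
        # in the paper; here A is just a placeholder random matrix (assumption).
        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(0)
        n, p = 20, 50                       # samples, dictionary size (arbitrary)
        A = rng.standard_normal((n, p))     # placeholder feature matrix
        y = rng.standard_normal(n)          # 1D targets

        model = Lasso(alpha=0.1, fit_intercept=False).fit(A, y)
        z = model.coef_                     # sparse coefficients
        print("nonzero features:", np.count_nonzero(z))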

  • 2024/02 Our paper A Circuit Approach to Constructing Blockchains on Blockchains is available on arXiv. TL;DR: we build a more secure overlay blockchain by reading from and writing to a given set of blockchains.

  • 2023/10 Our paper Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes is available on arXiv. TL;DR: we use max-cut and zonotopes to classify the computational difficulty of training two-layer ReLU neural networks.

  • 2023/09 I finished my internship at Babylon Chain. Thanks a lot to David and Sankha for hosting me. My main responsibilities included user behavior analysis on Bitcoin, an extensive survey of Proof-of-Stake (PoS) liquid staking models and their incentive programs, and a tokenomics study of emerging blockchains such as Akash.

  • 2023/04 Our paper Sketching the Krylov Subspace: Faster Computation of the Entire Ridge Regularization Path (with code) has been accepted to the Journal of Supercomputing (2023). TL;DR: we use polynomial expansion and iterative Hessian sketching to compute the entire regularization path of ridge regression (see the sketch below).
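
    For reference, the textbook way to trace the ridge path uses one SVD and the closed-form solution for every lambda; this is not the paper's Krylov/Hessian-sketch method, just the baseline it accelerates. The data and the lambda grid below are arbitrary assumptions.

        # Baseline closed-form ridge path (NOT the paper's sketching algorithm).
        # One SVD of X lets every point on the path be evaluated cheaply:
        # w(lam) = V diag(s / (s^2 + lam)) U^T y.
        import numpy as np

        rng = np.random.default_rng(0)
        n, d = 200, 30
        X = rng.standard_normal((n, d))
        y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

        lambdas = np.logspace(-3, 3, 50)            # arbitrary grid
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        Uty = U.T @ y
        path = [Vt.T @ (s / (s**2 + lam) * Uty) for lam in lambdas]
        print(len(path), path[0].shape)             # 50 solutions, each in R^d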

  • 2023/03 I updated my final project on bits-back coding with rANS for EE274 Data Compression. The code and report are available here.

  • 2023/03 Our preprint Overparameterized ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery has been updated. TL;DR: we characterize the phase transition for recovering a two-layer ReLU neural network in the teacher-student setting.

  • 2023/02 Our paper A Decomposition Augmented Lagrangian Method for Low-rank Semidefinite Programming is to appear in SIAM Journal on Optimization (2023). We provide a fast solver for general non-smooth semidefinite programs with low-rank structure.

  • 2023/01 Our paper Parallel Deep Neural Networks Have Zero Duality Gap has been accepted as a poster (2023). In short, we show that the convex duality gap of deep neural networks becomes zero under a parallel architecture. For a parallel neural network with m branches, the output is a linear combination of the outputs of m standard fully connected subnetworks (see the sketch below).
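
    A minimal sketch of the parallel architecture described above: m independent fully connected branches whose outputs are linearly combined. The two-layer ReLU branches and the learned combination weights are illustrative assumptions, not the paper's exact setup.

        # Parallel network sketch: output = linear combination of m branches.
        # Branch depth/width and learned combination weights are assumptions.
        import torch
        import torch.nn as nn

        class ParallelNet(nn.Module):
            def __init__(self, d_in, width, m):
                super().__init__()
                self.branches = nn.ModuleList(
                    [nn.Sequential(nn.Linear(d_in, width), nn.ReLU(), nn.Linear(width, 1))
                     for _ in range(m)]
                )
                self.alpha = nn.Parameter(torch.ones(m) / m)   # combination weights

            def forward(self, x):
                outs = torch.cat([b(x) for b in self.branches], dim=1)  # (batch, m)
                return outs @ self.alpha                                # (batch,)

        net = ParallelNet(d_in=10, width=32, m=4)
        print(net(torch.randn(5, 10)).shape)   # torch.Size([5])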

  • 2022/12 I implemented bits-back coding with rANS as the final project for EE274 Data Compression. TL;DR: we compress multisets down to the information limit (see the back-of-the-envelope calculation below)! Check out the report and code!
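
    As a back-of-the-envelope illustration of why this helps: an order-agnostic encoding can save about log2(#orderings) bits relative to encoding the full sequence, which is the savings bits-back coding recovers. The toy symbol counts below are an arbitrary assumption, and this is arithmetic only, not the rANS codec itself.

        # Bits saved by ignoring order: log2( n! / prod(c_i!) ), the number of
        # distinct orderings of the multiset. Toy counts are an assumption.
        from math import lgamma, log

        def log2_factorial(n: int) -> float:
            return lgamma(n + 1) / log(2)

        counts = {"a": 500, "b": 300, "c": 200}    # toy multiset
        n = sum(counts.values())
        savings = log2_factorial(n) - sum(log2_factorial(c) for c in counts.values())
        print(f"~{savings:.1f} bits saved by discarding order among n={n} symbols")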

  • 2022/11 I wrote a short tutorial on understanding neural networks through the lens of convex optimization. Check it out here.

  • 2022/09 Our new paper Overparameterized ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery is available on arXiv, and the code is available on GitHub. We address the apparent tension between remarkable generalization and high model complexity from a convex optimization and sparse recovery perspective. Under certain regularity assumptions on the data, we show that ReLU networks with an arbitrary number of parameters learn only simple models that explain the data.

  • 2022/09 I gave a presentation on our preprint Optimal Neural Network Approximation of Wasserstein Gradient Direction via Convex Optimization at SIAM MDS22 in San Diego. Check out our slides! TL;DR: how Wasserstein gradient flow meets neural networks and convex optimization.

  • 2022/09 Our paper Beyond the Best: Distribution Functional Estimation in Infinite-Armed Bandits has been accepted at NeurIPS 2022! We provide offline and online sample complexities for estimating the mean, quantile, trimmed mean, and maximum from noisy observations. We develop a unified meta-algorithm and prove general information-theoretic lower bounds for both offline and online sampling. Check out the paper, slides, and poster.