In this note, we introduce a randomized version of the well-known elliptical potential lemma that is widely used in the analysis of algorithms in sequential learning and decision-making problems such as stochastic linear bandits.
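For reference, a standard deterministic form of the elliptical potential lemma, as commonly stated in the literature (the notation $V_t$ and the regularizer $\lambda$ are supplied here for illustration): for any sequence $x_1, \dots, x_T \in \mathbb{R}^d$ with $\|x_t\|_2 \le 1$ and $V_t = \lambda I + \sum_{s=1}^{t} x_s x_s^\top$,

$$\sum_{t=1}^{T} \min\bigl\{1, \|x_t\|_{V_{t-1}^{-1}}^2\bigr\} \;\le\; 2d \log\Bigl(1 + \frac{T}{\lambda d}\Bigr),$$

where $\|x\|_{A}^2 = x^\top A x$.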
In this paper, we consider the worst-case regret of Linear Thompson Sampling (LinTS) for the linear bandit problem. Russo and Van Roy (2014) show that the Bayesian regret of LinTS is bounded above by $\widetilde{\mathcal{O}}(d\sqrt{T})$ where $T$ is the time horizon and $d$ is the number of parameters.
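For concreteness, one common formulation of the LinTS step, given as a sketch rather than as this paper's exact algorithm (the symbols $\hat{\theta}_t$, $V_t$, $\beta_t$, and $\mathcal{X}$ are our illustrative notation): with ridge estimate $\hat{\theta}_t$ and Gram matrix $V_t$, at round $t$ the algorithm samples a parameter from an inflated Gaussian posterior and acts greedily with respect to it,

$$\tilde{\theta}_t \sim \mathcal{N}\bigl(\hat{\theta}_t,\; \beta_t^2 V_t^{-1}\bigr), \qquad x_t \in \arg\max_{x \in \mathcal{X}} \; x^\top \tilde{\theta}_t,$$

where $\mathcal{X} \subset \mathbb{R}^d$ is the action set and $\beta_t$ an inflation parameter.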
We study the structure of regret-minimizing policies in the many-armed Bayesian multi-armed bandit problem: in particular, with $k$ the number of arms and $T$ the time horizon, we consider the case where $k \geq \sqrt{T}$.
In this paper, we study the well-known stochastic linear bandit problem where a decision-maker sequentially chooses among a set of given actions in $\mathbb{R}^d$, observes the corresponding noisy linear reward, and aims to maximize her cumulative expected reward over a horizon of length $T$.
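Under this protocol, the cumulative expected regret over the horizon is the standard benchmark (the symbols $\theta^\star$, $\mathcal{X}$, $x_t$, and $\eta_t$ are our illustrative notation): the reward observed at round $t$ is $y_t = x_t^\top \theta^\star + \eta_t$ with unknown parameter $\theta^\star \in \mathbb{R}^d$ and zero-mean noise $\eta_t$, and the regret is

$$R(T) \;=\; \sum_{t=1}^{T} \Bigl( \max_{x \in \mathcal{X}} x^\top \theta^\star \;-\; \mathbb{E}\bigl[x_t^\top \theta^\star\bigr] \Bigr),$$

where $\mathcal{X}$ is the action set and $x_t$ the action chosen at round $t$.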
We consider the $k$-armed stochastic contextual bandit problem with $d$-dimensional features, when both $k$ and $d$ can be large. To the best of our knowledge, all existing algorithms for this problem have regret bounds that scale as polynomials of degree at least two in $k$ and $d$.
In this paper, we study trace regression, in which a matrix of parameters $\mathbf{B}^\star$ is estimated via a convex relaxation of rank-penalized regression or via non-convex optimization. It is known that these estimators satisfy near-optimal error bounds under assumptions on the rank, coherence, or spikiness of $\mathbf{B}^\star$.
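As a point of reference, trace regression is usually written as $y_i = \operatorname{tr}(\mathbf{X}_i^\top \mathbf{B}^\star) + \varepsilon_i$, and a common convex relaxation of rank penalization is nuclear-norm penalized least squares (a sketch of the standard formulation, not necessarily this paper's exact estimator; $n$, $\lambda$, and $\widehat{\mathbf{B}}$ are our notation):

$$\widehat{\mathbf{B}} \in \arg\min_{\mathbf{B}} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - \operatorname{tr}(\mathbf{X}_i^\top \mathbf{B}) \bigr)^2 \;+\; \lambda \|\mathbf{B}\|_*,$$

where $\|\mathbf{B}\|_*$ denotes the nuclear norm (the sum of singular values) and $\lambda > 0$ is a regularization parameter.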