Notes
Some of my research notes and longer project write-ups.
Explainers of My Work
Research note
The Logistic Bandit Problem
Why logistic bandits were not well understood theoretically, and how our recent results help explain the empirical performance of learning algorithms in this setting.
Research note
An Information-Theoretic Lens on Bandits and RL
A non-technical introduction to the information-theoretic framework behind my thesis: bandits, regret, Thompson Sampling, and the information ratio.
Topic Notes
Research note
What are Bandits, and Why Does Thompson Sampling Work?
A note on bandits, UCB, Thompson Sampling, the Russo–Van Roy framework, and what later work revealed about frequentist regret.
Project Notes
Project note
Chess-GPT: Human-Like Chess with a Fine-Tuned Language Model
How a transformer trained on human chess games learns the rules, strategy, and a distinct playing style of its own as emergent behavior.
Research Experience
Research experience
A Quarter at Stanford
A short reflection on my 2024 research visit to Stanford, where I finished one line of work and began another.