Notes

Some of my research notes and longer project write-ups.

Explainers of My Work

Research note

The Logistic Bandit Problem

Why logistic bandits were not well understood theoretically, and how our recent results help explain the empirical performance of learning algorithms in this setting.

Based on my TMLR 2026 paper

Research note

An Information-Theoretic Lens on Bandits and RL

A non-technical introduction to the information-theoretic framework behind my thesis: bandits, regret, Thompson Sampling, and the information ratio.

Overview of my thesis framework and papers

Topic Notes

Research note

What are Bandits, and Why Does Thompson Sampling Work?

A note on bandits, UCB, Thompson Sampling, the Russo–Van Roy framework, and what later work revealed about frequentist regret.

From Bayesian information ratios to Feel-Good Thompson Sampling

Project Notes

Project note

Chess-GPT: Human-Like Chess with a Fine-Tuned Language Model

How a transformer trained on human chess games learns the rules, strategy, and a distinct playing style of its own, all as emergent behavior.

Supervised bachelor thesis project at KTH

Research Experience

Research experience

A Quarter at Stanford

A short reflection on my 2024 research visit to Stanford, where I finished one line of work and began another.

Visiting Student Researcher in Benjamin Van Roy’s group