I am a researcher at KTH Royal Institute of Technology in Stockholm, working on reinforcement learning and information theory. I completed my PhD in December 2025 under the supervision of Mikael Skoglund and Tobias Oechtering, supported by the WASP Graduate School. My thesis is titled An Information-Theoretic Approach to Bandits and Reinforcement Learning.

My research focuses on decision-making under uncertainty and on establishing performance guarantees for learning algorithms, studying how efficiently agents acquire and exploit information in bandit and reinforcement-learning settings. I am particularly interested in Thompson Sampling, regret analysis, and information-directed exploration. My work combines theoretical analysis with empirical experimentation; it has been presented at ICML, ISIT, and NeurIPS, and published in TMLR.
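For readers less familiar with the area, here is a minimal sketch of Thompson Sampling on a Bernoulli bandit. This is a generic textbook illustration, not code from any of the papers below; the arm means and horizon are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-armed Bernoulli bandit; the true success
# probabilities are unknown to the learning agent.
true_means = np.array([0.3, 0.5, 0.7])
n_arms = len(true_means)

# Beta(1, 1) (uniform) prior on each arm's mean reward.
successes = np.ones(n_arms)
failures = np.ones(n_arms)

T = 2000
cum_reward = 0.0
for t in range(T):
    # Sample a plausible mean for each arm from its posterior,
    # then play the arm whose sample is largest.
    theta = rng.beta(successes, failures)
    arm = int(np.argmax(theta))
    reward = rng.binomial(1, true_means[arm])
    cum_reward += reward
    # Conjugate Bayesian update of the chosen arm's Beta posterior.
    successes[arm] += reward
    failures[arm] += 1 - reward

print(f"average reward over {T} rounds: {cum_reward / T:.3f}")
```

The regret of exactly this kind of posterior-sampling scheme is what the information-theoretic analyses below bound, by relating the per-round regret to the information the agent gains about the optimal arm.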

In 2024, I spent four months as a Visiting Student Researcher at Stanford University working with Professor Benjamin Van Roy. In Spring 2025, I did a research internship at Lynx Asset Management, developing reinforcement-learning methods for optimal trade execution.

I welcome discussions with anyone interested in reinforcement learning, information theory, or sequential decision-making. Feel free to reach out!

Publications

Amaury Gouverneur, Tobias J. Oechtering, and Mikael Skoglund,
An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandit Problems
TMLR 2026 | pdf
Resolves a conjecture on logistic bandits by removing exponential dependence on the logistic slope.

Amaury Gouverneur, Tobias J. Oechtering, and Mikael Skoglund,
Refined PAC-Bayes Bounds for Offline Bandits
ISIT 2025 | pdf
Derives state-of-the-art PAC-Bayes bounds for offline bandits via a new optimization technique.

Raghav Bongole, Amaury Gouverneur, Tobias J. Oechtering, and Mikael Skoglund,
Information-Theoretic Minimax Regret Bounds for Reinforcement Learning Problems
Submitted to ITW 2025 | pdf

Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, and Mikael Skoglund,
An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces
ICASSP 2025 | pdf
Extends information-theoretic regret analysis of Thompson Sampling to continuous action spaces.

Raghav Bongole, Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, and Mikael Skoglund,
Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality
ICASSP 2025 | pdf

Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, and Mikael Skoglund,
Chained Information-Theoretic Bounds and Tight Regret Rate for Linear Bandit Problems
ICML 2024 (FoRLaC Workshop) | arXiv | pdf
Achieves the optimal regret rate for linear bandits using a chaining-based analysis.

Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, and Mikael Skoglund,
Thompson Sampling Regret Bounds for Contextual Bandits with sub-Gaussian Rewards
ISIT 2023 | arXiv | pdf | conference pdf
Extends the Russo–Van Roy information-theoretic framework to contextual bandit problems.

Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, and Mikael Skoglund,
An Information-Theoretic Analysis of Bayesian Reinforcement Learning
Allerton 2022 | arXiv | pdf | conference pdf

Antoine Aspeel, Amaury Gouverneur, Raphaël M. Jungers, and Benoit Macq,
Optimal Intermittent Particle Filter
IEEE Transactions on Signal Processing 2022 | arXiv | pdf | journal pdf

Amaury Gouverneur,
Optimal Measurement Times for Particle Filtering and its Application in Mobile Tumor Tracking
Master Thesis 2022, supervised by Benoit Macq | dial | pdf

Antoine Aspeel, Amaury Gouverneur, Raphaël M. Jungers, and Benoit Macq,
Optimal Measurement Budget Allocation for Particle Filtering
ICIP 2020 | arXiv | pdf | conference pdf

Teaching

Project in Multimedia Processing and Analysis, EQ2445 at KTH — 2024

Machine Learning and Data Science, EQ2415 at KTH — 2024

Pattern Recognition and Machine Learning, EQ2341 at KTH — 2020–2024

Deep Neural Networks, EP232U at KTH — Spring 2022

Service

Reviewer for ICML, ICLR, ISIT, ICASSP, and EUSIPCO.

WASP cluster leader for "Mathematical Foundations of AI other than ML" (2020–2024) and "Sequential Decision-Making and Reinforcement Learning" (current).

Supervision

Bachelor theses

Master theses