Multi-armed bandit with infinitely-many arms

600 Views Asked by Bumbble Comm At 28 Mar 2026 - 11:53

Has anyone studied variants of the multi-armed bandit algorithm with infinitely many arms?

I have a collection of distributions parametrized by an integer $n$. Unfortunately, I can't analytically determine which distribution will have the highest mean, but I have a guess that, say, a value of $n$ near 10 is best. (So, it would be very surprising if it turned out that, say, 1000 were better.) I can phrase this guess as a prior distribution; is there a way to use the prior in a bandit algorithm?

Pursuit algorithms can incorporate a prior, and it looks like they should work with infinitely many arms (for example, they don't start by sampling every arm).

Has anyone studied this extension? Or a different mutli-armed bandit algorithm with infinitely many arms?

Original Q&A

There are 1 best solutions below

Bumbble Comm On 19 Nov 2015 - 7:13

This is a rather late answer, but hopefully still helpful. For bandit problems with infinitely many arms, there is some work. A good starting point might be $\chi$-Armed Bandits by Bubeck et al. You can find more references therein. Specifically, they deal with a variant where the bandit arms are in a generic measurable space, and the reward function has certain "smoothness" properties. They HOO algorithm they propose is also not too hard to implement if you want to try it out.

Multi-armed bandit with infinitely-many arms

There are 1 best solutions below

Related Questions in ALGORITHMS

Related Questions in MONTE-CARLO

Trending Questions

Popular # Hahtags

Popular Questions