Multi-armed Bandits
A demo of the k-armed bandit problem
Click a numbered button to play that machine manually
1 2 3 4 5 6 7 8 9 10
Set policy
Random
Greedy
ε-Greedy
Other controls
Reset
Toggle True Probabilities
Run the active policy for X episodes
1 10 100 1k 10k 100k
Active Policy =
Average Returns =
Total Episodes =
Save and start new run
Toggle Optimal Returns
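
The three policies listed under "Set policy" can be sketched in a few lines of Python. This is a minimal sketch, not the demo's actual code. It assumes each machine pays a reward of 1 with a fixed hidden probability and 0 otherwise (suggested by the "Toggle True Probabilities" control), that value estimates are sample averages, and that ties are broken at random. The names `run_bandit`, `probs`, `epsilon`, and `seed` are hypothetical, and the default `epsilon=0.1` is an illustrative choice.

```python
import random


def run_bandit(probs, policy="epsilon_greedy", episodes=1000, epsilon=0.1, seed=0):
    """Play a Bernoulli k-armed bandit (assumed reward model).

    Returns (average_return, estimates, counts).
    """
    if policy not in ("random", "greedy", "epsilon_greedy"):
        raise ValueError(f"unknown policy: {policy}")
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k        # number of pulls per arm
    estimates = [0.0] * k   # sample-average reward estimate per arm
    total = 0.0
    for _ in range(episodes):
        # Random always explores; epsilon-greedy explores with probability epsilon;
        # greedy never explores.
        explore = policy == "random" or (
            policy == "epsilon_greedy" and rng.random() < epsilon
        )
        if explore:
            arm = rng.randrange(k)
        else:
            # Exploit: pick an arm with the highest estimate, ties broken at random.
            best = max(estimates)
            arm = rng.choice([a for a in range(k) if estimates[a] == best])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental sample-average update: Q <- Q + (r - Q) / n
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / episodes, estimates, counts


if __name__ == "__main__":
    rng = random.Random(42)
    true_probs = [rng.random() for _ in range(10)]  # ten machines, as in the demo
    print("optimal return:", max(true_probs))
    for name in ("random", "greedy", "epsilon_greedy"):
        avg, _, _ = run_bandit(true_probs, policy=name, episodes=10_000)
        print(name, "average return:", avg)
```

Under these assumptions, the optimal return is the highest true probability among the machines, which gives a reference point for comparing each policy's average return.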