Multi-armed Bandits

A demo of the k-armed bandit problem

Click a numbered button to play that machine manually

Set policy

Other controls

Run active policy for X episodes

Active Policy =

Average Returns =

Total Episodes =
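
The controls above (set a policy, run it for a number of episodes, report the average return) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the demo's actual code: the page does not say which policies it offers or how rewards are drawn, so the sketch assumes an epsilon-greedy policy, Gaussian rewards around a hidden mean per arm, and one arm pull per episode. The names `Bandit`, `epsilon_greedy`, and `run_policy` are hypothetical.

```python
import random


class Bandit:
    """A k-armed bandit. Assumption: each arm pays a Gaussian reward
    centred on a hidden mean that is fixed when the bandit is created."""

    def __init__(self, k, rng):
        self.rng = rng
        self.means = [rng.gauss(0.0, 1.0) for _ in range(k)]

    def pull(self, arm):
        # Reward for one play of the chosen machine.
        return self.rng.gauss(self.means[arm], 1.0)


def epsilon_greedy(estimates, epsilon, rng):
    """Pick a random arm with probability epsilon, otherwise the arm
    with the highest estimated value (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    best = max(estimates)
    return rng.choice([a for a, q in enumerate(estimates) if q == best])


def run_policy(bandit, episodes, epsilon, rng):
    """Run the policy for `episodes` plays (assumed: one pull per episode).
    Returns the average return, the value estimates, and the pull counts."""
    k = len(bandit.means)
    estimates = [0.0] * k  # running mean reward observed per arm
    counts = [0] * k       # number of times each arm was played
    total = 0.0
    for _ in range(episodes):
        arm = epsilon_greedy(estimates, epsilon, rng)
        reward = bandit.pull(arm)
        counts[arm] += 1
        # Incremental sample-average update of the estimate for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    average_return = total / episodes if episodes else 0.0
    return average_return, estimates, counts


if __name__ == "__main__":
    rng = random.Random(0)
    bandit = Bandit(k=10, rng=rng)
    avg, estimates, counts = run_policy(bandit, episodes=1000, epsilon=0.1, rng=rng)
    print("Active Policy = epsilon-greedy (epsilon=0.1)")
    print("Average Returns =", round(avg, 3))
    print("Total Episodes =", sum(counts))
```

Setting `epsilon=0.0` gives a purely greedy policy and `epsilon=1.0` a purely random one, which makes it easy to compare average returns across policies in the same way the demo's status fields do.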