Beyond “To Act or Not to Act”: Fast Lagrangian Approaches to General Multi-Action Restless Bandits

Citation:

Killian JA, Perrault A, Tambe M. Beyond “To Act or Not to Act”: Fast Lagrangian Approaches to General Multi-Action Restless Bandits. IJCAI 2021 Workshop on AI for Social Good. 2021.

Abstract:

We present a new algorithm and theoretical results for solving Multi-action Multi-armed Restless Bandits, an important but insufficiently studied generalization of traditional Multi-armed Restless Bandits (MARBs). Multi-action MARBs are capable of handling critical problem complexities often present in AI4SG domains like anti-poaching and healthcare, but that traditional MARBs fail to capture. Limited previous work on Multi-action MARBs has only been specialized to sub-problems. Here we derive BLam, an algorithm for general Multi-action MARBs using Lagrangian relaxation techniques and convexity to quickly converge to good policies via bound optimization. We also provide experimental results comparing BLam to baselines on a simulated distributions motivated by a real-world community health intervention task, achieving up to five-times speedups over more general methods without sacrificing performance.