Q-Learning Lagrange Policies for Multi-Action Restless Bandits

Publication information:

Killian, J., Biswas, A., Shah, S. & Tambe, M. Q-Learning Lagrange Policies for Multi-Action Restless Bandits. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2021) (2021).