Q-Learning Lagrange Policies for Multi-Action Restless Bandits

Publication information:

Killian, J., Biswas, A., Shah, S. & Tambe, M. Q-Learning Lagrange Policies for Multi-Action Restless Bandits. in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2021) (2021).
