Public Health

2021
Killian JA, Biswas A, Shah S, Tambe M. Q-Learning Lagrange Policies for Multi-Action Restless Bandits, in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2021). ; 2021. arXiv
Biswas A, Aggarwal G, Varakantham P, Tambe M. Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare, in Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021). ; 2021 :4036-4049. Publisher's Version
Biswas A, Aggarwal G, Varakantham P, Tambe M. Learning Index Policies for Restless Bandits with Application to Maternal Healthcare, in Proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021). ; 2021. Publisher's Version
Zhang H, Dullerud N, Seyyed-Kalantari L, Morris Q, Joshi S, Ghassemi M. An empirical framework for domain generalization in clinical settings, in ACM Conference on Health, Inference, and Learning. Virtual Event, USA ; 2021.
Clinical machine learning models experience significantly degraded performance in datasets not seen during training, e.g., new hospitals or populations. Recent developments in domain generalization offer a promising solution to this problem by creating models that learn invariances across environments. In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data. We introduce a framework to induce synthetic but realistic domain shifts and sampling bias to stress-test these methods over existing nonhealthcare benchmarks. We find that current domain generalization methods do not achieve significant gains in out-of-distribution performance over empirical risk minimization on real-world medical imaging data, in line with prior work on general imaging datasets. However, a subset of realistic induced-shift scenarios in clinical time series data exhibit limited performance gains. We characterize these scenarios in detail, and recommend best practices for domain generalization in the clinical setting.
zhang_et_al._-_2021_-_an_empirical_framework_for_domain_generalization_i.pdf
Ou H-C, Chen H, Jabbari S, Tambe M. Active Screening for Recurrent Diseases: A Reinforcement Learning Approach. 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS). 2021.
Active screening is a common approach in controlling the spread of recurring infectious diseases such as tuberculosis and influenza. In this approach, health workers periodically select a subset of the population for screening. However, given the limited number of health workers, only a small subset of the population can be visited in any given time period. Given the recurrent nature of the disease and rapid spreading, the goal is to minimize the number of infections over a long time horizon. Active screening can be formalized as a sequential combinatorial optimization over the network of people and their connections. The main computational challenges in this formalization arise from i) the combinatorial nature of the problem, ii) the need for sequential planning and iii) the uncertainties in the infectiousness states of the population.
ou_et_al._-_2021_-_active_screening_for_recurrent_diseases_a_reinfor.pdf
Killian JA, Perrault A, Tambe M. Beyond “To Act or Not to Act”: Fast Lagrangian Approaches to General Multi-Action Restless Bandits. IJCAI 2021 Workshop on AI for Social Good. 2021.
We present a new algorithm and theoretical results for solving Multi-action Multi-armed Restless Bandits, an important but insufficiently studied generalization of traditional Multi-armed Restless Bandits (MARBs). Multi-action MARBs are capable of handling critical problem complexities often present in AI4SG domains like anti-poaching and healthcare, but that traditional MARBs fail to capture. Limited previous work on Multi-action MARBs has only been specialized to sub-problems. Here we derive BLam, an algorithm for general Multi-action MARBs using Lagrangian relaxation techniques and convexity to quickly converge to good policies via bound optimization. We also provide experimental results comparing BLam to baselines on simulated distributions motivated by a real-world community health intervention task, achieving up to five-times speedups over more general methods without sacrificing performance.
Biswas A, Aggarwal G, Varakantham P, Tambe M. Learning Restless Bandits in Application to Call-based Preventive Care Programs for Maternal Healthcare, in IJCAI 2021 Workshop on AI for Social Good. ; 2021.
This paper focuses on learning index-based policies in restless multi-armed bandits (RMAB) with applications to public health concerns such as maternal health. Maternal health is a very important public health concern. It refers to the health of women during their pregnancy, childbirth, and the postnatal period. Although maternal health has received significant attention [World Health Organization, 2015], the number of maternal deaths remains unacceptably high, mainly because of the delay in obtaining adequate care [Thaddeus and Maine, 1994]. Most maternal deaths can be prevented by providing timely preventive care information. However, such information is not easily accessible by underprivileged and low-income communities. For ensuring timely information, a non-profit organization, called ARMMAN [2015], carries out a free call-based program called mMitra for spreading preventive care information among pregnant women. Enrollment in this program happens through hospitals and non-government organizations. Each enrolled woman receives around 140 automated voice calls, throughout their pregnancy period and up to 12 months after childbirth. Each call equips women with critical life-saving healthcare information. This program provides support for around 80 weeks. To achieve the vision of improving the well-being of the enrolled women, it is important to ensure that they listen to most of the information sent to them via the automated calls. However, the organization observed that, for many women, their engagement (i.e., the overall time they spend listening to the automated calls) gradually decreases. One way to improve their engagement is by providing an intervention (that would involve a personal visit by a health-care worker). These interventions require the dedicated time of the health workers, which is often limited. Thus, only a small fraction of the overall enrolled women can be provided with interventions during a time period. Moreover, the extent to which the engagement improves upon intervention varies among individuals. Hence, it is important to carefully choose the beneficiaries who should be provided interventions at a particular time period. This is a challenging problem owing to multiple key reasons: (i) Engagement of the individual beneficiaries is uncertain and changes organically over time; (ii) Improvement in the engagement of a beneficiary post-intervention is uncertain; (iii) Decision making with respect to interventions (which beneficiaries should have intervention) is sequential, i.e., decisions at a step have an impact on the state of beneficiaries and decisions to be taken at the next step; (iv) The number of interventions is budgeted and is significantly smaller than the total number of beneficiaries. Due to the uncertainty, sequential nature of decision making, and weak dependency amongst patients through a budget, existing research [Lee et al., 2019; Mate et al., 2020; Bhattacharya, 2018] in health interventions has justifiably employed RMABs. However, existing research focuses on the planning problem assuming a priori knowledge of the underlying uncertainty model, which can be quite challenging to obtain. Thus, we focus on learning intervention decisions in absence of the knowledge of underlying uncertainty.
2020
Xu L, Bondi E, Fang F, Perrault A, Wang K, Tambe M. Dual-Mandate Patrols: Multi-Armed Bandits for Green Security. arXiv:2009.06560 [cs, stat]. 2020. Publisher's Version
Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect from attackers (e.g., poachers or illegal loggers). Defenders must choose how much time to spend in each region of the protected area, balancing exploration of infrequently visited regions and exploitation of known hotspots. We formulate the problem as a stochastic multi-armed bandit, where each action represents a patrol strategy, enabling us to guarantee the rate of convergence of the patrolling policy. However, a naive bandit approach would compromise short-term performance for long-term optimality, resulting in animals poached and forests destroyed. To speed up performance, we leverage smoothness in the reward function and decomposability of actions. We show a synergy between Lipschitz-continuity and decomposition as each aids the convergence of the other. In doing so, we bridge the gap between combinatorial and Lipschitz bandits, presenting a no-regret approach that tightens existing guarantees while optimizing for short-term performance. We demonstrate that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia.
aaai21_dual_mandate_patrols.pdf
Prins A, Mate A, Killian JA, Abebe R, Tambe M. Incorporating Healthcare Motivated Constraints in Restless Bandit Based Resource Allocation. NeurIPS 2020 Workshops: Challenges of Real World Reinforcement Learning, Machine Learning in Public Health (Best Lightning Paper), Machine Learning for Health (Best on Theme), Machine Learning for the Developing World. 2020. human_in_the_loop_rmab_short.pdf
Mate A*, Killian J*, Xu H, Perrault A, Tambe M. Collapsing Bandits and their Application to Public Health Interventions. Advances in Neural Information Processing Systems (NeurIPS). 2020. Publisher's Version
We propose and study Collapsing Bandits, a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process with a special structure: when an arm is played, the state is fully observed, thus “collapsing” any uncertainty, but when an arm is passive, no observation is made, thus allowing uncertainty to evolve. The goal is to keep as many arms in the “good” state as possible by planning a limited budget of actions per round. Such Collapsing Bandits are natural models for many healthcare domains in which health workers must simultaneously monitor patients and deliver interventions in a way that maximizes the health of their patient cohort. Our main contributions are as follows: (i) Building on the Whittle index technique for RMABs, we derive conditions under which the Collapsing Bandits problem is indexable. Our derivation hinges on novel conditions that characterize when the optimal policies may take the form of either “forward” or “reverse” threshold policies. (ii) We exploit the optimality of threshold policies to build fast algorithms for computing the Whittle index, including a closed form. (iii) We evaluate our algorithm on several data distributions including data from a real-world healthcare task in which a worker must monitor and deliver interventions to maximize their patients’ adherence to tuberculosis medication. Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques, while achieving similar performance.
collapsing_bandits_full_paper_camready.pdf
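The "collapsing" behaviour described in the abstract above can be illustrated with a minimal sketch. It assumes a two-state (good/bad) arm with known transition probabilities. The function names and the numeric values are hypothetical and are not taken from the paper, and the sketch does not compute a Whittle index.

```python
def passive_belief_update(belief, p_good_to_good, p_bad_to_good):
    """One passive step: no observation is made, so the belief that the
    arm is in the good state simply follows the Markov chain."""
    return belief * p_good_to_good + (1.0 - belief) * p_bad_to_good


def collapse(observed_good):
    """Playing an arm reveals its state, so the belief collapses to 0 or 1."""
    return 1.0 if observed_good else 0.0


def belief_trajectory(start_belief, p_good_to_good, p_bad_to_good, steps):
    """Beliefs over `steps` consecutive passive rounds, starting belief included."""
    beliefs = [start_belief]
    for _ in range(steps):
        beliefs.append(
            passive_belief_update(beliefs[-1], p_good_to_good, p_bad_to_good)
        )
    return beliefs


if __name__ == "__main__":
    # Hypothetical transition probabilities, chosen only for illustration.
    for b in belief_trajectory(collapse(True), 0.9, 0.3, steps=5):
        print(round(b, 4))
```

Under these assumed probabilities, the belief starts at 1 right after an observation and drifts toward the chain's stationary value while the arm stays passive, which is the uncertainty that the next play removes again.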
Perrault A, Fang F, Sinha A, Tambe M. AI for Social Impact: Learning and Planning in the Data-to-Deployment Pipeline. AI Magazine. 2020.
With the maturing of AI and multiagent systems research, we have a tremendous opportunity to direct these advances towards addressing complex societal problems. In pursuit of this goal of AI for Social Impact, we as AI researchers must go beyond improvements in computational methodology; it is important to step out in the field to demonstrate social impact. To this end, we focus on the problems of public safety and security, wildlife conservation, and public health in low-resource communities, and present research advances in multiagent systems to address one key cross-cutting challenge: how to effectively deploy our limited intervention resources in these problem domains. We present case studies from our deployments around the world as well as lessons learned that we hope are of use to researchers who are interested in AI for Social Impact. In pushing this research agenda, we believe AI can indeed play an important role in fighting social injustice and improving society.
2001.00088.pdf
Sharma A, Killian J, Perrault A. Optimization of the Low-Carbon Energy Transition Under Static and Adaptive Carbon Taxes via Markov Decision Processes, in AI for Social Good Workshop. ; 2020.

Many economists argue that a national carbon tax would be the most effective policy for incentivizing the development of low-carbon energy technologies. Yet existing models that measure the effects of a carbon tax only consider carbon taxes with fixed schedules. We propose a simple energy system transition model based on a finite-horizon Markov Decision Process (MDP) and use it to compare the carbon emissions reductions achieved by static versus adaptive carbon taxes. We find that in most cases, adaptive taxes achieve equivalent if not lower emissions trajectories while reducing the cost burden imposed by the carbon tax. However, the MDP optimization in our model adapted optimal policies to take advantage of the expected carbon tax adjustment, which sometimes resulted in the simulation missing its emissions targets.

Zhang A, Perrault A. Influence Maximization and Equilibrium Strategies in Election Network Games, in AI for Social Good Workshop. ; 2020.

Social media has become an increasingly important political domain in recent years, especially for campaign advertising. In this work, we develop a linear model of advertising influence maximization in two-candidate elections from the viewpoint of a fully-informed social network platform, using several variations on classical DeGroot dynamics to model different features of electoral opinion formation. We consider two types of candidate objectives, margin of victory (maximizing total votes earned) and probability of victory (maximizing probability of earning the majority), and show key theoretical differences in the corresponding games, including advertising strategies for arbitrarily large networks and the existence of pure Nash equilibria. Finally, we contribute efficient algorithms for computing mixed equilibria in the margin of victory case as well as influence-maximizing best-response algorithms in both cases and show that in practice, as implemented on the Adolescent Health Dataset, they contribute to campaign equality by minimizing the advantage of the higher-spending candidate.
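For readers unfamiliar with the classical DeGroot dynamics mentioned in the abstract above, the following is a minimal sketch of the textbook update only: each node replaces its opinion with a weighted average of its neighbours' opinions. The paper's variations and its advertising model are not reproduced here. The function names and the example weight matrix are hypothetical.

```python
def degroot_step(weights, opinions):
    """One classical DeGroot update: each node's new opinion is the weighted
    average of current opinions, using that node's row of trust weights.
    Each row of `weights` is assumed to sum to 1."""
    return [sum(w * x for w, x in zip(row, opinions)) for row in weights]


def degroot_run(weights, opinions, steps):
    """Apply the DeGroot update `steps` times and return the final opinions."""
    for _ in range(steps):
        opinions = degroot_step(weights, opinions)
    return opinions


if __name__ == "__main__":
    # Hypothetical three-node network; rows are trust weights summing to 1.
    weights = [
        [0.6, 0.4, 0.0],
        [0.2, 0.6, 0.2],
        [0.0, 0.4, 0.6],
    ]
    # Opinions in [0, 1], e.g. support for one of two candidates.
    print(degroot_run(weights, [1.0, 0.5, 0.0], steps=20))
```

Because the update is linear in the opinion vector, repeated steps amount to repeated multiplication by the weight matrix.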

Wilder B, Charpignon M, Killian J, Ou H-C, Mate A, Jabbari S, Perrault A, Angel Desai M. Inferring between-population differences in COVID-19 dynamics, in AI for Social Good Workshop. ; 2020.

As the COVID-19 pandemic continues, formulating targeted policy interventions supported by differential SARS-CoV2 transmission dynamics will be of vital importance to national and regional governments. We develop an individual-level model for SARS-CoV2 transmission that accounts for location-dependent distributions of age, household structure, and comorbidities. We use these distributions together with age-stratified contact matrices to instantiate specific models for Hubei, China; Lombardy, Italy; and New York, United States. We then develop a Bayesian inference framework which leverages data on reported deaths to obtain a posterior distribution over unknown parameters and infer differences in the progression of the epidemic in the three locations. These findings highlight the role of between-population variation in formulating policy interventions.

Wilder B, Charpignon M, Killian JA, Ou H-C, Mate A, Jabbari S, Perrault A, Desai A, Tambe M, Majumder MS. The Role Of Age Distribution And Family Structure On Covid-19 Dynamics: A Preliminary Modeling Assessment For Hubei And Lombardy, in SSRN. ; 2020.

Background: The COVID-19 outbreak has already caused significant mortality worldwide. As the epidemic accelerates, understanding the transmission dynamics of COVID-19 is crucial to informing national and regional policies. We develop an individual-level model for SARS-CoV2 transmission which accounts for location-dependent distributions of age and household structure. We apply our model to Hubei, China and Lombardy, Italy to analyze the impact of demographic structure on estimates for key parameters such as the rate of documentation and the reproduction number r0 for COVID-19 cases. We also assess the effectiveness of potential policies ranging from physical distancing to sheltering in place in Lombardy.

Methods: Our study develops a stochastic, agent-based model for SARS-CoV2 spread. A key feature of the model is the inclusion of population-specific demographic structure, such as the distributions of age, household structure, contact across age groups, and comorbidities. We use prior estimates of these demographic features to instantiate our model for two locations: Hubei, China and Lombardy, Italy. Furthermore, we utilize the data on the number of reported deaths due to COVID-19 in both locations to estimate parameters describing location-specific variation in the transmissibility and fatality of the disease (for reasons beyond demography). The range of the parameters in our model that are consistent with reported data are used to construct plausible ranges for r0 and the rate of documentation in each location. Finally, we analyze potential policy responses in the context of Lombardy. Our analysis traces out the trade-off between adoption of physical distancing across the entire population and policies that encourage members of a specific age group to shelter at home.

Results: Our estimates for r0 are comparable to the rest of the literature, with a range of 2.11–2.27 for Hubei and 2.50-3.20 for Lombardy, suggesting higher rates of transmission in the latter. Scenarios where the case fatality rates are higher in Lombardy than Hubei by a factor of 1-5 times appear plausible given the data (even after accounting for differences in age and comorbidity distributions). We estimate the rate at which symptomatic cases are documented to be at 10.3-19.2% in Hubei and 1.2-8% in Lombardy, indicating that the number of undocumented cases may be even higher than has previously been estimated. Evaluation of potential policies suggests that encouraging a single age group to shelter in place is insufficient to control the epidemic by itself, but that targeted "salutary sheltering" by even 50% of a single age group has a substantial impact when combined with adoption of physical distancing by the rest of the population.

covid_19_family_structure_8.pdf
Mate A, Killian JA, Wilder B, Charpignon M, Awasthi A, Tambe M, Majumder MS. Evaluating COVID-19 Lockdown Policies For India: A Preliminary Modeling Assessment for Individual States, in SSRN. ; 2020.

Background: On March 24, India ordered a 3-week nationwide lockdown in an effort to control the spread of COVID-19. While the lockdown has been effective, our model suggests that completely ending the lockdown after three weeks could have considerable adverse public health ramifications. We extend our individual-level model for COVID-19 transmission [1] to study the disease dynamics in India at the state level for Maharashtra and Uttar Pradesh to estimate the effect of further lockdown policies in each region. Specifically, we test policies which alternate between total lockdown and simple physical distancing to find "middle ground" policies that can provide social and economic relief as well as salutary population-level health effects.

Methods: We use an agent-based SEIR model that uses population-specific age distribution, household structure, contact patterns, and comorbidity rates to perform tailored simulations for each region. The model is first calibrated to each region using publicly available COVID-19 death data, then implemented to simulate a range of policies. We also compute the basic reproduction number R0 and case documentation rate for both regions.

Results: After the initial lockdown, our simulations demonstrate that even policies that enforce strict physical distancing while returning to normal activity could lead to widespread outbreaks in both states. However, "middle ground" policies that alternate weekly between total lockdown and physical distancing may lead to much lower rates of infection while simultaneously permitting some return to normalcy.

ssrn-covid_lockdownpolicies_india.pdf
Ou H-C, Sinha A, Suen S-C, Perrault A, Raval A, Tambe M. Who and When to Screen: Multi-Round Active Screening for Network Recurrent Infectious Diseases Under Uncertainty, in International Conference on Autonomous Agents and Multiagent Systems (AAMAS-20). ; 2020.
Controlling recurrent infectious diseases is a vital yet complicated problem in global health. During the long period of time from patients becoming infected to finally seeking treatment, their close contacts are exposed and vulnerable to the disease they carry. Active screening (or case finding) methods seek to actively discover undiagnosed cases by screening contacts of known infected people to reduce the spread of the disease. Existing active screening practice often screens all contacts of an infected person, requiring a large budget. In cooperation with a research institute in India, we develop a model of the active screening problem and present a software agent, REMEDY. This agent assists in maximizing the effectiveness of active screening under real-world budgetary constraints and limited contact information. Our contributions are: (1) A new approach to modeling multi-round network-based screening/contact tracing under uncertainty and proof of its NP-hardness; (2) Two novel algorithms, Full- and Fast-REMEDY. Full-REMEDY considers the effect of future actions and provides high solution quality, whereas Fast-REMEDY scales linearly in the size of the network; (3) Evaluation of Full- and Fast-REMEDY on several real-world datasets which emulate human contact to show that they control diseases better than the baselines. We also show that the software agent is robust to errors in estimates of disease parameters and to incomplete information about the contact network. Our software agent is currently under review before deployment as a means to improve the efficiency of district-wise active screening for tuberculosis in India.
who_and_when_to_screen.pdf