Applications of artificial intelligence for wildlife protection have focused on learning models of poacher behavior based on historical patterns. However, poachers’ behaviors are described not only by their historical preferences, but also by their reaction to ranger patrols. Past work applying machine learning and game theory to combat poaching has hypothesized that ranger patrols deter poachers, but has been unable to find evidence to identify how or even if deterrence occurs. Here, for the first time, we demonstrate a measurable deterrence effect on real-world poaching data. We show that increased patrols in one region deter poaching in the next timestep, but poachers then move to neighboring regions. Our findings offer guidance on how adversaries should be modeled in realistic game-theoretic settings.
An ongoing challenge in machine learning is to improve the transparency of learning models, helping end users to build trust and defend fairness and equality while protecting individual privacy and information assets. Transparency is a timely topic given the increasing application of machine learning techniques in the real world, and yet much more progress is needed in addressing the transparency issues. We propose critical research questions on transparency-aware machine learning on two fronts: know how and know that. Know-how is concerned with searching for a set of decision objects (e.g. functions, rules, lists, and graphs) that are cognitively fluent for humans to apply and consistent with the original complex model, while know-that is concerned with gaining more in-depth understanding of the internal justification of the decisions through external constraints on accuracy, consistency, privacy, reliability, and fairness.
During the COVID-19 pandemic, committees have been appointed to make ethically difficult triage decisions, which are complicated by the diversity of stakeholder interests involved. We propose a disciplined, automated approach to support such difficult collective decision-making. Our system aims to recommend a policy to the group that strikes a compromise between potentially conflicting individual preferences. To identify a policy that best aggregates individual preferences, our system first elicits individual stakeholder value judgements by asking a moderate number of strategically selected queries, each taking the form of a pairwise comparison posed to a specific stakeholder. We propose a novel formulation of this problem that selects which queries to ask which individuals to best inform the downstream recommendation problem. Modeling this as a multi-stage robust optimization problem, we show that we can equivalently reformulate this as a mixed-integer linear program which can be solved with off-the-shelf solvers. We evaluate the performance of our approach on the problem of recommending policies for allocating critical care beds to patients with COVID-19. We show that asking questions intelligently allows the system to recommend a policy with a much lower regret than asking questions randomly. The lower regret suggests that the system is suited to help a committee reach a better decision by suggesting a policy that aligns with stakeholder value judgments.
The Google Trends data of some keywords have strong correlations with COVID-19 hospitalizations. We attempt to use these correlations and show an experimental procedure using a simple LSTM model to nowcast hospitalization peaks using Google Trends data. Experiments are done on French regions and on Belgium. This is preliminary work that would need to be tested during a (hopefully non-existing) second peak.
Social media has quickly grown into an essential tool for people to communicate and express their needs during crisis events. Prior work in analyzing social media data for crisis management has focused primarily on automatically identifying actionable (or, informative) crisis-related messages. In this work, we show that recent advances in Deep Learning and Natural Language Processing outperform prior approaches for the task of classifying informativeness and encourage the field to adopt them for their research or even deployment. We also extend these methods to two sub-tasks of informativeness and find that the Deep Learning methods are effective here as well.
In health care organizations, a patient’s privacy is threatened by the misuse of their electronic health record (EHR). To monitor privacy intrusions, logging systems are often deployed to trigger alerts whenever a suspicious access is detected. However, such mechanisms are insufficient in the face of small budgets, strategic attackers, and large false positive rates. In an attempt to resolve these problems, EHR systems are increasingly incorporating signaling, so that whenever a suspicious access request occurs, the system can, in real time, warn the user that the access may be audited. This gives rise to an online problem in which one needs to determine 1) whether a warning should be triggered and 2) the likelihood that the data request will be audited later. In this paper, we formalize this auditing problem as a Signaling Audit Game (SAG). A series of experiments with 10 million real access events (containing over 26K alerts) from Vanderbilt University Medical Center (VUMC) demonstrate that a strategic presentation of warnings adds value in that SAGs realize significantly higher utility for the auditor than systems without signaling.
Discharge summaries are essential for the transition of patients’ care but often lack sufficient information. We present an attention-based model to generate discharge summaries to support communication during the transition of care from intensive care units (ICU) to community care. We trained and evaluated our approach over 500,000 clinical progress notes. The summaries automatically generated by our model achieve a ROUGE-L of 0.83 when compared with discharge summaries written by health professionals. We attribute the high performance to our three-step pipeline that incorporates disease and specialist contexts to enrich the summaries with relevant information based on the context of the hospital stay. Additionally, we present a novel visualization of ICU flow of care using MIMIC-III. Our promising results have the potential to improve the pipeline of hospital discharge and continuous health care.
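To make the ROUGE-L figure concrete, the following is a minimal sketch of a sentence-level ROUGE-L score based on the longest common subsequence (LCS) of tokens. The abstract does not say which ROUGE-L variant or tokenizer was used, so the F1 form, the lowercasing, and the whitespace tokenization here are assumptions, and the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # Extend the diagonal on a match, otherwise carry the best neighbor.
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Sentence-level ROUGE-L F1 (assumed variant) over whitespace tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A generated summary would be passed as `candidate` and the clinician-written summary as `reference`; identical texts score 1.0.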
Research shows that providing an appliance-wise energy breakdown can help users save up to 15% of their energy bills. Non-intrusive load monitoring (NILM), or energy disaggregation, is the task of estimating the energy consumed by each constituent appliance in a household from the energy measured at the aggregate level. The problem was first introduced in the 1980s by Hart. Over the past three decades, NILM has been extensively researched. NILMTK was introduced to the NILM community in 2014 to encourage reproducible research. Even after the introduction of NILMTK, there has been little contribution of recent state-of-the-art algorithms back to the toolkit. In this paper, we propose a new disaggregation API, which further simplifies the rapid comparison of different state-of-the-art algorithms across a wide range of datasets. We also propose a rewritten interface, similar to that of scikit-learn, for writing new disaggregation algorithms for NILMTK. We demonstrate the power of the new API by conducting various complex experiments with it.
COVID-19 Prevention, which combines the soft approaches and best practices for public health safety, is the only recommended solution from the health science and management society side considering the pandemic era. This process must be promoted via facilitation support to collective urban awareness programs through public dialogue and collective intelligence. Moreover, support must be provided throughout the process to perform complex public deliberation to find issues and ideas within existing approaches that can result in better approaches towards prevention. In an attempt to evaluate the validity of such claims in a conflict and COVID-19-affected country like Afghanistan, we conducted a large-scale digital social experiment using conversational AI and social platforms from an info-epidemiology and an info-veillance perspective. This served as a means to uncover an underlying truth, give large-scale facilitation support, extend the soft impact of discussion to multiple sites, collect, diverge, converge and evaluate a large number of opinions and concerns from health experts, patients and local people, deliberate on the data collected and explore collective prevention approaches of COVID-19. Finally, this paper shows that deciding on a prevention measure that maximizes the probability of finding the ground truth is intrinsically difficult without utilizing the support of AI-enabled discussion systems.
We propose a multi-armed bandit setting where each arm corresponds to a subpopulation, and pulling an arm is equivalent to granting an opportunity to this subpopulation. In this setting the decision-maker’s fairness policy governs the number of opportunities each subpopulation should receive, which typically depends on the (unknown) reward from granting an opportunity to this subpopulation. The decision-maker can decide whether to provide these opportunities or pay a predefined monetary value for every withheld opportunity. The decision-maker’s objective is to maximize her utility, which is the sum of rewards minus the cost of withheld opportunities. We provide a no-regret algorithm that maximizes the decision-maker’s utility and complement our analysis with an almost-tight lower bound. The full version of the paper is available at https://tinyurl.com/y7s9avud.
Monitoring the effectiveness of policy interventions that promote sustainable farming practices has always been a costly affair. It requires an extensive ground presence, which is not always available or reliable. In this paper we present our work so far in the application of deep learning techniques to automate the identification of individual parcels (farms). Our study area is located in the central state of Madhya Pradesh in India, where the average landholding size is around 0.6 hectares per farmer. We created a methodology that uses CNN models for segmentation and the Canny edge detector for generating contours. Our future work concentrates on improving the quality of the reference data and applying additional post-processing methods. Overall, we demonstrate how deep learning could be used for providing specific agronomic advice to individual farmers across large areas and the monitoring thereof, something which is essential in mitigating the effects of climate change.
Globally increasing migration pressures call for new modelling approaches in order to design effective policies. It is important not only to have efficient models to predict migration flows but also to understand how specific parameters influence these flows. In this paper, we propose an artificial neural network (ANN) to model international migration. Moreover, we use a technique for interpreting machine learning models, namely Partial Dependence Plots (PDP), to show that one can effectively study the effects of the drivers behind international migration. We train and evaluate the model on a dataset containing annual international bilateral migration from 1960 to 2010 from 175 origin countries to 33 mainly OECD destinations, along with the main determinants as identified in the migration literature. The experiments carried out confirm that: 1) the ANN model is more efficient than a traditional model, and 2) using PDP we are able to gain additional insights on the specific effects of the migration drivers. This approach provides much more information than the feature importance information alone, as used in previous works.
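As an illustration of what a partial dependence curve measures, here is a minimal sketch: for each grid value, one feature is forced to that value in every row and the model's predictions are averaged. The `toy_model` function is a hypothetical stand-in for a trained model (the abstract's ANN is not reproduced here), and the data are made up for the example.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)       # copy so the input data is not mutated
            modified[feature] = value  # overwrite only the feature under study
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Hypothetical stand-in for a trained model: any callable row -> number works.
def toy_model(row):
    return 2.0 * row[0] + 3.0 * row[1]


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
pd_curve = partial_dependence(toy_model, rows, feature=0, grid=[0.0, 1.0, 2.0])
```

Plotting `pd_curve` against the grid gives the partial dependence plot for that feature; for the linear toy model the curve is a straight line with slope 2.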
The main idea of the paper is that convolutional neural networks can be applied to very high-resolution satellite imagery in order to classify New Delhi into formal (planned colony) vs. informal settlements (Jhuggi Jhopri Clusters). We show that very high-resolution satellite imagery along with convolutional neural networks can achieve a high classification accuracy of 95.81%. We find that pretrained deep learning models for computer vision trained on standard image datasets can be effective for classification of informal settlements using satellite imagery, even when there is not a significant amount of training data. Deep learning models can learn image features without hand-crafted features and, when coupled with the proliferation of cloud-based computer vision services, could democratize the analysis of satellite imagery for humanitarian and developmental purposes.
In this paper, we collect and study Twitter communications to understand the socio-economic impact of COVID-19 in the United States during the early days of the pandemic. With infections soaring rapidly, users took to Twitter asking people to self-isolate and quarantine themselves. Users also demanded closure of schools, bars, and restaurants as well as lockdown of cities and states. The communications reveal the ensuing panic buying and the unavailability of some essential goods, in particular toilet paper. We also observe users expressing their frustration in their communications as the virus continued to spread. We methodically collect a total of 530,206 tweets by identifying and tracking trending COVID-related hashtags. We then group the hashtags into six main categories, namely 1) General COVID, 2) Quarantine, 3) Panic Buying, 4) School Closures, 5) Lockdowns, and 6) Frustration and Hope, and study the temporal evolution of tweets in these hashtags. We conduct a linguistic analysis of words common to all the hashtag groups and specific to each hashtag group. Our preliminary study presents a succinct and aggregated picture of people’s response to the pandemic and lays the groundwork for future fine-grained linguistic and behavioral analysis.
Accurately predicting the number of new COVID-19 cases is critical to understanding and controlling the spread of the disease as well as effectively managing scarce resources (e.g., hospital beds, ventilators). To this end, we design a regression-based ensemble learning model comprising linear regression, Ridge, Lasso, ARIMA, and SVR that takes the previous 14 days’ data into account to predict the number of new COVID-19 cases in the short term. The ensemble model outputs the best performance by taking into account the performance of all the models. We consider data from the top 50 countries around the world that have the highest number of confirmed cases between January 21, 2020 and April 30, 2020. Our results in terms of relative percentage error show that the ensemble method provides superior prediction performance for a vast majority of these countries, with less than 10% error for 5 countries and less than 40% error for 27 countries.
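The 14-day lookback and the ensemble step can be sketched as follows. The abstract does not specify how the individual forecasts are combined or exactly how relative percentage error is defined, so the inverse-error weighting and the error formula below are assumptions for illustration, and all function names are hypothetical. The base models (linear regression, Ridge, Lasso, ARIMA, SVR) are not implemented here; their forecasts are taken as inputs.

```python
def make_windows(series, window=14):
    """Turn a daily series into (previous `window` values, next value) pairs."""
    features, targets = [], []
    for end in range(window, len(series)):
        features.append(series[end - window:end])
        targets.append(series[end])
    return features, targets


def inverse_error_weights(validation_errors):
    """Assumed combination rule: weight each model by 1 / validation error."""
    inverse = [1.0 / max(err, 1e-12) for err in validation_errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def ensemble_predict(predictions, weights):
    """Weighted average of the individual models' forecasts."""
    return sum(p * w for p, w in zip(predictions, weights))


def relative_percentage_error(predicted, actual):
    """Assumed definition: absolute error as a percentage of the true value."""
    return abs(predicted - actual) / abs(actual) * 100.0
```

Under this weighting, a model with one third of another's validation error receives three times its weight, so the ensemble leans toward whichever base models have recently been most accurate for a given country.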
The COVID-19 virus has led to a world-wide crisis that requires governments and stakeholders to take far-reaching decisions with limited knowledge of their consequences. This paper presents the ASSOCC model as a valuable decision-support tool for anticipating the consequences of possible measures by considering many interwoven aspects at the individual, group and societal level. Moreover, this paper illustrates how this model can be applied to study the effects of different testing strategies on the spread of the virus and the healthcare system. We found that excluding age groups from random testing was ineffective, while prioritizing testing of healthcare and education workers was effective, in combination with isolating the household of an infected person.
A neglected dualism is occurring in AI for Social Good involving the lack of encompassing both the role of artificial moral agency and artificial legal reasoning in advanced AI systems. Efforts by AI researchers and AI developers have tended to focus on how to craft and embed artificial moral agents to guide moral decision making when an AI system is operating in the field but have not also focused on and coupled the use of artificial legal reasoning capabilities, which is equally necessary for robust moral and legal outcomes. This paper addresses this problematic neglect and offers insights to overcome a substantive prevailing weakness and vulnerability.
Our work here is motivated by a problem faced by our local food bank (Food Bank for the Southern Tier of New York (FBST)) in operating their mobile food pantry program. Every day, FBST uses a truck to deliver food supplies directly to distribution sites (soup kitchens/pantries/etc.). When the truck arrives at a site, the operator observes the demand there and chooses how much to allocate before moving to the next site. The number of people assembling at each site changes from day to day, and the operator typically does not know the demand of later sites (but has a sense of the demand distribution based on previous visits). Finally, the amount of food in the truck is usually insufficient to meet the total demand, and so the operator must under-allocate at each site, while trying to be fair across all sites. The question is: What is a fair allocation here, and how can it be computed? In offline problems, where demands (more generally, utility functions) for all agents are known to the principal, there are many well-studied notions of fair allocation of limited resources. A relevant notion in our context is that a fair allocation is one satisfying two desiderata: Pareto-efficiency (for any agent to benefit, another must be hurt) and envy-freeness (no agent prefers an allocation received by another). This definition draws its importance from the fact that in many allocation settings, it is known both to be achievable and to encompass other natural desiderata (in particular, proportionality, wherein each agent’s utility is at least that achieved under equal allocation). In particular, when goods are divisible, then for a large class of utility functions, an allocation satisfying both is easily computed (via a convex optimization program) by maximizing the Nash Social Welfare (NSW) objective subject to allocation constraints.
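The desiderata named above can be written down directly for divisible goods. The sketch below assumes linear utilities (a per-unit value for each good times the amount received), which is only one member of the class of utility functions the text refers to; the function names are hypothetical, and the code checks a given allocation rather than solving the convex program that maximizes NSW.

```python
import math


def utility(values, bundle):
    """Linear utility: value per unit of each good times the amount received."""
    return sum(v * x for v, x in zip(values, bundle))


def is_envy_free(valuations, allocation, tol=1e-9):
    """True if no agent values another agent's bundle above its own."""
    for i, values in enumerate(valuations):
        own = utility(values, allocation[i])
        if any(utility(values, other) > own + tol for other in allocation):
            return False
    return True


def is_proportional(valuations, allocation, supply, tol=1e-9):
    """True if each agent gets at least its utility under an equal split."""
    n = len(valuations)
    equal_split = [s / n for s in supply]
    return all(
        utility(values, allocation[i]) + tol >= utility(values, equal_split)
        for i, values in enumerate(valuations)
    )


def log_nash_welfare(valuations, allocation):
    """Log of the NSW objective: the sum of log utilities (utilities must be > 0)."""
    return sum(math.log(utility(v, a)) for v, a in zip(valuations, allocation))
```

For example, with two agents valuing two unit-supply goods at `[2, 1]` and `[1, 2]`, giving each agent the good it values more passes both checks, while swapping the two bundles fails both.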
Many settings, much like the FBST operating their mobile food pantry, have principals make decisions online, with incomplete knowledge of the demands of agents to come. However, these principals have access to historical data allowing them to generate demand histograms for each agent. Designing allocation algorithms in this setting necessitates utilizing the Bayesian information of the demand distribution to ensure equitable access to the resource, while adapting to the online realization of demands as it unfolds. Guaranteeing Pareto-efficiency and envy-freeness simultaneously is impossible in this setting. However, it is important to develop algorithms that use the distributional knowledge to achieve a probabilistic version of fairness, that is, algorithms that are approximately fair.
In this paper, we present new results on the fair and efficient allocation of indivisible goods to agents whose preferences correspond to matroid rank functions. This is a versatile valuation class, with several desirable properties (monotonicity, submodularity) which naturally models several real-world domains. We use these properties to our advantage: first, we show that when agent valuations are matroid rank functions, a socially optimal (i.e. utilitarian social welfare-maximizing) allocation that achieves envy-freeness up to one item (EF1) exists and is computationally tractable. We also prove that the Nash welfare-maximizing and the leximin allocations both exhibit this fairness/efficiency combination, by showing that they can be achieved by minimizing any symmetric strictly convex function of agents’ valuations over utilitarian optimal outcomes. Moreover, for a subclass of these valuation functions based on maximum (unweighted) bipartite matching, we show that a leximin allocation can be computed in polynomial time.
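The EF1 condition can be checked mechanically for any set-valued valuation. The sketch below is an illustration of the definition only, not the paper's allocation algorithm: an allocation is EF1 if, whenever agent i envies agent j, the envy disappears after removing some single item from j's bundle. The helper `capped_liked_items` is a hypothetical example of a matroid rank function (it counts liked items up to a cap).

```python
def is_ef1(valuations, bundles):
    """Envy-freeness up to one item for set-valued valuation functions."""
    for i, value in enumerate(valuations):
        own = value(bundles[i])
        for j, other in enumerate(bundles):
            if i == j or value(other) <= own:
                continue
            # Envy exists: it must vanish after removing some single item.
            if not any(value(other - {item}) <= own for item in other):
                return False
    return True


def capped_liked_items(liked, cap):
    """Example matroid rank function: count liked items, up to `cap`."""
    liked = frozenset(liked)
    return lambda bundle: min(len(bundle & liked), cap)
```

With two agents who both like items a, b and c with a cap of 2, giving one agent all three items is not EF1, whereas splitting them as {a} and {b, c} is.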
The analysis of satellite imagery will prove a crucial tool in the pursuit of sustainable development. While Convolutional Neural Networks (CNNs) have made large gains in natural image analysis, their application to multi-spectral satellite images (wherein input images have a large number of channels) remains relatively unexplored. In this paper, we compare different methods of leveraging multi-band information with CNNs, demonstrating the performance of all compared methods on the task of semantic segmentation of agricultural vegetation (vineyards). We show that the standard industry practice of using bands selected by a domain expert leads to a significantly worse test accuracy than the other methods compared. Specifically, we compare: using bands specified by an expert; using all available bands; learning attention maps over the input bands; and leveraging Bayesian optimisation to dictate band choice. We show that simply using all available band information already increases test-time performance, and show that Bayesian optimisation, novelly applied to band selection in this work, can be used to further boost accuracy.