Challenges of Differentially Private Prediction in Healthcare Settings

Citation:

Suriyakumar VM, Papernot N, Goldenberg A, Ghassemi M. Challenges of Differentially Private Prediction in Healthcare Settings, in IJCAI 2021 Workshop on AI for Social Good; 2021.

Abstract:

Privacy-preserving machine learning is becoming increasingly important as models are being used on sensitive data such as electronic health records. Differential privacy is considered the gold standard framework for achieving strong privacy guarantees in machine learning. Yet, the performance implications of learning with differential privacy have not been characterized in the presence of time-varying hospital policies, care practices, and known class imbalance present in health data. First, we demonstrate that due to the long-tailed nature of healthcare data, learning with differential privacy results in poor utility tradeoffs. Second, we demonstrate through an application of influence functions that learning with differential privacy leads to disproportionate influence from the majority group on model predictions, which results in negative consequences for utility and fairness. Our results highlight important implications of differentially private learning, which focuses by design on learning the body of a distribution to protect privacy but omits important information contained in the tails of healthcare data distributions.