Utility-Cost of Provable Privacy: A Case Study on US Census Data
Abstract: Privacy is an important constraint that algorithms must satisfy when analyzing sensitive data from individuals. Differential privacy has revolutionized the way we reason about privacy, and has championed the need for data analysis algorithms with provable privacy guarantees. Differential privacy and its variants have emerged as the gold standard for exploring the tradeoff between the privacy ensured to individuals and the utility of the statistical insights mined from the data, and are in use by many commercial entities (e.g., Google and Apple) and government entities (e.g., the US Census Bureau) for collecting and sharing sensitive user data. In today's talk I will highlight key challenges in designing differentially private algorithms for emerging applications, and present research from our group that tries to address these challenges. In particular, I will describe our recent work on modernizing the data publication process for a US Census Bureau data product called LODES/OnTheMap. In this work, we identified legal statutes and their current interpretations that regulate the publication of LODES/OnTheMap data, formulated these regulations mathematically, and designed algorithms for releasing tabular summaries that provably satisfy these privacy requirements. Our solutions are able to release summaries of the data with error comparable to, or even lower than, that of current releases (which are not provably private), for reasonable settings of the privacy parameters.
Bio: Ashwin Machanavajjhala is an Assistant Professor in the Department of Computer Science at Duke University. Previously, he was a Senior Research Scientist in the Knowledge Management group at Yahoo! Research. His primary research interests lie in algorithms for ensuring privacy in statistical databases and augmented reality applications. He received the National Science Foundation Faculty Early Career Development (CAREER) Award in 2013 and the ACM SIGMOD Jim Gray Dissertation Award Honorable Mention in 2008. Ashwin holds a Ph.D. from the Department of Computer Science at Cornell University and a B.Tech in Computer Science and Engineering from the Indian Institute of Technology, Madras.