George Kellaris on the Fraught Line between Usefulness and Privacy

I meet George Kellaris at Hi-Rise Bakery, where he waves off my attempts to pay for coffee. Kellaris is a second-year postdoc who divides his time between Harvard’s Center for Research on Computation and Society and the Computer Science program at Boston University. As an undergraduate, Kellaris chose computer science over his other passion – psychology. But he did not forsake the social sciences entirely. Indeed, Kellaris is a fine example of CRCS’ central mission – to do computational science research not for its own sake, but for the good of society. He decided to work on privacy issues because, as he says, “The UN recognizes privacy in its Universal Declaration of Human Rights. It’s a basic thing. And more people in the public and private sector are developing awareness of privacy issues. There’s a demand for knowledge there.” I ask him to help me understand his research, and his brow furrows. Throughout our conversation, he is deliberate and thoughtful, saying, “You don’t really know something unless you can explain it to people with other backgrounds, right?” My background is dramatically different. Kellaris manages, without being patronizing, to explain his work in terms that I can understand.

He describes how an increasing amount of data about people’s lives is being accumulated and used for research, putting the basic human right to privacy at risk. His research sits squarely at the intersection of two forces: the usefulness of research data and the privacy of its subjects. He hopes to help strike a balance between them – asking not whether data should be collected and used, but how it can be done ethically.

He goes on to explain the three research projects in which he is presently involved. The first is his work as a member of CRCS’ Privacy Tools Project. Dataverse, a platform run by Harvard’s Institute for Quantitative Social Science to let scientists share data for research purposes, is not as private as it needs to be. Its use raises the question of how to conduct data analysis without violating individual privacy. As part of the Privacy Tools team, Kellaris has helped design applications and algorithms that accomplish this goal using a tool called differential privacy – a rigorous mathematical definition of privacy under which queries to a statistical database are answered as accurately as possible while the chance of identifying any individual record is kept provably small. This tool will interface with Dataverse to protect the privacy of its subjects.
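To give a flavor of how differential privacy works in practice – this is a generic textbook sketch of the standard Laplace mechanism, not the Privacy Tools Project’s actual code, and the dataset and parameters are hypothetical – consider a simple counting query:

```python
import numpy as np

def private_count(records, predicate, epsilon):
    """Answer a counting query with epsilon-differential privacy.

    A count has sensitivity 1 (adding or removing one person changes
    it by at most 1), so Laplace noise with scale 1/epsilon suffices.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical survey: how many respondents are over 40?
ages = [23, 45, 31, 67, 52, 38, 29, 71]
print(private_count(ages, lambda age: age > 40, epsilon=0.5))
# Prints a value near the true answer (4); a smaller epsilon means
# more noise, i.e., stronger privacy at the cost of accuracy.
```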

His second project, also through CRCS and funded by a grant from the National Science Foundation, addresses the needs of companies that must outsource their data because they lack the resources to maintain it themselves. When such companies gather data from their clients, they are compelled to hand it to third parties or upload it to the cloud. The Computing Over Distributed Data Project aims to ensure that these third parties manage their clients’ data in a way that satisfies the triad of information security, known as CIA: confidentiality (only the owner of the data can access it), integrity (the cloud and other third parties cannot alter or delete it), and availability (the owner retains the power to access, query, and retrieve it). These requirements are generally met through encryption. Many commercial products claim to satisfy the triad because they enable encryption, but in fact fail to provide confidentiality.
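As a concrete illustration of how encryption addresses the triad – my own simplified sketch using the Python cryptography library’s Fernet recipe, not the project’s code – authenticated encryption gives the owner confidentiality and integrity while keeping the data retrievable. As the next paragraph explains, however, this alone turns out not to be enough.

```python
from cryptography.fernet import Fernet, InvalidToken

# The data owner keeps the key; the cloud provider never sees it.
key = Fernet.generate_key()
owner = Fernet(key)

# Confidentiality: the provider stores only ciphertext.
record = b"client_id=1042, balance=58200"
token = owner.encrypt(record)

# Integrity: Fernet authenticates the token, so tampering by the
# provider is detected when the owner decrypts.
tampered = token[:-1] + bytes([token[-1] ^ 1])
try:
    owner.decrypt(tampered)
except InvalidToken:
    print("tampering detected")

# Availability: the owner can still retrieve the original data.
assert owner.decrypt(token) == record
```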

Kellaris (alongside Kobbi Nissim, George Kollios, and Adam O’Neill) has proven that the very mechanisms that give an owner basic access to their outsourced data can make it relatively easy for a third party to infer confidential information about the users behind that data. By pinpointing the leakage channels through which privacy is violated – such as the patterns of which records a query touches and how many results it returns – Kellaris’ work has shown theoretically that encryption alone may not be enough to guarantee privacy. By perturbing these leakage channels using differential privacy, the team also showed that a system can maintain confidentiality without its efficiency deteriorating. The next step is to build a commercial product based on this proof of concept, combining differential privacy with strong encryption. Such a system would adhere to the triad of information security in a legitimate way.
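One such leakage channel – result volume – can illustrate the idea of perturbation. The toy sketch below pads a query’s result with dummy records so the server only observes a noisy count; published constructions shift and truncate the noise carefully to obtain a formal differential privacy guarantee, so treat this as the core intuition rather than the authors’ scheme.

```python
import numpy as np

def noisy_result_volume(true_size, epsilon, shift=20):
    """Hide how many records a query really returned.

    Draws Laplace noise centered at `shift` so the padding is almost
    always positive, rounds it, and clamps at zero; the gap between
    the reported and true size is filled with dummy encrypted records
    indistinguishable from real ones.
    """
    noise = np.random.laplace(loc=shift, scale=1.0 / epsilon)
    padding = max(0, int(round(noise)))
    return true_size + padding

true_size = 12  # records actually matching the query
observed = noisy_result_volume(true_size, epsilon=0.5)
print(f"server observes {observed} records ({observed - true_size} dummies)")
```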

Kellaris’ third project also concerns third-party data storage. He is part of a team of researchers from MIT, Boston University, Northeastern University, and the University of Connecticut known as “MACS: Modular Approach to Cloud Security.” This project contributes security features to the Massachusetts Open Cloud, which wants to share information about resource allocation with its clients. Currently, companies must essentially reverse-engineer their own requirements to figure out how much memory, storage, and computational power to purchase from the Cloud in order to run their services. If the Cloud were transparent about who is using its resources, and to what extent, companies could make educated purchasing decisions. However, such transparency would make it relatively easy for an attacker to locate where in the Cloud any given user or service resides, and thereby mount an attack. For the Massachusetts Open Cloud to succeed, it must contend with the dangers of transparency and protect its users against privacy attacks.

Kellaris, as part of the MACS project, is devising new privacy tools – adaptations of differential privacy – that target the problem of attackers pinpointing exactly where users are in the Cloud. These tools, which disguise the current location of their users, will have positive ramifications beyond the Massachusetts Open Cloud. “Think about a VIP, someone who is vulnerable to a terrorist attack,” Kellaris says. “They can’t have anyone knowing their exact whereabouts, so a product that disguises the current location of its user could be very useful politically.” In general, Kellaris’ goal is to build products and software that give strong privacy guarantees while still allowing useful knowledge to be extracted from data. He lists three particular contexts in which such tools are necessary: when the current location of a person of interest must be disguised in order to protect them; when the location of a service in the cloud must be obscured; and when data must be outsourced to a third party while still satisfying the triad of information security.
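The flavor of such a tool can be conveyed with a geo-indistinguishability-style mechanism, one known adaptation of differential privacy to location data, which perturbs coordinates with planar Laplace noise. This is a generic sketch, not the MACS team’s actual design; the coordinates and epsilon are hypothetical.

```python
import numpy as np

def planar_laplace_noise(epsilon):
    """Sample 2-D noise with density proportional to exp(-epsilon * r).

    The direction is uniform, and the radius follows a Gamma(2, 1/epsilon)
    distribution, which is exactly the radial marginal of the planar
    Laplace distribution.
    """
    theta = np.random.uniform(0.0, 2.0 * np.pi)
    r = np.random.gamma(shape=2.0, scale=1.0 / epsilon)
    return r * np.cos(theta), r * np.sin(theta)

def obfuscate_location(x, y, epsilon=0.1):
    """Report a noisy location instead of the true one."""
    dx, dy = planar_laplace_noise(epsilon)
    return x + dx, y + dy

# Hypothetical position of a service in a data-center coordinate grid.
print(obfuscate_location(42.0, 71.0))
```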

Kellaris takes European privacy standards as his main reference point. “The European system gives more weight to privacy than the American one,” he says. “We think privacy really is a basic human right. I like that my research matters in practical ways. I don’t do research only for research’s sake. I am helping to solve everyday problems that arise through new technologies.” George leans forward over his lemonade. His passion for his work is evident, and I make a joke – a stupid joke – about how seriously he takes this stuff. He breaks into a laugh. With exaggerated gravitas, he says, “I am a superhero.” As his postdoc wraps up, George ramps up his job search. “Privacy Man. I will go where I am most needed.”