

Poster in the Workshop on Distributed and Private Machine Learning

On Privacy and Confidentiality of Communications in Organizational Graphs

Masoumeh Shafieinejad · Huseyin Inan · Marcello Hasegawa · Robert Sim


Abstract:

Machine-learned models trained on organizational communication data, such as emails in an enterprise, carry unique risks of breaching confidentiality, even if the model is intended only for internal use. This work shows how confidentiality is distinct from privacy in an enterprise context, and formulates an approach to preserving confidentiality while leveraging principles from differential privacy (DP). Works that apply DP techniques to natural language processing tasks usually assume independently distributed data and overlook potential correlation among the records. Ignoring this correlation yields an illusory privacy guarantee, while, conversely, extending DP techniques to group privacy is overly cautious and severely degrades model utility. We introduce a middle-ground solution: a model that captures the correlation in the social network graph and incorporates it into the privacy calculations through Pufferfish privacy principles.
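As background, here is a minimal sketch of the standard notions contrasted in the abstract, following the usual formulations from the DP and Pufferfish literature; the notation below is illustrative and not taken from the paper itself.

A randomized mechanism $M$ is $\varepsilon$-differentially private if, for every pair of datasets $D, D'$ differing in a single record and every set of outputs $W$,
$$\Pr[M(D) \in W] \le e^{\varepsilon} \, \Pr[M(D') \in W].$$
Applying the same guarantee to a correlated group of $k$ records (group privacy) only yields a $k\varepsilon$ bound, so protecting whole groups at a fixed level $\varepsilon$ requires roughly $k$ times more noise, which is the utility cost the abstract refers to. Pufferfish privacy instead parameterizes the guarantee by a set of secrets $\mathcal{S}$, discriminative secret pairs $\mathcal{Q}$, and a class of data distributions $\Theta$ that can encode the correlation structure (here, the organizational graph): $M$ satisfies $\varepsilon$-Pufferfish$(\mathcal{S}, \mathcal{Q}, \Theta)$ privacy if, for all $(s_i, s_j) \in \mathcal{Q}$, all $\theta \in \Theta$ under which both secrets have nonzero probability, and all outputs $w$,
$$e^{-\varepsilon} \le \frac{\Pr[M(X) = w \mid s_i, \theta]}{\Pr[M(X) = w \mid s_j, \theta]} \le e^{\varepsilon}.$$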
