True Friends Let You Down: Benchmarking Social Graph Anonymization Schemes


Greater demand for social graph data among researchers and analysts has fueled an increase in such datasets being published. Consequently, concerns about privacy breach have also risen steadily. To mitigate privacy risks a myriad of social graph anonymization schemes have been proposed. Anonymizing high dimensional data is a very hard problem and conventionally it is considered unwise to publish graph data even without identifiers. Often the schemes proposed provide no proof of efficacy and are designed to defeat only a narrow set of attacks. To facilitate benchmarking of perturbation-based social graph anonymization schemes we design and implement a machine learning framework. The framework provides a quick and automated platform to evaluate and compare anonymization schemes. We present a mechanism to train the framework without ground truth by generating subgraphs and sampling training data from the auxiliary and sanitized graphs. We design and implement node features based on graph structure that can be easily tuned to accommodate weak or strong adversaries as well as node and edge attributes. The framework provides a granular graph structure-based metric to capture the likelihood of a node being re-identified. We conduct a thorough analysis of the effect of graph perturbation on anonymity achieved and utility preserved using publicly available real world social graphs. To this end we analyze six popular graph perturbation schemes including those promising k-anonymity. Our techniques automate weeding out poor anonymization schemes. Experiments show that it is hard to provide anonymity while preserving utility whereas some schemes destroy utility without providing much anonymity. All useful anonymization schemes leave a fraction a true edges intact and these true friends lead to the re-identification of nodes.



author = {Sharad, Kumar},
title = {True Friends Let You Down: Benchmarking Social Graph Anonymization Schemes},
booktitle = {Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security},
series = {AISec '16},
year = {2016},
isbn = {978-1-4503-4573-6},
location = {Vienna, Austria},
pages = {93--104},
numpages = {12},
url = {},
doi = {10.1145/2996758.2996765},
acmid = {2996765},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {de-anonymization, machine learning, privacy, social networks},