Project Details
Priv-GSyn: Privacy-Preserving Graph Synthesis
Applicant
Professorin Dr. Stefanie Roos
Subject Area
Security and Dependability, Operating-, Communication- and Distributed Systems
Term
since 2025
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 550302673
Many research directions, such as studying and predicting the spread of infectious diseases, critically depend on the availability of graph data. Yet such data can contain sensitive information, such as health information. For instance, a social graph that characterizes the spread of a disease reveals which nodes carried the disease. It has been shown that even if the nodes are not explicitly mapped to real-world identities, these identities can often be inferred, which reveals critical health data about individuals. Prior research on publishing graphs in a privacy-preserving manner focuses almost entirely on publishing real graphs and disregards the option of using synthetic graphs, even though graph generative models have become available in recent years. On the surface, synthetic graphs seem to overcome privacy issues, but work on other data types, such as tables, indicates that generative models still retain sensitive information about the training data unless the training process includes appropriate privacy protections. The absence of privacy considerations for graph generative models is a severe oversight in current research. Moreover, existing graph generative models are limited in capturing both graph structure and node/edge attributes simultaneously, and they lack generalization to large-scale and temporal graphs.
Significant research is required to ensure that graph generative models offer a high-utility, privacy-preserving alternative to the publication of real-world graphs. In this proposal, we aim to address two fundamental research questions: i) how can we quantify and defend against the privacy risks of synthetic graphs? and ii) how can we generate synthetic graphs that strike an optimal privacy-utility tradeoff?
We design six work packages: three explore privacy modeling, privacy attacks, and defenses for synthetic graphs, and three focus on deriving conditional and temporal graph generative models trained centrally as well as on federated clients. Our results will raise awareness of the importance of privacy in graph generative models and provide methods and tools to publish graphs safely. Concretely, we will develop algorithms that better capture the relations between node and edge attributes when generating static and temporal graphs while ensuring that the generated graphs meet strict privacy requirements.
DFG Programme
Research Grants
International Connection
Switzerland
Partner Organisation
Schweizerischer Nationalfonds (SNF)
Cooperation Partner
Professorin Lydia Y. Chen, Ph.D.