Project Details

Protecting Creativity: Towards Safe Generative Models

Subject Area Security and Dependability, Operating-, Communication- and Distributed Systems
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term since 2024
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 545047250
 
In recent years, generative machine learning (ML) models such as ChatGPT and Stable Diffusion have made rapid progress in producing high-quality text and images. Despite these remarkable achievements, this project addresses an aspect that is frequently understudied: their safety. Safety here means ensuring that the models operate without causing unintended harm, in particular by protecting the confidentiality and integrity of the processed data and by guarding against the generation of misleading or harmful content. It also refers to protecting the models themselves against illegitimate use. Our project is built on a fundamental hypothesis that connects the safety of generative models to how their training data is managed: the bedrock of reliable and legally compliant AI systems lies in the nature and handling of the data these systems are trained on.

To develop this connection, we have structured our research into three parts. Our first objective is to explore the learning mechanisms of generative models, focusing on their capability to “memorize” training data. This aspect is particularly critical because it often leads to unintended exposure of sensitive or copyrighted material at inference time. Our second objective is to devise novel verification techniques for two primary scenarios. The first scenario empowers individuals, such as artists, to determine whether their creative work, even if not explicitly memorized by the model, was improperly used in its training. The second scenario, termed output verification, enables model owners to confirm or refute that their models generated potentially harmful content. Finally, we propose groundbreaking strategies for safeguarding both generative models and the parties who trained them (usually API providers) at the cost of substantial computational power and manual labor. Our approach involves developing active defenses to thwart attempts at stealing the models; in cases where theft nonetheless occurs, we explore ownership-resolution techniques to legally challenge such actions.

In summary, our project aims to enhance the safety of generative models by investigating memorization, devising verification techniques, and proposing strategies for safeguarding both models and data. By compiling our methods into a safe generative ML framework and validating it in real-life applications, we aspire to contribute to the responsible deployment of generative models, ensuring their integrity and confidentiality in the evolving landscape of artificial intelligence. We plan to disseminate our framework through educational materials and an open-source library, and to validate the effectiveness of our methods on real-life applications.
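
The first two objectives build on the observation that memorized training examples leave statistical traces in a model's behaviour. As a rough illustration only (not the project's own method), the sketch below shows a simple loss-threshold membership-inference test: candidate examples on which the target model's loss is unusually low, compared to a calibration set of known non-members, are flagged as suspected training data. All names, loss values, and thresholds are hypothetical.

```python
# Illustrative sketch of loss-threshold membership inference.
# Not the project's method; all numbers below are hypothetical.
import numpy as np

def calibrate_threshold(nonmember_losses: np.ndarray, fpr: float = 0.05) -> float:
    """Pick a loss threshold so that at most `fpr` of known non-members
    would be wrongly flagged as training members."""
    return np.quantile(nonmember_losses, fpr)

def flag_suspected_members(candidate_losses: np.ndarray, threshold: float) -> np.ndarray:
    """An example is suspected to be a training member if the target model's
    per-example loss on it falls below the calibrated threshold."""
    return candidate_losses < threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical per-example losses of the target generative model:
    nonmember_losses = rng.normal(loc=3.0, scale=0.5, size=1000)  # data the model never saw
    candidate_losses = rng.normal(loc=2.2, scale=0.5, size=20)    # e.g. an artist's works
    thr = calibrate_threshold(nonmember_losses, fpr=0.05)
    flags = flag_suspected_members(candidate_losses, thr)
    print(f"threshold={thr:.3f}, flagged {flags.sum()} of {len(flags)} candidates")
```

In practice, the calibration set would consist of data verifiably absent from training, and the per-example losses would be queried from the target model's API; the point of the sketch is only the statistical comparison.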
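The third objective concerns defending models against theft and resolving ownership disputes. One common technique in this space is trigger-set watermarking, sketched below purely for illustration (the project's defenses may differ): the owner keeps a secret set of trigger inputs with pre-chosen responses, and a suspect model is declared a copy only if its agreement on that set is statistically implausible for an independently trained model. The trigger strings, agreement rate, and significance level are hypothetical.

```python
# Illustrative sketch of trigger-set ownership verification.
# Assumes a backdoor-style watermark; not the project's method.
from math import comb

def binomial_p_value(matches: int, trials: int, chance: float) -> float:
    """P(observing >= `matches` agreements by pure chance) under a binomial model."""
    return sum(comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
               for k in range(matches, trials + 1))

def verify_ownership(suspect_outputs, secret_outputs,
                     chance_agreement: float = 0.01,
                     significance: float = 1e-6) -> bool:
    """Claim ownership only if agreement on the secret trigger set is
    statistically implausible for an independently trained model."""
    matches = sum(s == t for s, t in zip(suspect_outputs, secret_outputs))
    return binomial_p_value(matches, len(secret_outputs), chance_agreement) < significance

if __name__ == "__main__":
    # Hypothetical trigger responses (e.g. class labels or canary strings):
    secret = [f"canary-{i}" for i in range(30)]
    suspect = secret[:27] + ["other", "other", "other"]  # 27/30 agreement
    print("ownership supported:", verify_ownership(suspect, secret))
```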
DFG Programme Research Grants
International Connection Poland
Co-Investigator Dr. Franziska Boenisch
 
 
