Project Details
Accurate and fast AI-based methods for predicting and classifying structurally resolved protein interactomes
Applicant
Dr. Katja Luck
Subject Area
Bioinformatics and Theoretical Biology
Term
since 2024
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 551068697
While great progress has been made toward a full map of the human protein-protein interactome, we still lack structural information for the vast majority of known protein interactions, and the known interactions are themselves estimated to be only a small fraction of the full interactome. This lack of knowledge hinders studies of cellular mechanisms and the characterization of disease variants, and it prevents the prediction of phenotypes from genotypes. Proteins interact with each other using a variety of binding modes. Our understanding of the structural complexity and evolution of protein interaction interfaces is highly incomplete because we lack sufficiently fast and accurate methods to compute and classify interaction interface similarities. We propose to develop novel, fast, and accurate artificial intelligence-based methods to predict protein-protein interactions (PPIs) at structural resolution and to compute PPI interface similarities, bringing us closer to fully structurally resolved protein interactomes and complete structural repertoires of protein binding modes. We will improve the accuracy and speed of AlphaFold-Multimer (AF-MM) for PPI prediction using metagenomic sequence information, self-distillation training guided by PPI interfaces, and architecture adaptations that speed up inference. We will adapt Foldseek to perform fast and accurate PPI interface similarity computations, enabling all-by-all comparisons of structurally resolved PPI interfaces from the PDB. This will result in clusters of similarly structured interfaces, regardless of protein sequence, fold, or modular protein architecture. We will use our improved AF-MM model to predict the structures of 15 million suspected PPIs within the human proteome and search these predictions for putative novel PPI interface types. We will integrate predicted PPIs with patient-derived mutations to identify novel predicted interfaces that are likely disease-related, which will then be subjected to experimental validation.
The generated datasets will be disseminated via dedicated web resources and explored to predict larger protein complex assemblies, to study patterns of convergent PPI interface evolution, and to aid in the characterization of disease variants. The developed methods and resources will enable and support a wide variety of research on protein interactomes.
DFG Programme
Research Grants
International Connection
South Korea
Partner Organisation
National Research Foundation of Korea, NRF
Cooperation Partner
Professor Dr. Martin Steinegger