Modeling the sequence-structure-function relationships of ThDP-dependent enzymes
Final Report Abstract
Enzymes are ingenious nanomachines: a linear program of only 300-400 lines of code and 20 different instructions encodes an efficient folding pathway from the nascent amino acid chain into a stable though dynamic structure, which acts as a highly effective and selective catalyst. While we can read and interpret this program by sequence and structure comparison, we are only beginning to learn how to write it and thus design novel enzymes. Data mining and machine learning techniques are used to study the exponentially growing space of known protein sequences, while mechanistic modeling of enzyme-substrate and enzyme-solvent interactions deepens our insight into the molecular basis of protein stability, catalytic activity, substrate specificity, and selectivity. Mechanistic models are validated by predicting variants with a changed substrate profile and shifted stereoselectivity, which are subsequently produced and biochemically characterized by analyzing their enzyme kinetics. Project 8 "Modeling the sequence-structure-function relationships of ThDP-dependent enzymes" aims at developing solutions to three major challenges: How to systematically compare the rapidly growing number of sequences and structures of ThDP-dependent enzymes? How to establish a generic and predictive mechanistic model of substrate specificity and stereoselectivity? And how to integrate sequence and structure data with data from biocatalytic experiments to make them findable, accessible, interoperable, and re-usable? The Forschergruppe 1296 (FOR1296) provided a broad transdisciplinary framework to embed our data-integrated simulation strategy into a unique infrastructure with expertise from synthetic chemistry, biocatalysis, and biochemical engineering. The results of Project 8 have been published in 20 scientific papers since 2010.
The regularly updated Thiamine diphosphate dependent Enzyme Engineering Database (TEED) provides public access to sequences and structures of ThDP-dependent enzymes and offers a framework for data mining such as conservation analyses, evolutionary studies, and the identification of novel biocatalysts. A predictive molecular model of the substrate binding site of the largest superfamily, the ThDP-dependent decarboxylases, was applied to design mutants with shifted stereoselectivity and extended substrate range. Recently, the model was generalized to the second-largest superfamily, the transketolases. The development of novel enzymes for biocatalytic processes requires knowledge of substrate profile and selectivity, which can be derived from separate databases and publications. Often, however, these data sources lack the time courses of substrate or product concentrations and an unambiguous link between experiment and enzyme sequence. The BioCatNet platform was established to integrate original biocatalytic data with protein sequence and structure data according to the recommendations of the STRENDA Consortium. BioCatNet facilitates the consistent documentation of reaction conditions, the archiving of measured time course data, and the efficient exchange of original experimental data among collaborators according to the FAIR data principles, and it makes biocatalytic data accessible to kinetic modeling of biocatalytic reactions. While FOR1296 focused on data mining and modeling of ThDP-dependent enzymes, the tools and strategies developed in Project 8 are generic and have already been successfully applied to other enzyme families. The results of Project 8 provide quantitative models to explain biocatalytic data, a deeper understanding of the molecular basis of enzyme function, and useful methods to study protein evolution, and they contribute to the toolbox of synthetic biology.
Publications
- The Thiamine diphosphate dependent Enzyme Engineering Database: A tool for the systematic analysis of sequence and structure relations. BMC Biochem. 2010, 11: 9
Widmann M, Radloff R, Pleiss J
(See online at https://doi.org/10.1186/1471-2091-11-9)
- A standard numbering scheme for thiamine diphosphate-dependent decarboxylases. BMC Biochem. 2012, 13: 24
Vogel C, Widmann M, Pohl M, Pleiss J
(See online at https://doi.org/10.1186/1471-2091-13-24)
- A Tailor-Made Chimeric Thiamine Diphosphate Dependent Enzyme for the Direct Asymmetric Synthesis of (S)-Benzoins. Angew. Chem. Int. Ed. 2014, 53: 9376–9379
Westphal R, Vogel C, Schmitz C, Pleiss J, Müller M, Pohl M, Rother D
(See online at https://doi.org/10.1002/anie.201405069)
- The modular structure of ThDP-dependent enzymes. Proteins 2014, 82: 2523–2537
Vogel C, Pleiss J
(See online at https://doi.org/10.1002/prot.24615)
- BioCatNet: a database system for the integration of enzyme sequences and biocatalytic experiments. ChemBioChem 2016, 17: 2093–2098
Buchholz PCF, Vogel C, Reusch W, Pohl M, Rother D, Spiess AC, Pleiss J
(See online at https://doi.org/10.1002/cbic.201600462)
- Thermodynamic activity–based interpretation of enzyme kinetics. Trends Biotechnol. 2017, 35: 379–382
Pleiss J
(See online at https://doi.org/10.1016/j.tibtech.2017.01.003)