Project Details

MetaProteomics Pipeline (MPP): Integrating a stack of metaproteomics data analysis tools into a full-featured and sustainable bioinformatics workflow

Subject Area Bioinformatics and Theoretical Biology
Microbial Ecology and Applied Microbiology
Term from 2018 to 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 391179955
 
The main objective of this project proposal is to establish the Metaproteomics Pipeline (MPP), a sustainable, full-featured, and user-friendly data analysis workflow for metaproteomics research which integrates three existing software prototypes.
Metaproteomics is an evolving field of microbiology that studies microbial communities directly from their habitat by applying mass spectrometry-based protein analytics. This research area strongly demands tailored software solutions that enable researchers from various backgrounds (microbiology, ecology, medical diagnostics, etc.) to apply adequate, state-of-the-art bioinformatics strategies for analysing highly complex and heterogeneous biological samples. Since such samples contain proteins from hundreds of potentially unknown species, data analysis and interpretation are among the most challenging tasks. This includes the reliable identification of acquired mass spectra, the efficient grouping of ambiguous (peptide-sharing) protein identifications, the accurate quantification of identified proteins, and the meaningful integration of taxonomic and functional meta-information. To support these tasks and to overcome existing limitations, we developed the open-source software tools MetaProteomeAnalyzer (MPA), Pipasic, and Prophane. These prototypes provide effective features for the meaningful analysis and interpretation of metaproteomics data. The scope of this project is to combine the existing prototypes into a single, end-to-end solution providing high usability, reliability, interoperability, and discoverability to the metaproteomics research community. In six work packages we will focus on (i) standard workflow definition and implementation, (ii) automated quality assurance using benchmarking data, (iii) data format definition for interoperability with other tools, (iv) infrastructure for accessibility, (v) documentation, training, and improved discoverability, and (vi) dissemination of results and impact measurement.
The pipeline will facilitate the time- and cost-efficient analysis of metaproteomics data. Users lacking in-depth knowledge of either metaproteomics or bioinformatics will be able to conduct data analysis without experiencing compatibility issues. Domain experts will be involved as pilot users from an early stage to evaluate and validate the pipeline's performance on metaproteomics data. Moreover, automated benchmarking will be provided not only for evaluating our pipeline, but also for testing other software in the field. Semantic software annotation, documentation, training, and distribution on community platforms will make the workflow accessible to a wider research community. Usage statistics and surveys from pilot and end users will help us to set up a sustainable bioinformatics pipeline that follows best practices in data analysis and workflow design.
DFG Programme Research data and software (Scientific Library Services and Information Systems)
 
 
