Project Details

Automatic Localization of Complete Protein Coding Genes in Eukaryotic Genomes

Subject Area Bioinformatics and Theoretical Biology
Term from 2007 to 2010
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 61416996
 
The project aims to develop a gene-finding tool that takes advantage of recently introduced next-generation sequencing technologies for more accurate genome annotation. New ultra-high-throughput sequencing technologies allow much cheaper and more extensive ("deeper") sequencing of gene transcripts, a technique termed RNA-Seq. RNA-Seq opens up new opportunities to improve genome annotation procedures, which currently still mispredict many gene structures; these errors strongly impair the downstream analysis of the predicted genes. RNA-Seq also poses new challenges, in particular because the sequence fragments are generally much shorter than those produced by conventional technologies. A genome annotation pipeline that exploits RNA-Seq data will be developed, based on AUGUSTUS, one of the currently most accurate gene finders. RNA-Seq data from the various new sequencing platforms will be integrated alongside established evidence types such as various homology approaches and proteogenomics. The goal is to significantly reduce the error rate and to report alternative splice forms more sensitively and confidently. The universally applicable annotation pipeline will be freely available for independent use and will also be applied directly in some genome projects. If successful, this project would allow a significantly more accurate annotation of many eukaryotic genomes accompanied by transcriptome sequencing, and would therefore provide a better basis for other studies with potential biomedical, biotechnological or agricultural applications.
DFG Programme Research Grants
 
 

