Automatic Localization of Complete Protein Coding Genes in Eukaryotic Genomes
Applicant
Professor Dr. Mario Stanke
Subject Area
Bioinformatics and Theoretical Biology
Term
Funded from 2007 to 2010
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 61416996
The project aims to develop a gene-finding tool that takes advantage of recently introduced next-generation sequencing technologies for more accurate genome annotation. New ultra-high-throughput sequencing technologies allow much cheaper and more extensive ("deeper") sequencing of gene transcripts, an approach termed RNA-Seq. RNA-Seq opens up new opportunities to improve genome annotation procedures, which currently still mispredict many gene structures and thereby strongly impair the downstream analysis of the predicted genes. RNA-Seq also poses new challenges, in particular because the sequence fragments are generally much shorter than those produced by conventional technologies. A genome annotation pipeline that exploits RNA-Seq data will be developed on the basis of AUGUSTUS, one of the currently most accurate gene finders. RNA-Seq data from the various new sequencing platforms will be integrated alongside established evidence types such as various homology approaches and proteogenomics. The goal is to significantly reduce the error rate and to report alternative splice forms more sensitively and confidently. The universally applicable annotation pipeline will be freely available for independent use and will also be applied directly in some genome projects. If successful, this project would allow significantly more accurate annotation of many eukaryotic genomes for which transcriptome sequencing is also carried out, and would therefore provide a better basis for other studies with potential biomedical, biotechnological or agricultural applications.
DFG Programme
Research Grants