The power of ancient retroposed sequences to resolve problematic mammalian evolutionary questions
Final Report Abstract
Retroposon insertions are highly valuable phylogenetic markers. Compared to other phylogenetic signals, they are virtually homoplasy-free. Confounding markers in apparently conflicting presence/absence insertion patterns are therefore more likely the direct result of ancestral incomplete lineage sorting (ILS) or hybridization, and are a strong indicator of hemiplasy. These so-called “conflicting” markers typically inserted into genomes during short internodes of evolution/speciation that were not long enough for them to become fixed in a given population (fixation usually requires 1-5 million years). They therefore remained polymorphic beyond speciation boundaries, and their presence/absence states were distributed randomly among descendant lineages. This DFG-supported project built upon our previous project on mammalian evolution and was designed to complete the major steps toward genome-wide analyses in many hitherto unresolved areas of phylogenetic trees. Its major focus was on ILS; it was intended to highlight retroposon studies in many genome sequencing projects and to develop professional tools to extract and analyze retroposon markers. Highlights include the complex, mosaic-like evolution of Laurasiatheria, the first retroposon study of the extinct Tasmanian tiger, and the now resolved placement of the beaver within rodents. In addition, our elaborate genome analyses within the framework of large genome sequencing consortia focusing on vervet, tarsier, colugo, and crocodiles demonstrated a successful integration of retroposon data analysis into up-to-date phylogenetic studies and studies of genome plasticity. We believe the project contributed directly to worldwide recognition of this exceptionally versatile and reliable marker system in vertebrates, and it enabled us to establish cornerstone results in the field of molecular phylogenomics.
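The idea of conflicting presence/absence patterns can be illustrated with a minimal sketch. It assumes three lineages (here called A, B, and C), an outgroup that lacks every insertion, and a hypothetical list of presence (1) / absence (0) calls; none of these names or values come from the project data. The sketch only tallies how many loci support each possible pairing and is not the statistical test developed in the project.

```python
from collections import Counter

# Hypothetical presence (1) / absence (0) calls for lineages A, B, C.
# Each tuple is one retroposon locus; the outgroup is assumed to lack it.
markers = [
    (1, 1, 0), (1, 1, 0), (1, 1, 0), (1, 1, 0), (1, 1, 0), (1, 1, 0),
    (1, 0, 1), (1, 0, 1),
    (0, 1, 1),
    (1, 1, 1),  # shared by all three: uninformative for this question
    (0, 0, 1),  # lineage-specific: uninformative for this question
]

# Patterns in which exactly two lineages share the insertion.
PAIR_LABELS = {(1, 1, 0): "AB", (1, 0, 1): "AC", (0, 1, 1): "BC"}


def count_support(loci):
    """Count loci whose insertion is shared by exactly two of three lineages."""
    counts = Counter({"AB": 0, "AC": 0, "BC": 0})
    for pattern in loci:
        label = PAIR_LABELS.get(tuple(pattern))
        if label is not None:  # skip uninformative patterns
            counts[label] += 1
    return counts


support = count_support(markers)
for pairing in ("AB", "AC", "BC"):
    print(pairing, support[pairing])
```

With virtually homoplasy-free markers, loci supporting the minority pairings (here AC and BC) are candidates for ILS or hybridization rather than parallel insertions; deciding between these explanations requires a proper statistical framework.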
Following the publication of these results and the development of new web-based bioinformatics tools for analyzing retroposon data sets, we were able to join further important genome sequencing/analysis projects that are now in progress. Encouraged by our own tarsier genome project, we intend to continue coordinating such international, multigroup projects. The next step is to invest more in high-throughput data generation and analysis strategies. We are currently conducting several new projects to test the power of PacBio sequencing data in retrophylogenomics and already have very promising preliminary results. We are also investing in short, paired-end Illumina reads and the possibility of deriving genome-level retroposon data from rough genome assemblies. The essential challenge is to analyze such data via customized, user-friendly web applications that are freely available to the scientific community. This is already in progress in our new DFG-supported project, "Computational tools for comparative retrophylogenomic studies". Automated network analyses that filter useful signals from various sources of noise are one step toward making genome-extracted transposed elements a standard diagnostic for resolving phylogenomic relationships within deep evolutionary branches as well as in population and individual genetics. This is also a very important step toward elucidating the many cryptic functional aspects of transposed elements as enhancers or controllers of gene function, novel genes, and disease-causing units, and it provides an important connection to the relatively recently established Münster Graduate School of Evolution (MGSE) and the Research Training Group (RTG 2220) at the Westfälische Wilhelms-Universität Münster.
Publications
- Exploring massive incomplete lineage sorting in arctoids (Laurasiatheria, Carnivora). Mol Biol Evol 32:3194-3204, 2015
Doronina L, Churakov G, Shi J, Brosius J, Baertsch R, Clawson H, Schmitz J
(See online at https://doi.org/10.1093/molbev/msv188)
- GPAC - Genome Presence/Absence Compiler: A Web Application to comparatively visualize multiple genome-level changes. Mol Biol Evol 32:275-286, 2015
Noll A, Grundmann N, Churakov G, Brosius J, Makałowski W, Schmitz J
(See online at https://doi.org/10.1093/molbev/msu276)
- Genome sequence of the basal haplorrhine primate Tarsius syrichta reveals unusual insertions. Nat Commun 7:12997, 2016
Schmitz J, Noll A, Raabe C, Churakov G, Voss R, Kiefmann M, Rozhdestvensky T, Brosius J, Baertsch R, Clawson H, Roos C, Zimin A, Minx P, Montague MJ, Wilson RK, Warren WC
(See online at https://doi.org/10.1038/ncomms12997)
- Genomic analysis reveals hidden biodiversity within colugos, the sister group to primates. Sci Adv 2:e1600633, 2016
Mason VC, Li G, Minx P, Schmitz J, Churakov G, Doronina L, Melin AD, Dominy NJ, Lim NT-L, Springer MS, Wilson RK, Warren WC, Helgen KM, Murphy WJ
(See online at https://doi.org/10.1126/sciadv.1600633)
- Incomplete Lineage Sorting and Hybridization Statistics for Large-Scale Retroposon Insertion Data. PLoS Comp Biol 12:e1004812, 2016
Kuritzin A, Kischka T, Schmitz J, Churakov G
(See online at https://doi.org/10.1371/journal.pcbi.1004812)
- Genome of the Tasmanian tiger provides insights into the evolution and demography of an extinct marsupial carnivore. Nat Ecol Evol 2:182-192, 2018
Feigin CY, Newton AH, Doronina L, Schmitz J, Hipsley CA, Mitchell KJ, Gower G, Llamas B, Soubrier J, Heider TN, Menzies BR, Cooper A, O'Neill RJ, Pask AJ
(See online at https://doi.org/10.1038/s41559-017-0417-y)
- Speciation network in Laurasiatheria retrophylogenomic signals. Genome Res, 2017
Doronina L, Churakov G, Shi J, Baertsch R, Clawson H, Schmitz J
(See online at https://doi.org/10.1101/gr.210948.116)
- The beaver’s phylogenetic lineage illuminated from retroposon reads. Sci Rep 7:43562, 2017
Doronina L, Matzke A, Churakov G, Stoll M, Huge A, Schmitz J
(See online at https://doi.org/10.1038/srep43562)