Whole-genome approaches for causal variant detection in cattle
Final Report Abstract
Dense genotyping and whole-genome sequencing data are being generated at an unprecedented scale in livestock populations. The number of genotyped animals recently surpassed 1 million and is likely to multiply within the next few years. Sequenced animals can serve as a reference population to impute sequence variant genotypes in silico for animals that have been genotyped with dense genotyping arrays. Combining genotyping and whole-genome sequencing data through imputation makes it possible to compile large mapping populations, thereby providing high power for genome-wide complex trait analysis. During the research fellowship, strategies to infer sequence variant genotypes from large reference panels were evaluated with regard to imputation accuracy and computational efficiency. A reference population that includes animals from different breeds yielded higher imputation accuracy than within-breed reference populations, particularly for low-frequency variants. Unexpectedly, a number of genomic regions were detected in which sequence variant genotypes cannot be imputed accurately using current sequencing and imputation approaches. Next, the most accurate approach was applied to infer genotypes at more than 20 million sequence variants for more than 17,000 cattle from three breeds. Association tests between imputed sequence variant genotypes and daughter-based phenotypes for fat and protein percentages in milk revealed 25 QTL, most of which segregated across breeds. The ability to pinpoint causal trait variants was evaluated by assessing the power to detect well-characterized true causal mutations. The results of this project may serve as a blueprint for inferring sequence variant genotypes in livestock populations using large reference populations.
Moreover, this research fellowship showed that validating the effect of known causal variants is crucial for assessing the ability to detect true causal variants in association studies with imputed sequence variant genotypes. Upon completion of the research fellowship, the applicant took up the position of Assistant Professor of Animal Genomics at ETH Zurich.
Publications
- (2017): Meta-analysis of sequence-based association studies across three cattle breeds reveals 25 QTL for fat and protein percentages in milk at nucleotide resolution
Pausch H, Emmerling R, Gredler-Grandl B, Fries R, Daetwyler HD, Goddard ME
- Evaluation of the accuracy of imputed sequence variants and their utility for causal variant detection in cattle. Genetics Selection Evolution. 2017;49:24
Pausch H, MacLeod IM, Emmerling R, Fries R, Bowman PJ, Daetwyler HD, Goddard ME
(See online at https://doi.org/10.1186/s12711-017-0301-x)