Illumina, Inc.
Population Genetics Data Processing with DRAGEN Secondary Analysis
Pages: 8
Time to read: 13 mins
Language: English
This technical note outlines recommendations for data analysis and variant calling in large-cohort population genetics studies using the DRAGEN secondary analysis platform. It discusses the importance of accurate cohort-level catalogs of variation for various genomic studies, including ancestry and genotype/phenotype associations. The document details a typical workflow for population genetics data processing, emphasizing the independent analysis of samples during read mapping and variant calling. It further explains the aggregation of gVCF files to create a conceptual matrix of genotypes and associated metrics. The note evaluates the performance of joint genotyping with DRAGEN in three use cases: high-coverage whole-genome sequencing (WGS), low-coverage WGS, and high-coverage whole-exome sequencing (WES). Benchmarking comparisons against GATK workflows are presented, highlighting the accuracy of DRAGEN in variant calling and the challenges associated with increasing sample sizes. Recommendations for obtaining analysis-ready variants using DRAGEN secondary analysis are also included.
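The aggregation step described above can be pictured with a small sketch. This is not DRAGEN's implementation or file format handling; it is a minimal illustration in Python, assuming per-sample calls have already been parsed into dictionaries. The function name `build_genotype_matrix`, the `Site` tuple layout, and the `./.` placeholder for sites with no record in a sample are all hypothetical choices made for this example. In a real gVCF, reference blocks would let many of those cells be resolved to homozygous reference rather than left missing.

```python
from typing import Dict, List, Tuple

# Hypothetical site key for this sketch: (chromosome, position, ref allele, alt allele)
Site = Tuple[str, int, str, str]


def build_genotype_matrix(
    per_sample_calls: Dict[str, Dict[Site, str]],
    missing: str = "./.",
) -> Tuple[List[Site], List[str], List[List[str]]]:
    """Combine independently called samples into a sites-by-samples matrix.

    per_sample_calls maps a sample name to that sample's {site: genotype}.
    Cells with no record for a sample are filled with `missing`
    (a simplification; a gVCF reference block could imply 0/0 instead).
    """
    # Columns: one per sample, in a stable order
    samples = sorted(per_sample_calls)
    # Rows: the union of all sites seen in any sample
    sites = sorted({site for calls in per_sample_calls.values() for site in calls})
    matrix = [
        [per_sample_calls[sample].get(site, missing) for sample in samples]
        for site in sites
    ]
    return sites, samples, matrix


if __name__ == "__main__":
    calls = {
        "S1": {("chr1", 100, "A", "G"): "0/1"},
        "S2": {("chr1", 100, "A", "G"): "1/1", ("chr1", 200, "C", "T"): "0/1"},
    }
    sites, samples, matrix = build_genotype_matrix(calls)
    for site, row in zip(sites, matrix):
        print(site, dict(zip(samples, row)))
```

Each row is a variant site and each column a sample, which is the "conceptual matrix" shape the note refers to; associated metrics (depth, genotype quality, and so on) would sit alongside each genotype cell in a fuller version.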