Computational Pipeline Can Analyze 1,000 Genomes a Day
A genome computational pipeline has achieved a throughput of 1,000 genomes a day, a speed that will enable population-scale genomics. As part of the Intel Heads In The Clouds Challenge, GenomeNext and Nationwide Children’s Hospital (Columbus, Ohio) were challenged to analyze, in one week, a complete population dataset compiled by the 1000 Genomes Consortium. The 1000 Genomes Project is the largest publicly available dataset of genomic sequences, with whole-genome and whole-exome samples from 2,504 individuals from around the world. All 5,008 samples were analyzed on GenomeNext’s genomic sequence analysis platform, which runs on the Amazon Web Services cloud and is powered by Intel processors. The system achieved “unprecedented throughput,” completing as many as 1,000 genome samples per day. The analysis of 1,000 genomes generated close to 100 TB of result files. The results correlated highly with the original analysis performed by the 1000 Genomes Consortium, and the new analysis potentially uncovered additional variants. “The successful completion of this proof-of-concept not only sets a groundbreaking timeframe for the analysis of a massive quantity of genomic data, but demonstrates the utility of the GenomeNext solution, eliminating the sequence analysis computational bottlenecks, enabling researchers and clinicians to keep pace with processing […]