Health care is being inundated with data—biologic data derived from advances in technology, namely next-generation sequencing and individual patient-level data found in electronic health records (EHRs) including imaging and laboratory test results, medical and prescription claims, and even monitoring information originating in remote, personal health devices. While the health care system is moving toward data-driven medicine to improve care and contain costs, clinicians can’t keep up with emerging genomic information in the literature and even data derived from individual patients.
“We are trying to improve the speed and accuracy of diagnosis by using health information technology,” says Robert El-Kareh, M.D., an assistant professor of bioinformatics at the University of California, San Diego. “Physicians are under enormous time pressure, especially for outpatients, and there is an information overload with alerts.” He notes that genotypic information may compound that overload, so “some thought needs to be put into integrating information that has evidence so it can make a real clinical impact.”
The realization of personalized medicine is dependent upon a technology-based transformation of the health care system. The technological challenge of sequencing an individual’s genome has been achieved. Now, the challenge migrates to analyzing raw genomic data for actionable clinical use. Health information technology (IT) will drive the adoption of genomic-based medicine if it can surmount the big data challenge of processing, storing, and accessing relevant genomic data.
“When a doctor talks to a patient, they typically make their differential diagnosis and order the minimum number of tests that will allow them to pick the right diagnosis,” says Justin Starren, M.D., Ph.D., chief of health and biomedical informatics at Northwestern University (Chicago). “They already know how they are going to interpret that test, and the way that test is interpreted does not change over time. Whereas with whole-genome sequencing there is a huge amount of data and [clinicians] are interested right now in only one tiny piece of it. The question becomes what to do with everything else. There is a need for ancillary systems.”
Big data is big. Sequencing a single genome creates terabytes of raw data. Secondary analysis of data from the assembled genome can cut the size a thousandfold, to gigabytes. Tertiary analysis then matches the patient’s genotype with clinical relevance. Complicating current efforts is a lack of standardization at every step, which makes whittling big data down to clinically manageable and accessible EHR data daunting. The obstacles include the expense of maintaining servers, privacy concerns about cloud storage, and the difficulty of finding staff with the bioinformatics expertise to analyze the data.
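To make the scale concrete, the thousandfold reduction can be worked through with round numbers. The sketch below is illustrative only: the article says "terabytes" and "gigabytes" without giving exact figures, so the 1 TB starting size and the 1,000-patient cohort are assumptions chosen for easy arithmetic, and the function names are hypothetical.

```python
# Back-of-the-envelope storage arithmetic. All inputs are assumed round
# numbers for illustration, not measured values.
TB = 10**12  # bytes in a (decimal) terabyte
GB = 10**9   # bytes in a (decimal) gigabyte


def size_after_secondary(raw_bytes: float, reduction_factor: int = 1000) -> float:
    """Size remaining after a reduction of `reduction_factor` (thousandfold by default)."""
    return raw_bytes / reduction_factor


def cohort_storage_tb(per_patient_bytes: float, n_patients: int) -> float:
    """Total storage, in terabytes, for a cohort with the same per-patient size."""
    return per_patient_bytes * n_patients / TB


raw = 1 * TB                          # assumed raw output for one genome
secondary = size_after_secondary(raw)

print(secondary / GB)                      # gigabytes per genome after secondary analysis
print(cohort_storage_tb(raw, 1000))        # terabytes to keep raw data for 1,000 patients
print(cohort_storage_tb(secondary, 1000))  # terabytes to keep only secondary output
```

Under these assumptions, keeping raw data for the cohort takes about a thousand times the storage of keeping only the secondary-analysis output, which is why the decision about what to retain, and where, drives so much of the infrastructure cost.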
Emerging Prototypes
There are a variety of emerging scenarios of how to pair actionable genomic knowledge and patient data—whether EHRs incorporate and update clinical rules or whether patient data is uploaded to ancillary systems that are queried for relevant clinical information.
The Mayo Clinic is prospectively collecting DNA and sequencing roughly 85 genes used in pharmacogenomics for 1,000 patients predicted to need a common drug (statin, anticoagulant) over the next five years. Christopher Chute, M.D., section head for medical informatics at Mayo, says the project serves two goals. First, it is a real-world trial of how four to five genotype variants validated with clinical measures can function in EHRs with decision support. When one of the targeted drugs is prescribed, decision support recommends increasing, decreasing, or avoiding the drug altogether, informed by genetic information. Second, the program is sequencing significantly more genes than are currently used clinically, which will provide a much richer data resource for investigating not-yet-recognized pharmacogenomic interactions.
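The prescribing-time check described above can be pictured as a small rule lookup. The following is a minimal sketch, not Mayo's implementation: the gene names, variant names, and rule table are placeholders invented for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Set


class Action(Enum):
    INCREASE = "increase dose"
    DECREASE = "decrease dose"
    AVOID = "avoid drug"


@dataclass(frozen=True)
class Rule:
    drug_class: str
    gene: str
    variant: str
    action: Action


# Hypothetical rule table. GENE_A, variant_1, etc. are placeholders,
# not real pharmacogenomic associations or clinical guidance.
RULES = [
    Rule("statin", "GENE_A", "variant_1", Action.DECREASE),
    Rule("anticoagulant", "GENE_B", "variant_2", Action.AVOID),
    Rule("anticoagulant", "GENE_C", "variant_3", Action.INCREASE),
]


def recommend(drug_class: str, genotype: Dict[str, Set[str]]) -> List[Action]:
    """Return the actions triggered when a drug in `drug_class` is prescribed.

    `genotype` maps a gene name to the set of variants the patient carries.
    """
    return [
        rule.action
        for rule in RULES
        if rule.drug_class == drug_class
        and rule.variant in genotype.get(rule.gene, set())
    ]


# A patient who carries one rule-matching variant and one unrelated variant.
patient = {"GENE_A": {"variant_1"}, "GENE_C": {"variant_9"}}

print([a.name for a in recommend("statin", patient)])         # ['DECREASE']
print([a.name for a in recommend("anticoagulant", patient)])  # []
```

Keeping the rules in a table separate from the lookup logic reflects the point that the set of validated variants is small today but expected to grow; in this sketch, adding a rule does not require touching the prescribing workflow.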
Several experts point out the many parallels between genomic data and the evolution of digital radiology. Radiology had to develop standard formatting, along with medical education, for reporting and storing imaging results in the digital age. Imaging tools have advanced to produce more “image slices” than an individual physician can comprehend, requiring visualization tools to aid in clinical diagnosis. Systematizing and storing the data is still evolving as EHRs are meaningfully used and connected in a systemwide fashion. Chute predicts that similar data repositories connected to EHRs will emerge for genomic data, with only a subset of that data becoming part of the patient record. Consortiums, rather than individual provider systems, are working to decide what is clinically actionable, a rapidly evolving task.
Starren predicts that prototype systems, many based on those being developed in clinical research settings, will emerge over the next two to five years. Widespread adoption, however, will likely come several years after the implementation of meaningful use 3.0, as few are willing to take on more than the required IT workload right now.
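One way to picture a repository that feeds only a subset of its data to the patient record is a store that holds every variant but returns just those on the current actionable list. This is a hypothetical sketch: the class name, method names, and identifiers are invented for illustration and do not describe any existing system.

```python
from typing import Dict, FrozenSet, List, Tuple

Variant = Tuple[str, str]  # (gene, variant) pair; placeholder identifiers


class GenomicRepository:
    """Hypothetical ancillary store that keeps every called variant per patient."""

    def __init__(self) -> None:
        self._variants: Dict[str, List[Variant]] = {}

    def store(self, patient_id: str, variants: List[Variant]) -> None:
        self._variants[patient_id] = list(variants)

    def actionable_subset(
        self, patient_id: str, actionable: FrozenSet[Variant]
    ) -> List[Variant]:
        """Return only the stored variants that appear on the actionable list."""
        return [v for v in self._variants.get(patient_id, []) if v in actionable]


repo = GenomicRepository()
repo.store(
    "patient-001",
    [("GENE_A", "variant_1"), ("GENE_B", "variant_2"), ("GENE_C", "variant_3")],
)

# The actionable list is expected to change as evidence accumulates.
actionable_v1 = frozenset({("GENE_A", "variant_1")})
actionable_v2 = actionable_v1 | {("GENE_C", "variant_3")}

print(repo.actionable_subset("patient-001", actionable_v1))
print(repo.actionable_subset("patient-001", actionable_v2))
```

Because the full variant list stays in the repository, the same patient can be re-queried whenever the actionable list is revised, without resequencing and without loading the entire genome into the EHR.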
How these big data solutions will interface with EHRs and practice-management tools on a larger scale remains unclear. But experts agree that for genomic-based personalized medicine to become a full-fledged reality, decision-support systems must converge multiple sources of clinically meaningful data.
El-Kareh adds that one of the biggest challenges of incorporating genomics into medicine is making it accessible, and there are many barriers to doing so. “EHRs are not suited to large data sets and it will require a fair amount of work on the technical side, but also with consideration of ethics and policies,” he says. “If genomic information is only available in one EHR system, what use is that if the sequence isn’t available everywhere? But, building systems to exchange health information have barriers—financial, ethical, and privacy issues. It is very complex.”
“Inertia and culture are not necessarily aligned with the needs of health care and society, but the genomic world has an opportunity to get our act together and agree on some aspects to make data consistent and comparable, while still allowing the research field to maintain flexibility to innovate,” Chute tells DTTR. “Sharing data is a way for society to benefit and improve health quality.”
Sidebar: NIH Seeking Associate Director for Data Science
As a sign of its commitment to embracing big data as a programmatic goal, the National Institutes of Health (NIH) is hiring an associate director for data science who will be the overall organizational leader in the broad areas of bioinformatics, computational biology, biomedical informatics, biostatistics, information science, and quantitative biology. The listed responsibilities include coordinating all NIH data-related science activities, partnering with other agencies, and overseeing the new NIH Big Data to Knowledge (BD2K) initiative.
The position and BD2K both emerged from recommendations made by the Data and Informatics Working Group of the Advisory Committee. The BD2K initiative seeks to improve policies for sharing data and software and for cataloging data in part through the development of standards. Additionally, the NIH will launch the InfrastructurePLUS program to advance high-performance computing, hosting, and storage approaches and to modernize the NIH network.
According to a McKinsey & Co. study, the NIH is not alone in looking for data experts. In the United States there will be a shortage of 140,000 to 190,000 people with analytical expertise in the coming years.
Sidebar: Ingenuity Systems Bought by Qiagen
Recognizing that more rapid analysis and interpretation of genomic data derived from next-generation sequencing will be a differentiating factor, at the end of April, Qiagen (Germany) acquired Ingenuity Systems (Redwood City, Calif.) for $105 million, using existing cash reserves. In 2012 Ingenuity had net sales of $12 million. The cornerstone of Ingenuity’s Web-based offerings is its Knowledge Base, a 14-year effort to curate, model, and computationally structure vast amounts of biomedical literature, including genomic variations. These data are leveraged to more quickly analyze and interpret data in research and clinical diagnostics.
“The interpretation of biological information is becoming a cornerstone of QIAGEN’s ecosystem of Sample & Assay Technologies for molecular testing—both in life sciences research and in diagnostics,” said Peer Schatz, Qiagen’s CEO, in a statement at the time of the acquisition. “We are looking forward to expanding the seamless integration of leading biomedical information solutions into our full range of molecular testing solutions, thereby providing our customers a unique experience from sample to interpreted result and recommendations for next steps.”
Sidebar: Commercial Analysis Companies Abound
Commercial applications for genome analysis are proliferating, and many of the companies behind them have been able to raise venture capital financing. According to a report from Mercom Capital Group, there were 104 health IT-related venture capital funding deals during the first quarter of 2013, compared with 30 during the same quarter in 2012.
Experts do not expect a dominant player or consolidation to occur in this segment any time soon as the companies attempt to differentiate their offerings and justify that their proprietary software is better than freely available options. Below is a sampling of some of the commercial genome analysis companies.
Bina Technologies (Redwood City, Calif.) in March closed a $6.25 million Series B round of financing. The Bina Genomic Analysis Platform is an end-to-end service platform that combines high-performance computing with informatics algorithms enabling more rapid analysis of a whole genome. The company says a whole human genome can be analyzed in about four hours. The server can sit in a customer’s own data center.
DNAnexus (Mountain View, Calif.) offers a solution that combines a cloud computing infrastructure with scalable systems designs in which users can run their own algorithms. The company raised more than $15 million in October 2011 from investors including Google Ventures.
GNS Healthcare (Cambridge, Mass.) applies industrial-scale data analytics to allow stakeholders across the health care industry to solve complex care, treatment, and cost challenges. The core of its approach is the REFS platform (Reverse Engineering and Forward Simulation), a scalable, supercomputer-enabled framework that automates the extraction of causal network models directly from observational data and uses high-throughput simulations to generate new knowledge.
NextBio (Santa Clara, Calif.) uses big data solutions to improve molecular data interpretation for clinical and research applications by integrating and interpreting public and proprietary molecular data with clinical patient information, population studies, and model organisms.
Personalis (Menlo Park, Calif.) offers sequencing services and interpretation for clinicians and pharmaceutical and biotechnology companies. The company was recently awarded a $1.53 million contract with the U.S. Department of Veterans Affairs to look for genetic variants in samples from as many as 1 million military veterans to explore the variants’ roles in disease. The sequencing will be performed by Illumina (San Diego).
RealTime Genomics (San Francisco) offers complete analytical platforms for genomics and metagenomics either installed in users’ computing environment or through cloud computing. In May the company secured $5 million to expand commercial operations. The company’s platform is used at the Stanford Center for Genomics and Personalized Medicine.
Seven Bridges Genomics (Cambridge, Mass.) offers a Web-based platform with both validated and custom-designed pipelines, including ones that draw on curated public data sources. It aims to be accessible to people with no expertise in bioinformatics.