Review
Case for genome sequencing in infants and children with rare, undiagnosed or genetic diseases
David Bick (1), Marilyn Jones (2), Stacie L Taylor (3), Ryan J Taft (3), John Belmont (3)

1. HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, USA
2. Rady Children's Hospital San Diego, San Diego, California, USA
3. Illumina Inc, San Diego, California, USA

Correspondence to Dr David Bick, HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA; dbick{at}hudsonalpha.org

## Abstract

Up to 350 million people worldwide suffer from a rare disease, and while the individual diseases are rare, in aggregate they represent a substantial challenge to global health systems. The majority of rare disorders are genetic in origin, with children under the age of five disproportionately affected. As these conditions are difficult to identify clinically, genetic and genomic testing have become the backbone of diagnostic testing in this population. In the last 10 years, next-generation sequencing technologies have enabled testing of multiple disease genes simultaneously, ranging from targeted gene panels to exome sequencing (ES) and genome sequencing (GS). GS is quickly becoming a practical first-tier test, as cost decreases and performance improves. A growing number of studies demonstrate that GS can detect an unparalleled range of pathogenic abnormalities in a single laboratory workflow. GS has the potential to deliver unbiased, rapid and accurate molecular diagnoses to patients across diverse clinical indications and complex presentations. In this paper, we discuss clinical indications for testing and historical testing paradigms. The case for GS as a diagnostic tool is supported by superior genomic coverage, the types of pathogenic variants detected, a simpler laboratory workflow enabling shorter turnaround times, diagnostic and reanalysis yield, and impact on healthcare.

• clinical genome sequencing
• rare and undiagnosed
• genetic testing
• neonates
• pediatric

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

## Introduction

In 2017, the Global Genes Project1 estimated that 350 million people worldwide suffer from a rare disease. These diseases are individually rare, but in aggregate they affect 4%–8% of the population.2 3 Our curation of >3800 rare diseases listed by Orphanet indicates that ~80% are genetic or have genetic subtypes (online supplementary file 2). Children comprise approximately half of those affected by a genetic disease and 30% of these children do not live past their fifth birthday.

Medical and surgical management of birth defects and rare genetic diseases are disproportionately large contributors to paediatric hospitalisations and present an enormous challenge to patients and families as well as healthcare systems.4 Children with chronic complex conditions (defined by length of illness) had 11-fold greater hospitalisation charges compared with others in a retrospective study of US hospitalisations for children.5 In a study from Western Australia, patients with rare disease diagnoses constituted 2% of the population but accounted for 10.6% of total hospital charges.6 We recently conducted an analysis of US paediatric hospitalisation charges using data from 2012 and found that mean total costs were up to US$77 000 higher in neonates and US$17 000 in older children with rare disease–linked diagnostic codes, and that these had longer hospital stays and increased mortality.7

Because specific genetic diseases can be difficult to recognise based on clinical features alone, the use of genetic testing in the paediatric population is critical for diagnosis and treatment. In this article, we review the indications for genetic and genomic testing and then detail the testing technologies that are used. With this background, we examine the literature supporting GS as a first-tier test in children, focusing on diagnostic yield, time-to-diagnosis, patient care, health outcomes and health economic impact.

## Clinical indications for genetic and genomic testing

Identification of patients appropriate for genetic testing has evolved substantially in the last 25 years, coincident with the ability to easily and cost-effectively test for an increasing number of disorders. The American College of Medical Genetics and Genomics (ACMG) has published indications for cases in which the use of genomic sequencing approaches should be considered as well as a policy statement discussing the clinical use of aetiological diagnosis via genetic and genomic testing.8 Genetic and genomic sequencing approaches should be considered in the clinical diagnostic assessment in several scenarios including those in which a patient presents with a likely genetic disorder but a single genetic diagnosis or specific targeted testing is not obvious. Expanding on this idea, we suggest some additional general features of genetic disorders that non-specialists may use to help recognise when genomic testing may be appropriate (box 1).

Box 1.

### Indications for genome sequencing

• The phenotype or family history data strongly implicate a genetic aetiology, but the phenotype does not correspond with a specific disorder for which a genetic test targeting a specific gene is available on a clinical basis.

• A clinical diagnosis of a disorder known to be caused by multiple genes (extensive locus heterogeneity).

• Clinical features which are insufficient by themselves to make a clinical diagnosis, but that are known to be associated with multiple genetic disorders.

• A clinical diagnosis of a known genetic disorder in which single gene or other targeted testing has been negative.

• The patient has an atypical clinical course for the disease under consideration (eg, unexpected severity, duration, failure of response to therapy, idiosyncratic drug reaction).

• Early onset of disorder typically seen in adulthood.

• Rare and specific clinical or laboratory abnormalities (eg, laboratory test results far outside of expected ranges, rare anatomical variants, etc).

• Atypical or complex combinations of clinical abnormalities or additional signs and symptoms not explained by a previous molecular diagnosis.

## Approaches to genetic testing

Prior to the introduction of next-generation sequencing (NGS), several technologies were used to identify the basis of genetic diseases. The oldest, G-banded karyotype analysis, can detect structural and numerical chromosome aberrations as well as mosaicism. However, the diagnostic yield is limited because abnormalities below 5–10 megabases often go undetected.9 In the early 1990s, fluorescence in situ hybridisation (FISH) was developed. FISH detects genetic abnormalities below the threshold of the G-banded karyotype, facilitating the detection of submicroscopic events (eg, deletions adjacent to the telomeres) involved in disease. FISH, however, is constrained to assessment of chromosome regions that can be targeted by FISH probes, and abnormalities below 50–300 kb cannot be detected.10

Karyotype and FISH have largely been replaced by chromosomal microarray (CMA), which allows for simultaneous evaluation of all chromosomes for copy number imbalances and in some instances uniparental isodisomy.11 The use of CMA has revealed that submicroscopic cytogenetic abnormalities are significant contributors to birth defects and neonatal neurological disorders.12 CMA is often offered as a first-tier test in the assessment of children presenting with diseases thought to have a genetic basis, such as multiple congenital abnormalities and developmental delay (DD) with a detection rate between 15% and 20% depending on the clinical indication.13–15 There are differences among CMA platforms that affect the range of abnormalities detected. Arrays based on standard sets of single-nucleotide polymorphisms may have somewhat lower resolution for particular genes compared with custom array comparative genomic hybridisation but are superior for the detection of copy number variant (CNV) mosaicism as well as clinically important copy neutral abnormalities, including uniparental isodisomy and close consanguinity. CMA cannot detect balanced chromosomal rearrangements such as balanced translocation and inversions.16 Dideoxy (Sanger) sequencing, developed in 1977,17 examines short stretches of DNA for single nucleotide variants (SNVs), small insertions and small deletions.

The ACMG affirms that the definitive diagnosis of a genetic disease has clinical use for individuals.8 Given the potential use of a diagnosis, the limitations of these standard techniques are of great concern and present a major obstacle to progress in the management of patients with suspected genetic disease. For example, the respective diagnostic yields for CMA for common disorders such as autism spectrum disorder (ASD) or DD in the paediatric population are, on average, quite low (9.3%–13.1%)18 19 and the diagnostic rates for individuals with other commonly encountered but phenotypically non-specific diseases (eg, non-syndromic birth defects) may be similar or even lower.20

Time-to-diagnosis is an important metric to consider in children with rare genetic diseases, particularly critically ill neonates admitted to neonatal intensive care units and children admitted to other intensive care settings. Genetic disease may present fewer specific clinical features in this population and many neonates either die or are discharged before a diagnosis is obtained. Unfortunately, standard genetic testing strategies often involve a series of tests which can take weeks or sometimes months to complete. Diagnostic evaluation typically involves multiple specialist consultations, laboratory tests, imaging studies and tissue biopsies. The length of the diagnostic odyssey for rare diseases ranges from 5 to 7 years1 and for many is ultimately disappointing when a diagnosis is not achieved.21

Genomic medicine today features information obtained from ES and GS in disease diagnosis and management. Large gene sequencing panels and ES via NGS have altered the way new disease-causing genes are discovered in addition to reducing the time-to-diagnosis.3 22 To date, ES has been used extensively in both clinical and research settings.23–26 Large-scale use of GS is also underway through the Undiagnosed Diseases Network,27 the 100,000 Genomes Project28 and other national programmes. Rapidly falling costs and faster time-to-results afforded by NGS have driven clinical adoption.29 The first use of NGS for rare disease research occurred with the identification of genes responsible for Freeman-Sheldon syndrome, Miller syndrome and Schinzel-Giedion syndrome,30 31 and its clinical use was demonstrated in a patient who received a life-saving bone marrow transplant following diagnosis with NGS.32 Now, more than 180 novel genes involved in rare diseases are added each year to the list of known disease-causing genes,3 30 but this pace of discovery may be reaching a plateau. Going forward, greater emphasis will be placed on the completeness of the genetic diagnostic evaluation (identification of all disease-causing alleles) for rare disorders that are difficult to detect using standard genetic testing techniques or that require a combination of tests. Because GS can close some of this gap, it shows exciting potential for efficiently solving a greater proportion of rare disease cases.

## Genome sequencing for genetic disease diagnosis

GS has the ability to identify a variety of molecular aetiologies for genetic disease. We reviewed the abstracts of >2000 publications focusing on 36 studies (see online supplementary file 1 for the list and selection methodology) that address the most important laboratory and clinical factors that influence efficacy in diagnosis.

### Genome sequencing provides a superior exome

ES enables interrogation of the approximately 1% to 1.5% of the human genome which is protein-coding. ES is widely used in both research and clinical practice, with studies showing improved diagnostic yield compared with historical approaches33 34 in patients with undiagnosed neurodevelopmental disorders,35 children with intellectual disability (ID),36 37 ASD18 and many others. There is increasing evidence, however, that ES cannot capture the complete range of pathogenic variation across the exome. For example, ES only covers approximately 98% of the exome and results from a recent study suggest that the current standard of 120× coverage for ES may be insufficient for consistent breadth of coverage across the exome.34 For genes that the ACMG recommends be evaluated for secondary findings, Meienberg et al showed that GS provided 100% coverage of ACMG genes versus 75% by ES, owing to incomplete coverage of a number of exons in these genes.38 Further, the same study found that for RefSeq genes, 9% of the first exons were not covered by ES while GS provided complete coverage. Additional studies show that GS coverage and variant calling are less affected by GC content,38 that GS has more complete coverage of exons39 40 and that its coverage is more even than that of ES. As a result, GS requires lower average coverage to obtain the same accuracy in variant calling compared with ES.41 Additionally, GS has less dispersion in the distribution of allele coverage, allowing higher accuracy in calling heterozygous positions compared with ES.42
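To make the distinction between average depth and breadth of coverage concrete, the following minimal Python sketch computes both from a list of per-base depths. The depth values and the 20× calling threshold are illustrative assumptions chosen for this example, not figures from the cited studies.

```python
from statistics import mean


def coverage_summary(depths, min_depth=20):
    """Return (mean depth, fraction of bases with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return mean(depths), covered / len(depths)


# Hypothetical per-base depths over the same ten positions.
# Capture-like profile: high average depth, but two poorly covered bases.
uneven = [150, 140, 160, 155, 145, 150, 5, 0, 148, 147]
# Genome-like profile: lower average depth, evenly distributed.
even = [32, 29, 31, 30, 33, 28, 30, 31, 29, 27]

uneven_mean, uneven_breadth = coverage_summary(uneven)
even_mean, even_breadth = coverage_summary(even)

print(f"uneven: mean {uneven_mean}x, breadth {uneven_breadth:.0%}")
print(f"even:   mean {even_mean}x, breadth {even_breadth:.0%}")
```

In this toy example the uneven profile has four times the average depth yet leaves a fifth of the positions below the threshold, which is the sense in which a high nominal coverage figure does not guarantee consistent breadth.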

### Genome sequencing and detection of pathogenic variants

GS examines approximately 90% of the human genome43 offering a more comprehensive analysis than ES40 and a growing body of literature is demonstrating that GS can provide a molecular diagnosis in cases where ES did not (see online supplementary table 2 for examples).44–47 Intergenic and intronic pathogenic variants are growing in number and importance48 spanning from pathogenic SNVs to more complex variations.49 50 The detection of coding CNVs that are smaller than three exons may require GS because pathogenic single-exon CNVs are frequently missed by ES analyses.51 Further, balanced chromosomal rearrangements, importantly those with high recurrence risks (eg, insertion-translocations), are difficult to detect with CMA and ES, whereas they may be detected with GS.52 It should be noted that GS cannot detect all chromosomal abnormalities.53 GS data may be analysed to identify pathogenic repeat expansions54 and GS from libraries constructed using the Hi-C protocol or through other sequencing technologies can be used to produce a phased genome.55 For cases involving a recessive disorder, variants in cis can be distinguished from variants in trans obviating the need to test other family members to establish the phase of the variants. This technique can also detect chromosomal rearrangements.56

### Genome sequencing and mitochondrial disorders

Mitochondrial disorders are a phenotypically and genetically heterogeneous group of diseases making diagnosis particularly challenging. NGS-based tests of the mitochondrial genome can detect common and rare mitochondrial SNVs and deletions.57 NGS-based testing can also detect low levels of heteroplasmic changes which is difficult using conventional tests such as Sanger sequencing.57 58 NGS tests have been developed that evaluate both the mitochondrial genome and a panel of nuclear genes that cause mitochondrial disease in a single test.59 NGS-based mitochondrial genome analysis followed by ES can be effective. In one study, causative pathogenic variants in the mitochondrial genome were found in 20% of the patients while ES of nuclear genes found the aetiology in an additional 49% of the patients.58 GS data include both the mitochondrial genome and all of the data found in ES therefore making this a more efficient method to detect mitochondrial disorders.38 51 57 58 GS at 30–40× coverage, suitable for nuclear gene variant calling, results in 5000–9000× mitochondrial genome coverage and readily lends itself to mitochondrial SNV analysis >5% allele fraction (RJT, personal communication).
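As a rough illustration of why such high mitochondrial depth lends itself to detecting low-level heteroplasmy, the sketch below computes the expected number of variant-supporting reads and the binomial probability of observing at least a chosen number of them. It assumes independent reads and ignores sequencing error; the requirement of 10 supporting reads and the 30× comparison locus are illustrative assumptions, not validated calling criteria.

```python
from math import comb


def expected_alt_reads(depth, allele_fraction):
    """Expected number of reads carrying the variant allele."""
    return depth * allele_fraction


def prob_at_least(k, depth, allele_fraction):
    """P(X >= k) for X ~ Binomial(depth, allele_fraction)."""
    p_below = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_below


MIN_SUPPORTING_READS = 10   # hypothetical threshold for this sketch
ALLELE_FRACTION = 0.05      # the 5% heteroplasmy level mentioned in the text

# Lower bound of the mitochondrial depth range quoted in the text.
high_depth = prob_at_least(MIN_SUPPORTING_READS, 5000, ALLELE_FRACTION)
# Hypothetical locus sequenced at only 30x, for contrast.
low_depth = prob_at_least(MIN_SUPPORTING_READS, 30, ALLELE_FRACTION)

print(expected_alt_reads(5000, ALLELE_FRACTION), high_depth)
print(expected_alt_reads(30, ALLELE_FRACTION), low_depth)
```

Under these simplifying assumptions, a 5% variant is expected in about 250 of 5000 reads, so reaching the supporting-read threshold is almost certain, whereas at 30× only one or two such reads would be expected.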

### Genome sequencing reduces time-to-diagnosis

In the paediatric and neonatal population with rare, undiagnosed or genetic disease, reducing the time-to-diagnosis is important because the progression of disease can be rapid. There are approximately 4000 genes with a phenotype-causing mutation,60 and many present within the first 28 days of life. For neonatal intensive care unit admissions, serial genetic testing may be too slow for optimal clinical management. Additionally, the full clinical phenotype may not manifest in neonates given the early stage at which disease is suspected. Finally, a large degree of clinical and genetic heterogeneity is often noted in acutely ill neonates, which further contributes to the lack of a timely molecular diagnosis for suspected genetic diseases.

Recent technological developments in rapid GS have highlighted its use in children, particularly critically ill neonates. A few studies examining this population have reported time-to-diagnosis ranges of 5 to 8 days61 to as little as 19.5 hours.62 In these studies, the remarkable speed is due in part to advances in bioinformatics processes including the field-programmable gate array based pipeline and automation of the tertiary pipeline.63 In a direct comparison of GS with ES in children with undiagnosed neurodevelopmental disorders, Soden et al35 reported that the time-to-diagnosis using rapid GS was significantly shorter than with ES.

### Genome sequencing provides high diagnostic yield

Numerous studies have demonstrated the impact of NGS-based approaches on improved diagnostic yield. A recent systematic review comparing the diagnostic rates of NGS-based tests (ie, ES and GS) and CMA found that the former had significantly greater diagnostic use compared with the latter.43 Direct comparisons between the diagnostic yields of GS and ES are difficult to make as rates tend to vary based on a variety of factors including patient selection bias, clinical indication and the continual discovery of disease-causing variants (see table 1).24 Importantly, there are studies reporting higher diagnostic yield of GS over other tests (including standard genetic testing, CMA and ES) in children with severe intellectual disability,64 neurodevelopmental disorder,35 developmental delay of unknown aetiology,65 critically ill neonates61 66 and early infantile epileptic encephalopathy.46

Table 1

Select studies illustrating the diagnostic variability of genetic and genomic testing.

### Genome sequencing and reanalysis

NGS has significantly increased the pace of discovery of new disease–gene relationships allowing for increased diagnostic yields when GS and ES are reanalysed at a later time. Reanalysis yields increase with both GS and ES; however, this increase is oftentimes more pronounced in GS cases due in part to the fact that GS offers better coverage of coding exons than ES.67 It is important to acknowledge that significant improvements in exon capture have allowed for increased detection of pathogenic variants with ES. However, the argument can be made that obtaining a comprehensive data set with GS initially would negate the need to repeat ES with those technological improvements.

### Clinical use of genomic testing

In its narrowest sense, clinical use refers to the ability of a screening or diagnostic test to prevent or ameliorate adverse health outcomes such as mortality, morbidity or disability through the adoption of efficacious treatments conditioned on test results.68 Achieving a genetic diagnosis in children is important because it can lead to early and informed disease management and in some cases a life-saving intervention. For example, identification of two compound heterozygous deletions in a premature baby with refractory hypotension and anuria (a condition that is typically lethal) with NGS-targeted gene panels led to a treatment regimen that improved renal function such that only mild residual chronic renal failure symptoms were present later in life.69 Similarly, classification of epileptic seizures at the molecular level via genetic diagnosis can inform which antiepileptic treatment will produce the best results70 and sometimes reverse epileptogenic abnormalities.71 The clinical use of ES has been established,72 as evidenced by patients with epileptic encephalopathy in which physical therapy practices were altered and ineffective feeding behaviours were discontinued following diagnosis via ES.14

A growing number of studies have demonstrated the clinical use of GS. For example, in a small, retrospective study of critically ill infants who received a diagnosis with rapid GS, 65% reported immediate clinical usefulness of the diagnosis, 20% received a diagnosis with strongly favourable effects on disease management and 30% began palliative care.61 73 Another study examined a clinically heterogeneous paediatric cohort in which diagnosis with GS had a significant impact on clinical care beyond genetic testing and included changes in disease management based on published management guidelines, case reports or known function of the involved genes.65 Similarly, in children with neurodevelopmental disorders of unknown aetiology, 49% reported a change in clinical care or impression of pathophysiology following diagnosis with GS.35 While each of the studies was small or moderate in size (n≤119), the data build on the established clinical use of ES. An increasing number of studies focus on the use of GS as a first-tier test rather than a last-resort solution, demonstrating its clinical use.44 65 73 As more GS studies appear, the clinical use of GS versus ES will become clear.

### Health economic impact of genome sequencing

The duration of the diagnostic odyssey is closely related to healthcare costs. For patients with rare and undiagnosed genetic disease, the cost of a standard diagnostic work-up is high, as additional tests, procedures, some requiring general anaesthesia, and specialist consultations are required when prior analyses fail to provide a diagnosis. For example, one study of children with neurodevelopmental disorders in the USA estimated that the cost of tests prior to receiving an NGS-based diagnosis was US$19 100.35 Thus, a comprehensive NGS-based approach is more cost-effective than iterative single-gene testing. A recent microcosting study showed that a genomic sequencing care pathway, where genomic sequencing is performed when genetic disease is initially suspected, can provide an efficient and economical approach to arriving at a diagnosis saving healthcare dollars.74 Incorporation of ES earlier in the diagnostic journey resulted in an incremental cost savings of US$6838 per additional diagnosis compared with the standard diagnostic pathway in children suspected of having a monogenic disorder.75 Likewise, ES achieved more conclusive diagnoses than did the standard care pathway without incurring higher costs in a group of children with complex neurological disorders of suspected genetic origin.76 Cost of care estimates from a recent Undiagnosed Disease Network (UDN) study suggest that the UDN approach (in which 74% of diagnoses were made with ES or GS) has the potential to be cost-effective by avoiding an expensive diagnostic odyssey. For example, prior to acceptance to the UDN, the average cost of care was US$305 428 while the average cost of the UDN evaluation was US$18 903, representing 6% of the total cost.27
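The percentage quoted for the UDN evaluation follows directly from the two reported averages, and the short calculation below reproduces it. Only the two dollar figures come from the text; the variable names are illustrative.

```python
# Average cost of care before acceptance to the UDN (from the text).
prior_cost_usd = 305_428
# Average cost of the UDN evaluation itself (from the text).
udn_eval_cost_usd = 18_903

# UDN evaluation cost as a share of the prior cost of care.
fraction = udn_eval_cost_usd / prior_cost_usd
percent = round(fraction * 100, 1)

print(f"UDN evaluation is about {percent}% of the prior cost of care")
```

The result, about 6.2%, rounds to the 6% reported above.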

The cost of GS is currently higher than ES; however, it is important to keep in mind the advantages of GS (eg, detection of trinucleotide repeat diseases, CNVs, disorders of the mitochondrial genome) and therefore the added value of GS. In a microcosting study of children with ASD, the estimated cost of GS (C$2857) was more than that of CMA (C$744) and ES (C$1655). The study points out that automation of GS allows many samples to be simultaneously processed resulting in reduced labour time compared with ES.77 The authors noted that the higher cost of GS was largely due to greater bioinformatics demand. Technological improvements in bioinformatics automation and interpretation are predicted to bring the cost of GS closer to that of ES.

When comparing the cost of GS and ES, it is important to consider the cost drivers for the different technologies: greater than 90% of the cost of GS is directly related to sequencing; with ES, the cost is mainly due to the DNA capture assay and associated labour.39 Over time, sequencing costs have greatly decreased,39 so performing GS early on in the diagnostic pathway may prove to be a less expensive alternative to performing CMA and later ES in certain disease populations.

### Incidental and secondary findings

There are important ethical implications associated with the clinical application of ES and GS, particularly in children. ES and GS frequently identify incidental or secondary findings—genetic variants of potential importance to the child or family that are unrelated to the diseases for which the testing is performed.78 Reporting incidental findings is controversial and has resulted in sometimes-conflicting policy recommendations.79 Some groups suggest returning pathogenic variants from a list of medically actionable genes with findings currently lacking an available therapeutic intervention left unreported. Others recommend offering pathogenic findings in treatable and untreatable disorders as well as carrier status for recessive diseases.80 This discussion is particularly relevant in the paediatric population as they are not considered legally competent when screened but will gain competence as they grow older.81 The fact that many adults choose not to have genetic testing when offered82 raises important concerns regarding future autonomy and privacy protection; however, an in-depth analysis of these issues is beyond the scope of this review.

The large number of variants that result from ES and GS represents a significant challenge to their use in routine clinical practice. Both commercial and laboratory-developed informatics tools have been developed that filter out all but a few hundred variants for manual review. Still, this can result in a time-consuming task.83 Informatics tools that ingest phenotypic information to generate a candidate gene list are appearing.84 Combining tools that filter variants with one that proposes a gene list should significantly reduce analysis time.
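The combination described above, a variant filter followed by a phenotype-derived gene list, can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the `Variant` fields, the 1% allele-frequency cut-off, the set of consequences treated as potentially damaging and the placeholder gene names are assumptions for illustration, not the behaviour of any cited tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    population_af: float  # population allele frequency
    consequence: str      # eg "missense", "synonymous"


# Consequence types retained by this illustrative filter (an assumption).
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}


def filter_variants(variants, max_af=0.01):
    """Keep rare variants with a potentially damaging consequence."""
    return [
        v for v in variants
        if v.population_af <= max_af and v.consequence in DAMAGING
    ]


def prioritise(variants, candidate_genes):
    """Stable sort placing variants in phenotype-derived candidate genes first."""
    return sorted(variants, key=lambda v: v.gene not in candidate_genes)


# Placeholder data; gene names are not real gene symbols.
variants = [
    Variant("GENE_A", 0.0001, "missense"),
    Variant("GENE_B", 0.2, "missense"),       # too common, filtered out
    Variant("GENE_C", 0.0005, "synonymous"),  # benign consequence, filtered out
    Variant("GENE_D", 0.001, "frameshift"),
]
candidate_genes = {"GENE_D"}  # as if proposed by a phenotype-driven tool

shortlist = prioritise(filter_variants(variants), candidate_genes)
print([v.gene for v in shortlist])
```

Here the filter reduces four variants to two, and the gene list moves the variant in the candidate gene to the top of the list presented for manual review.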

## Conclusions and future directions

In this review, we have investigated the evidence for GS as a first-line tool for the diagnosis of rare and undiagnosed genetic diseases. GS provides high diagnostic rates across a variety of molecular aetiologies and can reduce the length of the diagnostic odyssey—both of which have positive downstream health-economic benefits. Receiving a molecular diagnosis also has profound psychosocial impact on patients as well as their families as they can give a name to the disease and connect the family with other similarly affected patients.85 Finally, receiving a definitive diagnosis enables the use of disease-specific genetic counselling services that can influence both family planning and, in some cases, palliative care.86

The use of GS as a first-tier test rather than a ‘last resort’ would be beneficial to many populations, especially critically ill neonates where a rapid diagnosis is essential. Future research should explore the diseases and presentations in which rapid GS has the most diagnostic effectiveness and is most likely to affect acute disease management. There are far fewer publications using NGS-based diagnostic tools in the adult population. Adult patients seeking diagnosis of a suspected genetic disease present increased diagnostic challenges because additional factors, such as ageing and environmental exposures, require critical consideration.87

Beyond rare Mendelian diseases, GS provides opportunities going forward to identify mosaicism,88 genetic disease modifiers,89 pharmacogenomic variants,90 uniparental disomy,91 polygenic risk scores,92 infectious diseases,93 blood groups,94 HLA genotypes95 and ancestry,96 many of which cannot be determined from ES.

Finally, the diagnostic yield of GS is expected to increase with the development of novel bioinformatics methods and with the growing detection and understanding of disease-causing variants in non-coding regions. In paediatric patients with rare and undiagnosed diseases, clinical implementation of GS as a first-line test has the potential to increase diagnostic yields, reduce the time to diagnosis and positively impact the clinical care pathway.

## Acknowledgments

The authors thank Kirsten Curnow for her editorial review and Kathy Rader for assistance with manuscript submission.

## Footnotes

• Contributors: DB, MJ, RJT and JB were involved in the planning and developing of the main conceptual ideas. DB and ST conducted the literature searches and provided an initial draft with critical input from MJ, RJT and JB. All authors provided feedback and contributed to the final version of the manuscript.

• Competing interests: DB: Envision Genomics—stock; Smith Family Clinic LLC—billing for care; Clinical Services Laboratory LLC—fee for clinical analysis; Genomics England—scientific advisory board. MJ: nothing to disclose. ST: current employee and shareholder of Illumina. RJT: current employee and shareholder of Illumina. JB: current employee and shareholder of Illumina.

• Provenance and peer review: Not commissioned; externally peer reviewed.
