Original research
Lessons learnt from multifaceted diagnostic approaches to the first 150 families in Victoria’s Undiagnosed Diseases Program
  1. Thomas Cloney1,2,
  2. Lyndon Gallacher1,2,
  3. Lynn S Pais3,4,
  4. Natalie B Tan1,2,
  5. Alison Yeung1,2,
  6. Zornitza Stark1,2,
  7. Natasha J Brown1,2,
  8. George McGillivray1,
  9. Martin B Delatycki1,2,
  10. Michelle G de Silva1,
  11. Lilian Downie1,2,
  12. Chloe A Stutterd1,2,
  13. Justine Elliott1,
  14. Alison G Compton2,5,
  15. Alysia Lovgren3,6,7,
  16. Ralph Oertel1,
  17. David Francis1,
  18. Katrina M Bell1,8,
  19. Simon Sadedin1,2,
  20. Sze Chern Lim1,
  21. Guy Helman5,
  22. Cas Simons1,9,
  23. Daniel G Macarthur3,10,11,
  24. David R Thorburn1,2,5,
  25. Anne H O'Donnell-Luria3,4,12,
  26. John Christodoulou1,2,13,
  27. Susan M White1,2,
  28. Tiong Yang Tan1,2
  1. 1Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  2. 2Department of Paediatrics, The University of Melbourne, Melbourne, Victoria, Australia
  3. 3Center for Mendelian Genomics, Eli and Edythe L Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
  4. 4Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
  5. 5Brain and Mitochondrial Research Group, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  6. 6Analytic and Translational Genomics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
  7. 7Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA
  8. 8Bioinformatics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  9. 9Translational Bioinformatics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  10. 10Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  11. 11Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  12. 12Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
  13. 13Neurodevelopmental Genomics Research Group, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  1. Correspondence to Professor Tiong Yang Tan, Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Melbourne, VIC 3052, Australia; tiong.tan{at}vcgs.org.au

Abstract

Background Clinical exome sequencing typically achieves diagnostic yields of 30%–57.5% in individuals with monogenic rare diseases. Undiagnosed diseases programmes implement strategies to improve diagnostic outcomes for these individuals.

Aim We share the lessons learnt from the first 3 years of the Undiagnosed Diseases Program-Victoria, an Australian programme embedded within a clinical genetics service in the state of Victoria with a focus on paediatric rare diseases.

Methods We enrolled families who remained without a diagnosis after clinical genomic (panel, exome or genome) sequencing between 2016 and 2018. We used family-based exome sequencing (family ES), family-based genome sequencing (family GS), RNA sequencing (RNA-seq) and high-resolution chromosomal microarray (CMA) with research-based analysis.

Results In 150 families, we achieved a diagnosis or strong candidate in 64 (42.7%) (37 in known genes with a consistent phenotype, 3 in known genes with a novel phenotype and 24 in novel disease genes). Fifty-four diagnoses or strong candidates were made by family ES, six by family GS with RNA-seq, two by high-resolution CMA and two by data reanalysis.

Conclusion We share our lessons learnt from the programme. Flexible implementation of multiple strategies allowed for scalability and response to the availability of new technologies. Broad implementation of family ES with research-based analysis showed promising yields after a negative clinical singleton ES. RNA-seq offered multiple benefits in family ES-negative populations. International data sharing strategies were critical in facilitating collaborations to establish novel disease–gene associations. Finally, the integrated approach of a multiskilled, multidisciplinary team was fundamental to having diverse perspectives and strategic decision-making.

  • genomics
  • genetics
  • medical
  • paediatrics
  • genetic testing
  • genetic techniques

Data availability statement

Data are available upon reasonable request.

Introduction

Collaborative diagnostic research initiatives are instrumental in the systematic approach to the diagnosis of rare genetic diseases. Pioneered in 2008 with the National Institutes of Health’s (NIH) Undiagnosed Diseases Program (UDP),1 which has since evolved into the NIH’s Undiagnosed Diseases Network (UDN),2 diagnostic programmes have successfully used a range of technologies to investigate rare genetic diseases. The National Human Genome Research Institute launched the Centers for Mendelian Genomics in 2012 to discover the genetic basis underlying Mendelian traits and accelerate discoveries by disseminating the obtained knowledge and effective approaches through international collaborations.3 These programmes are driven by experts in rare disease and have bridged research and clinical spaces to identify novel disease genes, broadened access to novel technologies and improved integration within the health system to develop appropriate referral pathways. Collectively, they serve as a crucial tool to meet the International Rare Diseases Research Consortium’s (IRDiRC) ambitious goals to improve diagnostic rates and reduce delay in the diagnosis of rare disease by 2027.4

Despite heterogeneity in approaches between programmes, the most widely used genomic technology is exome sequencing (ES), which has allowed for diagnostic rates of 25%–57.5% across a range of heterogeneous rare disease cohorts,5–8 and has accelerated novel disease gene discovery.9 Singleton ES—that is, sequencing of the affected proband only—is commonly used in the clinical setting due to its lower cost compared with family-based ES—that is, sequencing of the affected proband with additional family members (typically both biological parents and the proband as a trio). While clinical ES only examines genes already associated with the disease, research-based ES analyses in undiagnosed diseases programmes (UDPs) take a broader approach to identify pathogenic variant(s) in genes not previously known to be associated with human disease. Together with the use of global data sharing platforms10 and pathways to laboratory functional studies and animal models, these are key components of UDPs that connect researchers and facilitate novel gene discovery.11 There is significant heterogeneity between programmes in their eligibility criteria and diagnostic approaches (table 1). Some centres recruit individuals who have never had next-generation sequencing (NGS) investigations (panel, ES or genome sequencing (GS)) but have had other extensive investigations.12–14 Other programmes recruit individuals who may have already had NGS and outline approaches after a negative result, and some take a mixed approach by also accepting sequencing-naïve individuals.15–17 Phenotypic data collection also varies and may be obtained through a proband’s standard clinical course of care or require specific study visits to complete a detailed protocol.16 17

Table 1

Global diagnostic programmes for rare genetic diseases

At our centre, while clinical singleton ES achieves a diagnostic rate of 52%–57.5% across a range of paediatric phenotypes,7 8 this still leaves many affected individuals who undergo NGS without a diagnosis. These patients are offered streamlined recruitment to the Murdoch Children’s Research Institute’s (MCRI) Undiagnosed Diseases Program-Victoria, Australia (UDP-Vic), initially funded by philanthropic donation and established in collaboration with the Broad Center for Mendelian Genomics (Broad CMG) at the Broad Institute of MIT and Harvard, USA. The UDP-Vic recruits patients after a negative clinical NGS (panel, ES or GS) result and incorporates multifaceted case-specific analytic strategies including family ES, GS, RNA sequencing (RNA-seq) and high-resolution chromosomal microarray (CMA). The UDP-Vic team is multidisciplinary, with analysis of each case led by the recruiting clinical geneticist and integrated with the proband’s standard clinical care. Here we share our experiences and lessons learnt over the first 3 years of the programme.

Methods

Study design and population

We prospectively recruited individuals undiagnosed after a negative NGS (panel, ES or GS) result from a single centre, Victorian Clinical Genetics Services (VCGS) in Melbourne, Australia, from March 2016 to June 2018. The recruiting clinical geneticist proposed each individual at a clinical review panel, at which time the following criteria were assessed: (1) their phenotype was likely to be monogenic; (2) appropriate testing had been undertaken, including standard-resolution CMA (CytoSNP300K, Core Exome or Global Screening Array (GSA); Illumina, San Diego, USA) and singleton ES; (3) phenotypically relevant genomic lesions not tractable by ES had been excluded, for example, FMR1 triplet repeat analysis or methylation studies for imprinting disorders; and (4) additional family members for sequencing were available if appropriate. Written informed consent was obtained from each family.

Sequencing

The programme applied family ES as the primary investigation to all families, with the deployment of adjunct tests in a case-specific manner. Reanalysis of previously generated ES data was performed iteratively by analysts and the recruiting clinical geneticists. Family-based GS with accompanying fibroblast or muscle RNA-seq of the affected proband was performed in a subset of families who had already undergone non-diagnostic family ES. Individuals were selected for family-based GS and RNA-seq if there was a high ongoing suspicion of a monogenic condition, the family provided consent and appropriate samples were able to be obtained during the study period. Aberrant gene expression or mRNA splicing events identified from RNA-seq were correlated with nearby rare GS variants that may be causative of the change in gene expression or splicing.
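The correlation step described above, pairing an aberrant RNA-seq event with nearby rare GS variants, can be illustrated with a minimal sketch. It is not the programme's pipeline: the class names, the 500 bp window and the 0.001 allele frequency cut-off are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplicingEvent:
    gene: str       # gene showing aberrant splicing or expression
    chrom: str
    position: int   # genomic coordinate of the aberrant junction


@dataclass(frozen=True)
class Variant:
    chrom: str
    position: int
    allele_frequency: float  # population allele frequency


def nearby_rare_variants(event, variants, window=500, max_af=0.001):
    """Return rare variants on the same chromosome within `window` bp of the event.

    `window` and `max_af` are illustrative assumptions, not values from the study.
    """
    return [
        v for v in variants
        if v.chrom == event.chrom
        and abs(v.position - event.position) <= window
        and v.allele_frequency <= max_af
    ]


# Toy data: only the first variant is both close to the event and rare.
event = SplicingEvent(gene="GENE_A", chrom="chr1", position=1000)
variants = [
    Variant("chr1", 1200, 0.0001),  # near and rare: kept
    Variant("chr1", 5000, 0.0001),  # rare but too far away
    Variant("chr1", 1100, 0.2),     # near but common
    Variant("chr2", 1000, 0.0001),  # different chromosome
]
candidates = nearby_rare_variants(event, variants)
```

In practice the shortlist would still need manual curation, since proximity and rarity alone do not establish that a variant causes the splicing change.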

All research-based sequencing (ES, GS and RNA-seq) and data processing were performed by the Genomics Platform at the Broad Institute of MIT and Harvard. Additional bioinformatic analysis was performed at MCRI for RNA-seq analysis and mitochondrial genome variant calling. Further details are included in the online supplemental methods. The ES, GS and ES-based CNV call-set data were uploaded to an open-source web interface (seqr) for collaborative analysis between the Broad CMG and local investigators. Following analysis, each case was discussed in a multidisciplinary team teleconference comprising individuals from VCGS and Broad CMG, including bioinformaticians, genomic analysts, clinical geneticists, genetic counsellors and other disciplines such as cytogeneticists where this expertise was required.

We performed high-resolution CMA using the Illumina Omni 5M SNP platform (Illumina) in cases where there was a clinical suspicion of an intragenic CNV not detected by standard-resolution CMA and ES. This array is validated to detect heterozygous copy number changes of 20 consecutive probes, giving an effective resolution of 1–10 kb. The high-resolution Omni arrays allowed for the detection of smaller structural variants and insertions/deletions (INDELs) not detected by ES or the standard-resolution CMA required to enter the programme.18 Further details are included in the online supplemental methods.

Data collection

We collected data relating to phenotype, demographics, clinical investigations and outcomes from February to July 2019, with a further update in January 2021. Data were extracted manually from electronic medical records and genetic files and stored in PhenoTips19 and REDCap (Research Electronic Data Capture)20—electronic data capture tools hosted at the Murdoch Children’s Research Institute.

A molecular diagnosis, or strong candidate for primarily novel gene discoveries, was considered the endpoint of an individual’s diagnostic trajectory. To reach a molecular diagnosis in an established disease gene, the variant was required to reach the criteria for the American College of Medical Genetics and Genomics’ classification of ‘pathogenic’ or ‘likely pathogenic’.21 For novel disease gene variants and newly characterised (novel) but unpublished phenotypes of known disease genes, a ‘strong candidate diagnosis’ was proposed. We required a strong candidate diagnosis to meet the following three criteria: (1) phenotypically similar unrelated individuals (matched through data sharing platforms) with a variant in the same gene and population allele frequencies compatible with disease penetrance and inheritance pattern; (2) in vitro or in vivo functional validation, either planned or underway via local or external research collaborators; and (3) multidisciplinary agreement that variants in the proposed gene(s) were likely causative for the phenotype(s) and recommended for functional confirmation.

We also report families with a gene of uncertain significance (GUS) if a potential novel disease gene or known disease gene with a novel phenotype has been proposed, but with insufficient or conflicting evidence regarding a gene–disease association. Similar to strong candidates, a GUS required multidisciplinary agreement that variants in the proposed gene(s) are a plausible cause for the phenotype and warrant functional confirmation. However, additional evidence of phenotypically similar unrelated individuals with a variant in the same gene was lacking. We considered all probands with a variant in a GUS as undiagnosed.
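The endpoint definitions above amount to a small decision rule, sketched here for illustration. The field names and the `classify` function are hypothetical, and the sketch simplifies the text in one respect: any finding with multidisciplinary agreement that falls short of the three strong-candidate criteria is treated as a GUS.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    established_gene_phenotype: bool     # known disease gene with a consistent phenotype
    acmg_class: str                      # e.g. "pathogenic", "likely pathogenic", "uncertain"
    matched_unrelated_individuals: bool  # criterion 1: matches via data sharing
    functional_validation: bool          # criterion 2: planned or underway
    mdt_agreement: bool                  # criterion 3: multidisciplinary agreement


def classify(f: Finding) -> str:
    """Map a finding to the endpoint categories described in the text (simplified)."""
    if f.established_gene_phenotype:
        if f.acmg_class in {"pathogenic", "likely pathogenic"}:
            return "molecular diagnosis"
        return "undiagnosed"
    # Novel disease gene, or known gene with a novel phenotype
    if f.mdt_agreement and f.matched_unrelated_individuals and f.functional_validation:
        return "strong candidate diagnosis"
    if f.mdt_agreement:
        return "GUS (undiagnosed)"
    return "undiagnosed"


known = Finding(True, "likely pathogenic", False, False, True)
novel_matched = Finding(False, "uncertain", True, True, True)
novel_unmatched = Finding(False, "uncertain", False, True, True)
```

Here `classify(known)` gives a molecular diagnosis, `classify(novel_matched)` a strong candidate diagnosis, and `classify(novel_unmatched)` a GUS, which is counted as undiagnosed.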

Results

Diagnoses

From 2016 to 2018, 150 families with a suspected rare genetic disease were recruited to the UDP-Vic. Of these, 144 (96.0%) probands had a negative ES result prior to enrolment, of which 17 had a negative family ES result (1 duo, 15 trios, 1 quad). Of the remaining six probands, four were enrolled after an appropriate clinical gene panel showed no pathogenic variant(s), one proband was enrolled after a negative clinical GS result, and one proband was enrolled after reportedly negative research GS, but on further enquiry prior research sequencing had not actually occurred. In total, 144 of 150 (96.0%) probands were paediatric (less than 18 years of age) at the time of enrolment and 115 (76.7%) had neurodevelopmental phenotypes (table 2).

Table 2

Characteristics of the UDP-Vic

Overall, 64 of 150 (42.7%) families achieved a molecular diagnosis or strong candidate diagnosis (figure 1). Of these, 37 were in known disease genes with a phenotype consistent with others with pathogenic variant(s) in that gene. For the remaining 27 with a diagnosis or strong candidate diagnosis, 3 were in known genes with novel phenotypes and 24 were in novel disease genes. If we considered only the 144 families that had undergone ES prior to recruitment, rather than a panel (n=4), GS (n=1) or no previous sequencing (n=1), 62 (43.1%) achieved a molecular diagnosis or strong candidate diagnosis.

Figure 1

Diagnoses or strong candidate diagnoses by intake year and analytic approach. The figure shows the diagnoses or strong candidate diagnoses by intake year and diagnostic test. The total number of diagnoses or strong candidate diagnoses is also included by intake year. CMA, chromosomal microarray; ES, exome sequencing; GS, genome sequencing; RNA-seq, RNA sequencing.

As part of the UDP-Vic workflow, 15 disease gene manuscripts have been published (online supplemental table 1). Of the 24 novel gene discoveries, 9 have been published.22–30 A further two of our novel disease gene strong candidates matched with multiple research groups via Matchmaker Exchange and these manuscripts have been published.31 32 At the time of matching, these external groups’ manuscripts had sufficiently progressed such that the UDP-Vic was not part of their publication or discoveries. However, we annotate these discoveries as novel (see online supplemental table 2) as they were unpublished at the time of discovery within the UDP-Vic. The remaining 13 candidate novel genes are yet unpublished but are considered strong candidates with further functional studies underway. Of the three novel phenotype discoveries in known disease genes, two have been published,33 34 with one strong candidate awaiting publication. Finally, of the 37 diagnoses in known genes with known phenotypes, 4 have been published as part of descriptive cohorts.35–38

Of the remaining 86 families without a diagnosis or strong candidate diagnosis, 3 have variants in GUS (GTF3C1, FZD8, PSMD6). These variants have insufficient evidence to be considered strong candidates as, to our knowledge, there are no matches with phenotypically similar unrelated individuals; these families are therefore considered undiagnosed.

Sequencing of other affected and unaffected family members (family ES)

Family-based ES made the majority of diagnoses in the cohort (54 of 64 diagnoses or strong candidate diagnoses; figure 2). In total, 142 families underwent family ES (128 trios, 11 quads and 3 quints) with a diagnostic rate of 38.0%. These 54 families achieved a molecular diagnosis or strong candidate diagnosis despite 52 having a negative ES result prior to recruitment. Of these 52, 22 were novel discoveries (19 novel disease genes and 3 known disease genes with novel phenotypes). The remaining 30 of 52 diagnoses were achieved by a variety of methods: new literature confirming novel gene–disease associations after the initial negative clinical singleton ES or family ES (6 families); small structural variants detected via CNV calling as part of the family ES pipeline (5 families, including 1 with a new gene–disease association); improvements in coverage and an updated bioinformatic pipeline on our ES platform compared with the initial ES prior to entry to the UDP-Vic (3 families: 1 with a mosaic variant, 1 with a variant in a repetitive genomic region and 1 with a variant in the mitochondrial genome); broader research-based analysis of family ES where genes that were initially masked on clinical analysis were reviewed (1 family); and application of family ES over singleton ES, which allowed improved curation of causative variants (15 families). Such was the case for the diagnosis of an inframe deletion in USP9X in a female patient in middle childhood with syndromic intellectual disability (FAM16; online supplemental table 2).38 The variant was present in the singleton ES data but not selected for curation as it was not recognised at the time that inframe deletions in USP9X might be a causative variant for this phenotype. The addition of parental data with family ES led to the recognition of this variant as de novo, which prompted further analysis and established the diagnosis.
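The way parental data can flag a variant as de novo, as in the USP9X example, can be illustrated with a minimal trio filter. This is a sketch under simplifying assumptions rather than the analysis used in the programme: genotypes are encoded as alternate-allele counts, the variant records and field names are hypothetical, and no account is taken of read depth, mosaicism or genotype quality.

```python
def is_candidate_de_novo(proband_gt, mother_gt, father_gt):
    """Flag a variant carried by the proband and absent from both parents.

    Genotypes are alternate-allele counts (0, 1 or 2); None means no call.
    """
    if None in (proband_gt, mother_gt, father_gt):
        return False  # cannot assess inheritance without all three genotypes
    return proband_gt >= 1 and mother_gt == 0 and father_gt == 0


# Hypothetical trio calls for three variants
trio_calls = [
    {"id": "var1", "proband": 1, "mother": 0, "father": 0},     # candidate de novo
    {"id": "var2", "proband": 1, "mother": 1, "father": 0},     # maternally inherited
    {"id": "var3", "proband": 1, "mother": None, "father": 0},  # mother not called
]

de_novo_ids = [
    v["id"] for v in trio_calls
    if is_candidate_de_novo(v["proband"], v["mother"], v["father"])
]
```

Singleton data alone cannot separate `var1` from `var2`, which is why adding parental genotypes can promote a previously overlooked variant for curation.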

Figure 2

Pathway to diagnoses or strong candidate diagnoses after entry into the UDP-Vic. The figure shows the investigations that reached a diagnosis or strong candidate diagnosis in the cohort of 150 families enrolled in the UDP-Vic. Note that six families who had undergone family ES prior to enrolment in the UDP-Vic underwent family GS and RNA-seq directly, rather than repeated family ES within the UDP-Vic. Research-based sequencing was performed at the Broad CMG. *High-resolution CMA was used before, alongside or after investigations at the Broad CMG; therefore, this testing option is not placed in connection with other testing options. However, the molecular diagnoses via high-resolution CMA in two families both occurred after a negative family ES result, prior to the implementation of CNV calling within the family ES pipeline. CMA, chromosomal microarray; ES, exome sequencing; GS, genome sequencing; RNA-seq, RNA sequencing; UDP-Vic, Undiagnosed Diseases Program-Victoria.

Family-based GS and RNA-seq

Of the 64 diagnoses or strong candidate diagnoses, 6 were made by family GS and proband RNA-seq (figure 2). In total, 24 families underwent family GS (21 trios, 3 quads) and proband RNA-seq, leading to a diagnostic rate of 25.0% (6 of 24). In two families, the superior coverage of GS compared with ES led to the identification of a pathogenic variant. In a child with hypertelorism, congenital glaucoma, anterior segment dysgenesis and tetralogy of Fallot, a pathogenic variant in FOXC1 was identified after negative singleton ES and family ES in two centres (FAM11; online supplemental table 2). The causative FOXC1 variant was not found on review of the ES data, likely because the high guanine-cytosine content of the region makes it challenging to detect with ES. This was the only family in our cohort with an affected parent, which modified our variant search strategy with family sequencing data. In a child with severe global developmental delay, cortical visual impairment and infantile spasms, a de novo missense variant in the mitochondrial gene MT-ND1 was identified by updated bioinformatic analysis using family GS data (FAM149; online supplemental table 2).

RNA-seq was instrumental in resolving synonymous variants or deep intronic variants that led to aberrant splicing in the remaining four diagnoses or strong candidate diagnoses. First, in a child with severe global developmental delay, microcephaly, diaphragmatic hernia and talipes equinovarus born to non-consanguineous parents, RNA-seq on fibroblasts identified aberrant splicing and reduced expression of TRAPPC4 (FAM27; online supplemental table 2).37 A homozygous intronic variant resulted in either exon 3 skipping or a 40 nucleotide extension of exon 3; both aberrant transcripts were predicted to undergo nonsense-mediated decay. Second, in a child with optic atrophy, ophthalmoplegia, global developmental delay, episodic ataxia and symmetrical lesions in the brainstem and thalami on brain MRI and normal respiratory chain enzyme analysis on muscle, a missense variant and a synonymous splice site variant in NDUFV1 were detected in a compound heterozygous state by family GS (FAM42; online supplemental table 2). RNA-seq on fibroblasts demonstrated aberrant splicing with skipping of exon 8, leading to a frameshift. This, combined with fibroblast respiratory chain enzyme analysis, confirmed the diagnosis of mitochondrial complex 1 deficiency (MIM: #618225). Third, in siblings with global developmental delay, microcephaly, sensorineural hearing impairment, radial ray defect and abnormal signal intensities on brain MRI born of fourth-degree consanguineous parents, RNA-seq detected an aberrant splicing event implicating a homozygous deep intronic variant in NDUFB10 (FAM4; online supplemental table 2), leading to inclusion of a cryptic exon and frameshift resulting in nonsense-mediated decay.27 Finally, in siblings with microcephaly, severe intellectual disability and ataxia born to healthy unaffected parents, compound heterozygous variants (c.6625C>T; p.(Arg2209Ter) and c.2610+5G>A) were identified in a strong candidate novel disease gene (TPR, FAM29). Both variants were initially identified on family ES and family GS; however, RNA-seq was able to show that the c.2610+5G>A variant disrupted splicing and caused shortening of exon 20, suggesting that the phenotype might be related to loss-of-function (Van Bergen et al39).

High-resolution CMA

Of the 64 diagnoses or strong candidate diagnoses, 2 molecular diagnoses were made by high-resolution CMA, neither of which was detected on standard CMA (figure 2). High-resolution CMA was performed in 38 families, yielding a diagnostic rate of 5.3% (2 of 38). First, a deletion in exon 10 of MSL3 (FAM18; online supplemental table 2) identified on systematic review of X chromosome copy number losses in a male patient, in conjunction with global data sharing, led to collaboration and identification of a novel disease–gene association.24 Second, in a male patient in late adolescence with clinical features suggestive of Rubinstein-Taybi syndrome (MIM #180849), we identified a de novo chromosome 16p13.3 intragenic deletion of exon 2 of CREBBP. Both diagnoses had been missed by ES; however, in both cases, the family ES was conducted prior to the implementation of our CNV calling pipeline. Retrospective application of CNV calling confirmed that the deletions were visible in the family ES data of both families.

ES reanalysis

Finally, of the 64 diagnoses or strong candidate diagnoses, 2 were made only by reanalysis of previously generated singleton ES data (figure 2). Singleton ES reanalysis was undertaken in an ad hoc manner in our cohort, usually by their recruiting clinical geneticist, and only performed in 15 probands, leading to a diagnostic rate of 13.3%. The primary ES data were reanalysed 1–17 months after the initial negative ES result. First, one diagnostic variant in SLC2A1 (FAM78; online supplemental table 2) was identified after updating the phenotypic data with the primary clinician. Second, in a female toddler with global developmental delay, microcephaly, spasticity and abnormality of the cerebral white matter, a strong candidate variant in PYCR3 was identified after an analysis of clinical singleton ES data was expanded to investigate potential novel disease genes using a research-based approach (FAM81; online supplemental table 2). The variant was established to be de novo by Sanger sequencing.
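The per-test diagnostic rates reported in this section are simple ratios of diagnoses to families tested, and can be recomputed as a consistency check. The counts below are those stated in the text; the dictionary layout and function name are illustrative only.

```python
# (diagnoses or strong candidates, families or probands tested), as reported in the text
counts = {
    "family ES": (54, 142),
    "family GS with RNA-seq": (6, 24),
    "high-resolution CMA": (2, 38),
    "singleton ES reanalysis": (2, 15),
}

TOTAL_FAMILIES = 150  # families enrolled in the programme


def yield_percent(diagnosed, tested):
    """Diagnostic rate as a percentage, rounded to one decimal place."""
    return round(100 * diagnosed / tested, 1)


yields = {test: yield_percent(d, n) for test, (d, n) in counts.items()}

# Each diagnosis is attributed to one test, so the numerators sum to the overall total.
total_diagnoses = sum(d for d, _ in counts.values())
overall_yield = yield_percent(total_diagnoses, TOTAL_FAMILIES)

for test, pct in yields.items():
    print(f"{test}: {pct}%")
print(f"overall: {total_diagnoses} of {TOTAL_FAMILIES} ({overall_yield}%)")
```

The denominators overlap because many families had more than one test, so the per-test rates are not additive; only the numerators sum to the 64 families with a diagnosis or strong candidate diagnosis.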

Diagnostic odyssey

We sought to understand the diagnostic trajectories of families recruited to the UDP-Vic in order to inform our counselling and management of expectations. The median time from the negative NGS result before UDP-Vic recruitment to the establishment of a diagnosis or strong candidate diagnosis was 1.42 years (IQR 1.03–2.48 years). The median time from first genetics consultation to diagnosis or strong candidate diagnosis was 5.14 years (IQR 2.71–7.31 years). As families recruited early in the research programme have been studied for longer periods, each intake year has reached a higher diagnostic yield than the following (55.0% for 2016, 52.8% for 2017, 24.6% for 2018) (figure 1).

Data sharing

Data sharing, primarily through Matchmaker Exchange,10 was undertaken for variants in 72 (48.0%) families identified by research-based sequencing, of which 51 (70.8%) families reached at least one match. Of all 26 novel diagnoses and strong candidate diagnoses (variants in novel disease genes and known genes with novel phenotypes), 19 (73.1%) potentially causative variants were matched and pathogenicity confirmed in unrelated individuals using Matchmaker Exchange. Matches through international data sharing networks have led to 10 publications to date.22–24 26 28–30 33 34 38 The remaining have functional studies underway or are the subject of manuscripts in preparation.

Discussion

We report here on the first 3 years, 150 families enrolled and 64 (42.7%) diagnoses or strong candidate diagnoses in our UDP, considering both our successes and potential areas for improvement. By analysing each of our testing strategies, our organisational and personnel structure, and the timelines involved in testing, we share the lessons we have learnt from the programme.

Lesson 1: Flexible implementation of testing strategies allowed for scalability and response to the availability of new technologies

The stepwise approach using multiple testing methodologies developed iteratively as the programme grew in scope and provided multiple benefits. First, it allowed for improvements in the scalability of the process. The UDP-Vic requires considerable resources as clinicians, analysts and other researchers are extensively involved in all stages from recruitment to diagnosis. By careful selection of a subset of the cohort to access more resource-intensive testing, such as family GS and RNA-seq, we were able to scale the programme quickly without overwhelming our systems. Second, it allowed the programme to respond as different technologies emerged or became achievable in our setting. This shift was clearly observed when high-resolution CMAs became available at our centre. We successfully demonstrated their utility in detecting smaller CNVs missed by ES in two probands in our first intake cohort, such that by the third intake year high-resolution CMA analysis was frequently applied within our centre earlier in the diagnostic work-up before family ES. This iterative and adaptive approach was also found to be successful in the study of the Duke/Columbia site of the UDN.16 Looking towards the future, as family ES becomes more widely available as a first-line clinical test in suspected rare genetic diseases,40 we anticipate a shift of the UDP-Vic to perform family GS with RNA-seq after negative family ES and copy number analysis.

Lesson 2: Family ES with research-based analysis provided the highest diagnostic yield

We found family ES to achieve a high diagnostic yield (38.0%) in our primarily singleton ES-negative population. While this study was not designed to directly compare rates between testing methodologies, family ES was effective at prioritising de novo variants and thus allowed for diagnostic improvements that justified its use instead of solely employing ES reanalysis or moving directly to family GS. Such was the case in the diagnosis of a de novo variant in USP9X (FAM16; online supplemental table 2), where the use of family ES provided additional diagnostic evidence for a variant that is an inframe deletion in an X chromosome gene in a female patient. Family ES was especially helpful in identifying novel disease gene candidates that could be considered for submission to data sharing services. ES reanalysis also proved useful, with our diagnostic rate of 13.3% similar to the median rate of 15% reported by others and reviewed previously.41 As seven diagnoses made by family ES were aided by new information published following an initial negative clinical exome, reanalysis of singleton ES data after 18 months could be an alternative strategy to family ES in cost-constrained settings.

While consensus exists that ES is a first-line diagnostic tool for certain phenotypes,40 the choice of either singleton or family-based sequencing strategy is likely to be context-dependent.42 At our centre clinical singleton ES achieves a reasonably high diagnostic rate (52%–57.5%)7 8 and family ES is less frequently deployed as first-line due to cost considerations. However, with the lessons learnt in the UDP-Vic and incremental gains in diagnostic yield from family ES,43 our practice continues to evolve. Where funding allows for family ES as a diagnostic test, data from this study and others would support its use, especially for undifferentiated and complex phenotypes.

After a negative family ES result, GS was most useful in detecting causative variants when coupled with RNA-seq. This combined approach allowed the correlation of aberrant splicing events with deep intronic variants not detectable by family ES. Only two of the six diagnoses made by family GS did not require RNA-seq. We were surprised to not identify more individuals with pathogenic CNVs on GS data, but this may change as we increasingly use this strategy. The incremental diagnostic potential of GS over ES44 comes at a cost, both financially and in data processing resources.45 As GS reference libraries and pipelines continue to improve, family GS with RNA-seq may become the research test of choice, particularly as family ES is increasingly used in clinical settings.

Lesson 3: RNA-seq offers utility in family ES-negative populations

RNA-seq played two important roles in advancing our path towards diagnosis in some individuals. First, similar to the University of California-Los Angeles clinical site of the UDN,15 it was useful in investigating variants of uncertain significance from ES or GS by assisting in the interpretation and validation of abnormal splicing due to intronic or splice site variants. Second, RNA-seq was able to act as an a priori diagnostic tool when no candidate had previously been proposed by ES or GS, as was the case for the deep intronic variant in NDUFB10 (FAM4; online supplemental table 2). A limitation of RNA-seq is that it often requires sampling of biologically relevant tissues; in our cohort, acquiring fibroblast samples from probands introduced delays and required invasive biopsies. Given our predominantly neurodevelopmental cohort, we may not have been interrogating the most appropriate tissue (nervous tissue), and the diagnostic yield of RNA-seq might increase if this tissue were accessible. Regardless, we highlight the benefits of using RNA-seq in family ES-negative populations within a UDP.

Lesson 4: Analysis informed by deep phenotyping and knowledge of disease mechanisms underpinned many of our novel disease gene discoveries

Many of our novel discoveries were clinically driven and supported by expertise in disease mechanisms. Such was the case in our diagnosis of biallelic variants in ADARB1 (FAM8; online supplemental table 2). The proband’s phenotype of microcephaly, severe global developmental delay and seizures had sufficient overlap with other RNA editing syndromes,46 leading to the recognition of ADARB1, which has a role in RNA editing, as a clear candidate. Data sharing and international collaboration led to a novel disease gene discovery.28 Similarly, in another proband whose phenotypic features were strongly suggestive of a ciliopathy, the homozygous variant in SMO, a gene implicated in SHH-GLI signalling, was considered a novel candidate to pursue even though mosaic monoallelic variants in SMO had previously been associated with the Curry-Jones syndrome (MIM #601707). This again led to international collaboration and the discovery of a novel phenotype associated with biallelic variants in SMO (FAM7; online supplemental table 2).33

Lesson 5: Data sharing was critical in facilitating rapid collaborations to establish novel disease–gene associations

Research-based sequencing with the use of data sharing tools and functional work was able to accelerate novel discoveries. In particular, we found that international collaboration via Matchmaker Exchange contributed to 19 diagnoses or strong candidate diagnoses, similar to other heterogeneous populations with suspected Mendelian disease.47 For the three GUS and other potential variants in undiagnosed families, data sharing strategies will be a critical tool in order to reach a diagnosis. Integrating these translational tools into a UDP workflow is critical to the acceleration of novel discoveries, an essential component in the diagnosis of rare diseases.

Lesson 6: Allowing for the passage of time, and the rapid change in rare disease literature, led to additional diagnoses in families previously without a diagnosis

The diagnosis of rare diseases is a rapidly moving field, with ongoing novel disease gene discoveries, disease gene phenotype expansions, improvements in referral pathways and systems, and implementation of novel technologies. Seven of our diagnoses in known genes were made in genes associated with human disease after the previous negative exome and before resequencing within the UDP-Vic. The cutting-edge nature of rare disease research is such that novel discoveries are continually being made.48 Two of our diagnoses with novel disease genes were concurrently being published by external groups at the time of matching via data sharing networks, and these manuscripts had sufficiently progressed such that the UDP-Vic was not part of their publication or discoveries. While we did not contribute to these novel manuscripts, we mention these diagnoses to highlight the timely nature of novel discoveries and the rate of change within the field.

The time that passed from initial clinical exome to research-based analysis allowed for this new literature to be published, and additionally allowed new clinical information to manifest, potential candidates to be analysed over multiple multidisciplinary meetings, matching on data sharing platforms, exchange of information between multiple international research groups and functional studies to be performed. Our diagnostic rate for probands in the first intake year continued to increase in year 3 and will likely continue to increase.47 While our time to diagnosis falls short of the IRDiRC’s goal of diagnosis within 1 year for individuals whose disorder is present in the medical literature,4 programmes such as the UDP-Vic pave the way for clinical translation of multifaceted analytic strategies in rare disease diagnostics. Improvements in diagnostic timelines may now lie at two bottlenecks: recognition of rare disease by general clinicians and thus early referral to genetics centres, a timeline which has not been measured in this study, and efficient improvement of diagnostic pathways within such programmes.

Lesson 7: Multidisciplinary expertise was fundamental to having diverse perspectives and strategic decision-making

Recruitment, data analysis and testing strategy decision-making occurred within a multidisciplinary team. Such a process allowed for a wide range of specialist perspectives and different approaches tailored to the undiagnosed case. Inclusion of the treating clinician in variant curation increases the diagnostic yield owing to a deep knowledge of the phenotype, as well as centralised analysis with access to phenotypic data via medical records.49 50 Expanding this to subspecialist analysis by bioinformaticians and cytogeneticists enriched our diagnostic capacities. An example of this was FAM18, who remained undiagnosed after singleton and family ES. Analysis by high-resolution CMA and manual analysis by a cytogeneticist on our team led to the recognition of a novel X chromosome CNV in MSL3, a novel gene at the time. Our multidisciplinary international team met monthly, and all processes were embedded within our clinical service, facilitating communication and rapid translation of findings. This also led to upskilling of all team members, including trainees, in the different strategies required for successful outcomes.

Limitations

We have reported our experiences with our UDP and acknowledge our approach involves a relatively small number of primarily paediatric patients with syndromic neurodevelopmental phenotypes. While the aim was to select patients most likely to benefit from such a programme, diagnostic yields may not be replicable at other services with phenotypically different patients. Our cohort was already extensively investigated, with the majority (96%) having already received a negative singleton ES result, a point of difference compared with many UDPs. Our genetic investigations occurred over an extended time course and were resource-intensive, a process that requires substantial investment for any clinical service. Additionally, comparisons of diagnostic yields per genetic testing strategy are difficult given different subsets of the cohort had undergone each test. In its current format, scalability is a major hurdle to overcome as our clinical and research teams are small and few steps in our process are automated.

Ongoing diagnostic odyssey

There is little consensus on the optimal strategies for the remaining 86 families of our cohort who remain without a diagnosis.51 For some, continued ES reanalysis with research-based strategies and data sharing networks may reach a molecular diagnosis.52 For others, many potential strategies may be on the horizon. While RNA-seq is particularly suited to gene expression analysis, other ‘omics’ analytic modalities could complement genomic testing where gene expression is not affected. Proteomics, metabolomics and epigenomics could all play adjunctive roles in improving the likelihood of elucidating the diagnostic cause.53 54 Alternatively, long-read genome sequencing (LRS) promises to address the shortcomings of short-read sequencing technologies.55

Conclusion

UDPs provide a concentration of expertise—structural, technological and workforce—in improving outcomes in the complex task of diagnosing rare diseases. Our experience in the first 3 years of the UDP-Vic suggests that (1) stepwise approaches are useful in the flexibility and scalability of a UDP, allowing for the incorporation of new testing modalities over time; (2) family ES is highly effective in achieving a diagnosis following a negative singleton ES result; (3) RNA-seq in conjunction with GS offers significant benefit after a negative family ES result, but the incremental diagnostic gain of family GS alone requires further investigation; (4) inbuilt adjunctive research-based methods such as the use of international data sharing strategies and confirmatory functional studies are critical to novel diagnoses; (5) extended timelines may be necessary for some novel diagnoses; and finally (6) diverse multidisciplinary perspectives were fundamental to strategic decision-making about diagnostic approaches to each case. For those who remain undiagnosed even after the above methodologies, future technology such as LRS and the promise of ‘omics’ technologies provide significant hope. Further research is necessary on the impact of a diagnosis on medical management from such programmes, the psychosocial impacts such programmes have on their participants, and economic analysis of our approach of implementing a UDP fully integrated with an outpatient genetics service compared with necessitating travel to dedicated centres such as those of the UDN for completion of a detailed research protocol.16 17 Our findings provide additional perspectives to implementing a programme in the diagnosis of rare genetic diseases, emphasise the high diagnostic rate after negative singleton ES achievable in a UDP, and highlight the value of a multidisciplinary team using different diagnostic technologies while fully interdigitated with the clinical service.

Data availability statement

Data are available upon reasonable request.

Ethics statements

Patient consent for publication

Ethics approval

Ethics approval was granted by the Royal Children’s Hospital Ethics Committee (HREC 36291A).

Acknowledgments

The Illumina Omni 5M SNP platform was donated at no cost by Illumina Australia. This work was completed as fulfilment of TC’s University of Melbourne MD Research Project. Additionally, GATK-gCNV calls on ES data were kindly generated by Isaac Wong, Jack Fu, Harrison Brand and Michael Talkowski.

References

Footnotes

  • Twitter @LynnPais, @thorburn_mito

  • Contributors Data collection: TC, LG, NBT, AY, ZS, NJB, GM, MD, MGdS, LD, CSt, JE, SMW, TYT. Data analysis: TC, LG, LSP, NBT, AY, ZS, NJB, GM, MD, LD, CSt, AGC, AL, RO, DF, KMB, SS, SCL, GH, CSi, DM, DRT, AHO’D-L, JC, SMW, TYT. Research oversight, supervision and direction: DM, DRT, AHO’D-L, JC, SMW, TYT. Manuscript writing: TC, LG, SW, TYT. Data and manuscript review: all authors. TYT accepts responsibility as guarantor for the overall content of this manuscript.

  • Funding Funding for sequencing and analysis was provided by the National Human Genome Research Institute, the National Eye Institute, and the National Heart, Lung, and Blood Institute Center for Mendelian Genomics (grant UM1 HG008900) and by the National Human Genome Research Institute (grant R01 HG009141). We acknowledge financial support from the Murdoch Children’s Research Institute and the Harbig Foundation. Research conducted at the Murdoch Children’s Research Institute was supported by the Victorian Government’s Operational Infrastructure Support Program.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.