Background Multiple monogenetic conditions with partially overlapping phenotypes can present with inflammatory bowel disease (IBD)-like intestinal inflammation. With novel genotype-specific therapies emerging, establishing a molecular diagnosis is becoming increasingly important.
Design We have introduced targeted next-generation sequencing (NGS) technology as a prospective screening tool in children with very early onset IBD (VEOIBD). We evaluated the coverage of 40 VEOIBD genes in two separate cohorts undergoing targeted gene panel sequencing (TGPS) (n=25) and whole exome sequencing (WES) (n=20).
Results TGPS revealed causative mutations in four genes (IL10RA, EPCAM, TTC37 and SKIV2L), discovered unexpected phenotypes and directly influenced clinical decision making by supporting as well as avoiding haematopoietic stem cell transplantation. TGPS resulted in significantly higher median coverage when compared with WES, fewer coverage deficiencies and improved variant detection across established VEOIBD genes.
Conclusions Excluding or confirming known VEOIBD genotypes should be considered early in the disease course in all cases of therapy-refractory VEOIBD, as it can have a direct impact on patient management. Combining both described NGS technologies would compensate for the limitations of WES for disease-specific application while offering the opportunity for novel gene discovery in the research setting.
- Genetic screening/counselling
- Inflammatory bowel disease
- Molecular genetics
Inflammatory bowel disease (IBD) is characterised by chronic inflammation of the gut, which results in diarrhoea, bloody stools, abdominal pain, growth failure and weight loss.1 Very early onset IBD (VEOIBD) refers to children with disease onset before the age of 6 years.2 The presentation of IBD in early childhood is uncommon: epidemiological data in this age group estimate a worldwide incidence of 4.37/100 000 children/year and a worldwide prevalence of 14/100 000 children.3 A diverse group of monogenic conditions can present as VEOIBD.4 Early molecular differentiation has become increasingly important to determine the correct treatment pathway such as early haematopoietic stem cell transplantation (HSCT) in patients with IL10 signalling defects,5 XIAP6 or FOXP3 deficiency.7 Considering the orphan nature3 and phenotype heterogeneity of VEOIBD, detecting a causative gene is challenging and can be both time and resource consuming. As a consequence, next-generation sequencing (NGS) technologies such as whole exome sequencing (WES) have been increasingly applied in patients with VEOIBD. Exome-wide screening has proven invaluable as it revealed novel genotypes8 and atypical phenotypes of known monogenic conditions.9 However, despite all efforts to ensure good overall exome coverage at reasonable costs, sequencing deficiencies persist in some genomic areas. An alternative NGS technology to WES is targeted gene panel sequencing (TGPS). It aims to enhance the capture of a specific selection of genes, thus providing consistent and reliable sequence coverage.
Deploying NGS technologies in patients with VEOIBD, in addition to deep phenotyping, has not yet become routine practice. We hypothesised that establishing a genotype in a timely manner in the disease course can have a significant impact on patient management.
In this study, we used TGPS as the first-line molecular diagnostic tool in children with VEOIBD and discuss the established genetic diagnoses and their clinical implications. Furthermore, we evaluated the sequencing accuracy from TGPS and WES for 40 VEOIBD genes and highlight specific technology-related weaknesses, which should be taken into consideration when applying either technology in the diagnostic setting.
Material and methods
Over a 12-month period, 25 consecutive children with VEOIBD were prospectively recruited for TGPS. Three patients with previously established genetic diagnoses (patients 5–7, table 1) were all confirmed by TGPS. All patients had extensive disease (rate of pancolitis or panenteritis: 100%) and were diagnosed within the first 36 months of life (median of 7 months (1, 19)). Eighty per cent required long-term treatment with two or more immunosuppressant agents (table 1).
Targeted gene panel design
Protein coding sequences of 40 genes were selected to design the targeted gene panel (table 2). For 36 genes a comprehensive consensus coding sequence (CCDS)11 was available. An alternative target (marked as * in table 2) was designed for the remaining genes based on all reported protein coding exons of gene transcripts published in Ensembl (European Bioinformatics Institute and Wellcome Trust Sanger Institute, Cambridge, UK).
For PIK3R1 and SLC37A4, the most comprehensive Ensembl gene transcripts ENST00000396611 and ENST00000357590 were selected. For DCLRE1C and NCF4, a target containing multiple transcripts was designed (DCLRE1C: CCDS31149 and first exon of ENST00000378289; NCF4: CCDS13935 and last exon of ENST00000248899 and first exon of ENST00000415063). Exon 1 of the IKBKG gene has not been captured in TGPS. This reflects the difference of one exon in two CCDS sequences, reported for IKBKG (CCDS14757 and CCDS48196).
Exons are illustrated in numerical order following their strand orientation (+/−).
Twenty samples were exome sequenced using a commercial service (Beckman Coulter Genomics), using SureSelect Human All Exon Kit V.4 (Agilent Technologies). Samples were processed according to the SureSelect Target Enrichment System for Illumina (see provider's protocol) and run on an Illumina HiSeq2000 sequencer. All datasets fulfilled the company's standardised quality criteria. The overall mean coverage across the entire captured exome for all WES samples was 93× with 93% of bps covered ≥10×. Twenty-five TGPS samples were sequenced in-house using the SureSelect XT Custom Capture protocol (Agilent Technologies) on an Illumina MiSeq. Targeted capture baits for 40 VEOIBD genes were designed using the eArray tool (Agilent Biosystems, USA). Baits were designed for 162 kb of sequence, with the use of 5× tiling and with a boosted number of baits for hard-to-capture, guanine–cytosine (GC)-rich regions.
NGS analysis pipeline
The samples were aligned to the human reference genome GRCh37/Hg19 with Burrows–Wheeler Aligner software.12 The alignments were refined (base quality score recalibration, insertion–deletion realignment, duplicate removal) using the Genome Analysis Tool Kit (GATK) suite according to best practices suggested by the Broad Institute. Finally, variants were called using Unified Genotyper (GATK).13
We interrogated the coverage independently of the two applied NGS captures by defining all protein coding target intervals for each gene (most up-to-date and comprehensive CCDS or alternative, *table 2). This approach also highlights deficiencies in bait coverage for these specific regions and results in lower mean coverage of the target genes when compared with the overall mean coverage for WES and TGPS. We applied a 10-read cut-off to account for the increased risk of unreliable heterozygous variant calling in compound heterozygous or autosomal dominant disease models at low depth. We defined deficient exons as those containing one or more bps covered by <10 reads.
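The deficiency definition above can be illustrated with a short sketch. This is not the pipeline used in the study: it assumes that per-base read depths for each protein coding exon have already been extracted (for example from a depth-of-coverage tool), and the function name, exon labels and depth values are hypothetical.

```python
from statistics import median

MIN_DEPTH = 10  # the 10-read cut-off described in the text


def summarise_coverage(exons, min_depth=MIN_DEPTH):
    """Summarise coverage for {exon_name: [per-base depth, ...]}.

    Assumes every exon has at least one base. An exon is 'deficient'
    if one or more bases are covered by fewer than `min_depth` reads,
    and 'absent' if no base is covered at all (a subset of deficient).
    """
    deficient = [name for name, d in exons.items() if min(d) < min_depth]
    absent = [name for name, d in exons.items() if max(d) == 0]
    all_depths = [x for d in exons.values() for x in d]
    return {
        "exons": len(exons),
        "deficient": deficient,
        "absent": absent,
        "median_depth": median(all_depths),
    }


# Hypothetical per-base depths, for illustration only.
example_exons = {
    "GENE_A:exon1": [0, 0, 0, 0],          # no coverage at all
    "GENE_A:exon2": [12, 15, 9, 30],       # one base below the cut-off
    "GENE_B:exon1": [250, 260, 270, 280],  # fully covered
}

summary = summarise_coverage(example_exons)
print(summary)
```

Counting an exon as deficient on a single under-covered base is deliberately strict: one missed heterozygous call is enough to overlook a compound heterozygous or dominant genotype.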
DNA for Sanger sequencing was extracted using the Chemagic-STAR (Hamilton, USA) and PCR performed with MegaMix (Microzone, UK) (custom primer sequences available on request; Sigma–Aldrich, UK). PCR products were purified using AmpureXP (Beckman Coulter, UK) and sequenced with BigDye Terminator V.3.1 Cycle Sequencing Kit (Applied Biosystems, USA) followed by CleanSeq (Beckman Coulter, UK). Sequences were established on the ABI3730XL (Applied Biosystems, USA). Traces were aligned to the reference sequence and variants called using Mutation Surveyor (SoftGenetics, USA).
Statistical analysis was performed with IBM SPSS Statistics for Windows V.22 (Armonk, New York, USA). Categorical variables are expressed as proportions or frequencies and continuous variables as medians and IQRs. Mann–Whitney U test was used to compare medians. For comparison of proportions, Pearson χ2 or Fisher's Exact Test was used where appropriate. All tests were two-tailed and the significance level was set at p=0.05.
Patients were informed and consented for NGS as part of the ‘Patients with Early-onseT Intestinal inflammaTion (PETIT) Study’. The study had ethical approval from the National Research Ethics Service Committee London, Bloomsbury.
Causative mutations established by TGPS
Using TGPS in a prospective setting, likely causative mutations were detected in four out of 22 patients (table 3, Sanger sequencing traces: online supplement 1). TGPS of patient 1 revealed a compound heterozygous mutation in the EPCAM gene. Both mutations (missense: p.Cys135Phe and deletion/frame shift: p.Thr234Lysfs*2) were not previously reported and were predicted damaging. Immunohistochemistry revealed the absence of EPCAM in intestinal epithelial cells, and electron microscopy confirmed the diagnosis of tufting enteropathy (data not shown). Patient 2 harboured a novel and predicted damaging homozygous mutation in the IL10RA gene (p.Cys223Arg). Sequencing of patient 3 revealed compound heterozygosity for two mutations in the TTC37 gene. The first variant, a predicted damaging heterozygous missense mutation, has not been previously reported (p.Gly673Asp). The second variant, resulting in the introduction of a premature stop codon (p.Trp936*), has been published in the homozygous state in a patient with tricho-hepato-enteric syndrome (THES).17 Patient 4 harboured a homozygous splice-site mutation in SKIV2L, a gene also associated with THES. In silico splice-site mutation prediction with Alamut (V.2.3) predicted likely skipping of exon 5 (c.355-2A>C).
Comparison of bp coverage: TGPS versus WES
The median coverage within the protein coding regions of the 40 VEOIBD genes was significantly higher in TGPS samples when compared with WES (271.9 vs 65.6, p<0.001—figure 1). WES resulted in a higher rate of deficient exons (WES: 13.8% (1594/11 520) vs TGPS: 1.3% (182/14 400), p<0.001) as well as a higher rate of exons with complete absence of coverage (WES: 1.9% (215/11 520) vs TGPS: 0.4% (52/14 400), p<0.001).
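The reported exon counts can be checked with a few lines of arithmetic. The sketch below recomputes the percentages and applies a Pearson χ2 test without continuity correction to each 2×2 table. That choice is an assumption: the text states that Pearson χ2 or Fisher's exact test was used where appropriate, without specifying which was applied to this comparison. The helper name is hypothetical and the analysis in the study was performed in SPSS, not with this code.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p). For 1 degree of freedom the upper tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Exon totals as reported: 11 520 for WES (20 samples), 14 400 for TGPS (25 samples).
wes_total, tgps_total = 11520, 14400
exons_per_sample = wes_total // 20  # same panel in both cohorts

# Deficient exons (one or more bases covered by <10 reads).
wes_def, tgps_def = 1594, 182
# Exons with complete absence of coverage.
wes_abs, tgps_abs = 215, 52

pct = {
    "wes_deficient": round(100 * wes_def / wes_total, 1),
    "tgps_deficient": round(100 * tgps_def / tgps_total, 1),
    "wes_absent": round(100 * wes_abs / wes_total, 1),
    "tgps_absent": round(100 * tgps_abs / tgps_total, 1),
}

chi2_def, p_def = pearson_chi2_2x2(wes_def, wes_total - wes_def,
                                   tgps_def, tgps_total - tgps_def)
chi2_abs, p_abs = pearson_chi2_2x2(wes_abs, wes_total - wes_abs,
                                   tgps_abs, tgps_total - tgps_abs)

print(pct)
print(p_def < 0.001, p_abs < 0.001)
```

Note that this treats every exon in every sample as an independent observation, which is what the reported denominators imply; exons from the same sample or the same gene are unlikely to be fully independent, so the test should be read as descriptive.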
To evaluate how this trend is reflected at the level of individual genes, we analysed the coverage data of all VEOIBD exons (see online supplement 2) and highlighted the results of six established VEOIBD genes (FOXP3, IL10, IL10RA, IL10RB, XIAP and IKBKG) in figure 2. For all six genes, the rate of coverage-deficient exons was significantly higher in WES when compared with TGPS.
Consequence of coverage instability on variant detection
As a surrogate for bp-position specific reliability and sensitivity, we compared common SNPs in three samples analysed by both TGPS and WES. Eighty-nine SNPs were present in the protein coding target regions of the six resulting datasets. All polymorphisms were present in the TGPS samples, with the corresponding WES samples failing to confirm six SNPs (in the genes ADA, CYBA, HPS1 and NCF1) (table 4).
Children with VEOIBD rarely present with additional disease-specific features such as early onset diabetes mellitus as observed in immunodysregulation, polyendocrinopathy, enteropathy, X-linked (IPEX) syndrome18 or severe perianal disease in IL10 signalling defects.19,20 Selective genetic analysis has become common practice in the presence of such features.
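A paired comparison of this kind amounts to a set difference between the calls made on the same sample by each technology. The sketch below shows the idea under stated assumptions: variants are represented as (chromosome, position, reference allele, alternate allele) tuples, the function name is hypothetical, and the example calls are invented rather than taken from table 4.

```python
def compare_calls(reference_calls, test_calls):
    """Compare two call sets from the same sample.

    Returns the variants present in `reference_calls` but missing from
    `test_calls`, and the fraction of reference variants confirmed.
    """
    missed = sorted(reference_calls - test_calls)
    confirmed = len(reference_calls & test_calls) / len(reference_calls)
    return missed, confirmed


# Invented calls for one sample sequenced with both technologies.
tgps_calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
}
wes_calls = {
    ("chr1", 100, "A", "G"),
    ("chr2", 75, "G", "A"),
}

missed, confirmed = compare_calls(tgps_calls, wes_calls)
print(missed, confirmed)

# Fraction of SNPs confirmed by WES, using the counts stated in the text
# (89 SNPs in total, six not confirmed).
reported_fraction = (89 - 6) / 89
print(round(reported_fraction, 3))
```

Using the TGPS calls as the reference set mirrors the comparison described in the text; it measures how many TGPS-detected SNPs WES confirms, not the absolute sensitivity of either method.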
Over 40 identified VEOIBD genes and partly overlapping disease phenotypes4 render gene-by-gene sequencing futile and, hence, require a wider screening approach in order to establish a genetic diagnosis. WES studies have revealed novel8 and unexpected genotypes9 and highlighted new pathways involved in the pathophysiology of VEOIBD.
The overall proportion of monogenic disease within VEOIBD is not known. The initial hypothesis that the majority of VEOIBD disease is indeed inherited in monogenic fashion could not be confirmed and more cautious estimations would expect causative mutations in about 20% of all children with VEOIBD. Given that VEOIBD is a very heterogeneous condition with an often poorly defined phenotype, one might expect a lower molecular diagnostic rate compared with the recently reported 25% in other Mendelian conditions.21 WES has the advantage over a targeted screening approach of interrogating the exome for novel VEOIBD genes. However, interpreting variants of unknown significance exome-wide is a much greater challenge than interpreting them within known disease-associated genes. The functional implication of a potentially causative variant in a novel gene has to be established in lengthy follow-on studies. Genetic screening beyond the group of established VEOIBD genes for diagnostic purposes is therefore unlikely to have an immediate impact on patient management.
TGPS in VEOIBD
TGPS has already been successfully introduced in the genetic characterisation of conditions such as primary immune deficiencies.22 Our results suggest that TGPS is also an accurate genetic screening tool for children with VEOIBD. TGPS led to consistent coverage and variant detection across VEOIBD genes.
We could show that screening patients for established monogenic VEOIBD diseases reveals rare phenotypical variations: congenital tufting enteropathy (CTE) has been described as an epithelial disorder without evidence of intestinal inflammation.23 As reported previously,24 our results confirm that the presence of chronic inflammatory cells within the lamina propria does not exclude the diagnosis of CTE. Similarly, causative mutations in SKIV2L and TTC37, genes associated with THES, might lack key phenotypical features previously associated with the syndrome,17,25 suggesting that THES might be one of many possible phenotypes on the spectrum of variable gene penetrance involving the SKIV2L/TTC37 genes.
With the advent of new therapeutic strategies such as HSCT for therapy-refractory VEOIBD, establishing a molecular diagnosis early and accurately to select potentially transplantable monogenic conditions, such as IL10 signalling defects,5 XIAP6 or FOXP3 deficiency,7 is crucial. It is equally important to exclude patients who are unlikely to benefit from such therapy. In children with severe disease who have exhausted all conventional treatment options, HSCT has been considered despite the absence of a genetic diagnosis.26 Even in these cases, ruling out established monogenic VEOIBD conditions such as XIAP deficiency helps the clinician to decide on specific aspects of HSCT conditioning.6
Despite the advances of in silico prediction and protein modelling, it remains a pivotal aim to support genetic results with functional studies whenever possible. Depending on the interrogated condition, there are readily available assays as discussed in our patient with EPCAM deficiency. In others, such as THES, the physician relies on comprehensive phenotyping to be able to recognise or suspect the syndrome. In THES, the exact pathophysiological mechanisms have yet to be established and reliable functional assays are therefore not available.
Considerations when deploying NGS in VEOIBD
The non-specific clinical profile of the majority of children favours genetic screening by NGS. In these cases, sequential Sanger sequencing of potential candidate genes has been shown to require far more time and expense than NGS analysis.9
WES is designed for optimal overall exome coverage,27 so specific considerations have to be made when it is applied to investigate VEOIBD. Our study highlights that WES might have some limitations in the diagnostic setting due to coverage deficiencies in several VEOIBD genes. This is best exemplified by the coverage of IKBKG (NEMO deficiency) and NCF1 (chronic granulomatous disease): IKBKG and NCF1 baits have been frequently omitted in commercially available WES captures to avoid non-specific alignment of reads to their pseudogene loci, leading to extensive coverage dropout.28,29 As a consequence, WES failed to detect all five SNPs in NCF1. Another reported phenomenon potentially leading to false negative results is the poor coverage of first exons30 in WES data, which we also observed for VEOIBD genes.
Comparing the costs of both NGS technologies applied in this study is difficult: both approaches have been carried out with different standards (outsourcing of WES to a commercial provider vs implementation of TGPS in an accredited diagnostic laboratory). In the context of a diagnostic service, many other factors need to be considered including higher labour costs for WES data analysis, data storage over several years and interpretation and report writing. Additionally, TGPS avoids incidental findings and their associated ethical dilemmas.31
Our data suggest that early comprehensive genetic screening can have a significant impact on patient management. Excluding or confirming known VEOIBD genotypes should, therefore, be considered early in the disease course in all cases of therapy-refractory VEOIBD.
Novel genetic platforms will facilitate the combination of both NGS technologies, which would compensate for the limitations of WES for disease-specific application while offering the opportunity for novel gene discovery in the research setting.
Significance of this study
What is already known on this subject?
A diverse group of monogenic conditions can present as very early onset inflammatory bowel disease (VEOIBD).
In the research setting, whole exome sequencing (WES) has facilitated the discovery of new VEOIBD genes and novel phenotypes of known conditions.
Over 40 VEOIBD genes have been identified so far and are not routinely sequenced in patients with early onset and therapy-refractory disease.
What are the new findings?
Targeted gene panel screening (TGPS) revealed rare and unpredicted phenotypical variations.
TGPS is a reliable genetic screening tool, leading to consistent coverage and variant detection across VEOIBD genes.
In the diagnostic setting, WES performed with some limitations resulting in coverage deficiencies in several VEOIBD genes.
How might it impact on clinical practice in the foreseeable future?
Comprehensive genetic screening in the diagnostic setting will reveal unexpected phenotypes, expand disease characterisation and open up new avenues for disease-specific treatments.
With the advent of novel therapies for VEOIBD, such as allogeneic haematopoietic stem cell transplantation, confirming or excluding known VEOIBD genotypes reliably and in a timely manner will be an essential requirement.
We thank the IBD and the North East Thames Regional Genetics Team at the Great Ormond Street Hospital for Children for their support in patient recruitment and sample management/preparation. Special thanks to Dr Daniel Kelberman for proofreading the manuscript. This study was performed in partnership with GOSgene based at the UCL Institute of Child Health, and is supported by the National Institute for Health Research Biomedical Research Centre (NIHR BRC). This report is independent research by the NIHR BRC Funding Scheme. The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health.
Contributors JK and CB were responsible for analysing the WES data, SD performed and analysed the TGPS data. CTJ did the bioinformatics on the TGPS and WES raw data. RD performed statistical analysis. HHU and NS advised on the design of the study and the TGPS platform. JK drafted the first version of the manuscript with subsequent critical appraisal from all listed authors.
Competing interests JK is supported by a Great Ormond Street Charity Grant (GOSHCC: V1204). JK, BC, CTJ, LO and PB are supported by the National Institute for Health Research (NIHR) Biomedical Research Centre (BRC) at Great Ormond Street Hospital for Children NHS Foundation Trust (GOSH) and University College London (UCL).
Ethics approval NRES Committee London, Bloomsbury.
Provenance and peer review Not commissioned; externally peer reviewed.