Background Diamond-Blackfan anaemia (DBA) is an inherited bone marrow failure syndrome (IBMFS) characterised by erythroid hypoplasia. It is associated with congenital anomalies and a high risk of developing specific cancers. DBA is caused predominantly by autosomal dominant pathogenic variants in at least 15 genes affecting ribosomal biogenesis and function. Two X-linked recessive genes have been identified.
Objectives We aim to identify the genetic aetiology of DBA.
Methods Of 87 families with DBA enrolled in an institutional review board-approved cohort study (ClinicalTrials.gov Identifier:NCT00027274), 61 had genetic testing information available. Thirty-five families did not have a known genetic cause and thus underwent comprehensive genomic evaluation with whole exome sequencing, deletion and CNV analyses to identify their disease-associated pathogenic variant. Controls for functional studies were healthy mutation-negative individuals enrolled in the same study.
Results Our analyses uncovered heterozygous pathogenic variants in two previously undescribed genes in two families. One family had a non-synonymous variant (p.K77N) in RPL35; the second family had a non-synonymous variant (p.L51S) in RPL18. Both of these variants result in pre-rRNA processing defects. We identified heterozygous pathogenic variants in previously known DBA genes in 16 of 35 families. Seventeen families who underwent genetic analyses are yet to have a genetic cause of disease identified.
Conclusions Overall, heterozygous pathogenic variants in ribosomal genes were identified in 44 of the 61 families (72%). De novo pathogenic variants were observed in 57% of patients with DBA. Ongoing studies of DBA genomics will be important to understand this complex disorder.
- Diamond-Blackfan anemia
- whole exome sequencing
Diamond-Blackfan anaemia (DBA; OMIM 105650) is a rare inherited bone marrow failure syndrome (IBMFS) characterised by erythroid hypoplasia. It typically presents with macrocytic anaemia in infancy, often with normal white blood cell and platelet counts.1 DBA is also associated with a spectrum of congenital anomalies, including flat nasal bridge, high arched or cleft palate, short stature, abnormal thumbs, genitourinary defects, cardiac defects and webbed neck.2 3 Patients with DBA are at risk of certain cancers, namely acute myeloid leukaemia, myelodysplastic syndrome, colon adenocarcinoma and osteosarcoma.4 5 Clinical heterogeneity is often present, even within affected members of the same family.6 7
The majority of patients with DBA have autosomal dominant germline mutations in genes affecting ribosomal biogenesis.8 The most commonly mutated gene is ribosomal protein (RP) S19, seen in approximately 25% of patients.9 10 Pathogenic variants in genes encoding components of both the small (RPS24, RPS17, RPS7, RPS10, RPS26, RPS27, RPS28, RPS29) and large (RPL35A, RPL5, RPL11, RPL26, RPL15, RPL27) ribosomal subunits have been described to be causative of DBA.11–17 Mutations in the GATA1 gene, an X-linked haematopoietic transcription factor, cause DBA in a minority of patients.18 More recently, disease-causing mutations were also discovered in another X-linked gene, TSR2, which encodes a binding partner of RPS26.16
We performed whole-exome sequencing (WES) on 41 patients with DBA from 35 families that lacked a mutation in any of the known DBA genes, and 70 of their unaffected family members. We further used deletion analyses of all RP genes to determine whether any patients with DBA carried a disease-associated deletion.
Affected and unaffected individuals from families with DBA were ascertained through the IRB-approved National Cancer Institute (NCI) IBMFS retrospective/prospective cohort study (www.marrowfailure.cancer.gov, NCI 02 C-0052, ClinicalTrials.gov Identifier: NCT00027274).5 As previously described, this study has been open to accrual since 2001 and participants are enrolled by self, physician or family member referral. Referred families undergo review by the study team to determine eligibility prior to being invited to participate and enrol. All DBA study participants completed detailed family and medical history questionnaires, medical records were reviewed in detail and a subset of eligible families underwent clinical evaluations at the NIH Clinical Center. DBA was diagnosed in individuals with macrocytic pure red cell aplasia, and supported by finding increased erythrocyte adenosine deaminase (eADA) when applicable.19 Patients with DBA enrolled prior to 2014 underwent routine clinical mutation testing for the established DBA genes; beginning in 2014, samples from patients with DBA were evaluated by WES for mutation identification. Controls for the functional studies were healthy mutation-negative individuals from the IBMFS study.
Exome sequencing and analysis
DNA was extracted from whole blood or buccal cells using standard methods. WES for all DBA families was conducted at the NCI’s Cancer Genomics Research (CGR) Laboratory as previously described (see online supplementary method, supplementary tables S1 and S2 for WES, in silico analysis and variant validation).14 20
OmniExpress chip genotyping
High-throughput, genome-wide SNP genotyping, using Infinium HumanOmniExpress BeadChip technology (Illumina, San Diego, California, USA), was performed at the NCI’s CGR laboratory according to the manufacturer’s guidelines using the Infinium HD Assay automated protocol (see online supplementary methods).21
Array comparative genomic hybridisation
The custom designed array comparative genomic hybridisation (aCGH) (Agilent) consists of 52 797 probes, with a spacing of ~150 bp, designed to target 107 genes of interest in DBA (online Supplementary table S3).22 The coverage of the DBA genes extended ~10 kb upstream and downstream of each gene. Genomic DNA from patients and Agilent control DNA (male) were labelled with different fluorochromes, mixed and hybridised to aCGH chips. Labelling, hybridisation and collection of intensity data were performed at the Microarray Center, University of Iowa, Iowa (http://www.biology.uiowa.edu/ccg/). aCGH data were analysed for CNVs using Nexus Copy Number version 7.5 (BioDiscovery), implementing the default settings. Each sample was visually evaluated for CNV.
CNV detection method for SNP array data
Log R ratio (LRR) and B allele frequency (BAF) were used to assess CNV. The LRR value is the normalised measure of total signal intensity and provides data on relative copy number. The BAF, derived from the ratio of allelic probe intensities, is the proportion of hybridised sample that carries the B allele as designated by the Illumina Infinium assay. The LRR and BAF values from qualified assays were re-normalised23 and then analysed using custom software pipelines that involved BAF Segmentation packages (http://baseplugins.thep.lu.se/wiki/se.lu.onk.BAFsegmentation) to detect CNVs, with a minimum of 20 probes per segment to minimise false discovery. All potential events were plotted. False-positive calls were excluded from the analysis based on manual review of each plot.
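As an illustration of the quantities described above, the following is a minimal Python sketch, not the pipeline used in this study. It assumes the commonly used definition of LRR as the base-2 logarithm of observed over expected total intensity; the function names, the `n_probes` field and the example segments are hypothetical.

```python
import math


def log_r_ratio(observed_r: float, expected_r: float) -> float:
    """LRR under the assumed definition log2(observed / expected intensity)."""
    return math.log2(observed_r / expected_r)


def filter_segments(segments: list[dict], min_probes: int = 20) -> list[dict]:
    """Keep only candidate segments supported by at least `min_probes` probes."""
    return [seg for seg in segments if seg["n_probes"] >= min_probes]


# Hypothetical candidate segments, for illustration only.
candidates = [
    {"chrom": "15", "n_probes": 350, "mean_lrr": -0.25},
    {"chrom": "3", "n_probes": 12, "mean_lrr": -0.40},
]
kept = filter_segments(candidates)
```

With an idealised signal, a fully clonal single-copy loss halves the intensity and gives an LRR of −1; a mosaic deletion, present in only a fraction of cells, would be expected to give a value closer to 0.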
CNV detection method for whole-exome data
The CNV analysis was performed between the test (patient) sample and a reference sample. A log2-based ratio between test and reference sample was calculated based on the ngCGH algorithm developed by Sean Davis at the NCI (https://github.com/seandavi/ngCGH) (see online supplementary methods). The Fast Adaptive States Segmentation Technique (FASST2) segmentation method in the Nexus software was used to make CNV calls. The FASST2 segmentation algorithm is a Hidden Markov Model-based approach to detect the possible segment levels that fall between the expected states. A significance threshold of 1 × 10⁻⁵ was used to adjust the sensitivity of the FASST2 segmentation algorithm, with a minimum of three probes per segment and a maximum probe spacing of 1000 kb. Copy number (CN) gain and CN loss were defined by log2 ratio values of 0.4 and −0.4, respectively. High copy gain and high copy loss were defined by log2 ratio values of 0.8 and −1.0, respectively.
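The log2 ratio thresholds above can be expressed as a simple classification rule. The following Python sketch is illustrative only and is not the Nexus implementation; the function name, the call labels and the treatment of the thresholds as inclusive bounds are assumptions.

```python
# Thresholds taken from the text; inclusive comparison is an assumption.
HIGH_GAIN = 0.8
GAIN = 0.4
LOSS = -0.4
HIGH_LOSS = -1.0


def classify_segment(log2_ratio: float) -> str:
    """Map a segment's log2(test/reference) ratio to a copy-number call."""
    if log2_ratio >= HIGH_GAIN:
        return "high copy gain"
    if log2_ratio >= GAIN:
        return "copy gain"
    if log2_ratio <= HIGH_LOSS:
        return "high copy loss"
    if log2_ratio <= LOSS:
        return "copy loss"
    return "no change"


# Hypothetical segment ratios, for illustration only.
calls = [classify_segment(r) for r in (0.9, 0.5, 0.1, -0.6, -1.3)]
```

Under these assumptions, a segment with a ratio of −0.6 would be called a copy loss, whereas a ratio of 0.1 falls between the thresholds and is left uncalled.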
Mononuclear cells were isolated from peripheral whole blood using density gradient centrifugation with Ficoll. Cells were then activated in culture medium for 96 hours with concanavalin A at a final concentration of 5 ng/µL. The same method was used for the patients and the controls. Preparation of blood samples and northern blot analyses using hybridisation probes to different regions of the 45S polycistronic pre-rRNA transcript were performed as previously described.11 The laboratory personnel were blinded to the clinical and genetic status of analysed samples.
A total of 87 families with DBA were enrolled in the NCI’s longitudinal IBMFS cohort study between 2001 and 2015. These included 106 affected patients and 300 unaffected relatives. The patients in the cohort who were alive at the time of this report were younger than the unaffected relatives with median ages of 9 years (range 0–48 years) and 34 years (range 0–81 years), respectively. The male:female ratio among the affected participants was 1.3:1. Biospecimen and/or genetic testing results were not available on 26 of the 87 families. Twenty-six of the 61 families with clinical genetic testing information (including deletion/duplication testing) had a pathogenic variant in a known DBA-associated gene. The remaining 35 families underwent comprehensive genomic testing studies with WES, deletion and CNV analyses (figure 1) to identify causative genes.
Novel RP pathogenic germline variants
We discovered pathogenic variants in two ribosomal genes previously not associated with DBA, RPL18 and RPL35, each in a different family. There were no variants or deletions in previously DBA-associated ribosomal genes or other ribosomal genes that segregated with disease in these families. The family structures and clinical characteristics are described in figure 2 and table 1.
Family NCI-138 consists of a female proband (II-2) and her daughter (III-1) (figure 2A). The proband presented with anaemia at 2 months of age and had spontaneous remission at age 18 years, with no known relapse. A bone marrow evaluation done at 3 months of age (haemoglobin 7 g/dL) showed characteristic erythroid hypoplasia consistent with DBA. Her daughter had steroid responsive anaemia at age 1 month. At the age of 7 weeks, she underwent a bone marrow aspiration and biopsy for anaemia (haemoglobin 4.1 g/dL, haematocrit 12.1%, reticulocyte count 0.3%, mean corpuscular volume 91.3 fL) that showed marked erythroid hypoplasia consistent with congenital hypoplastic anaemia (DBA). She developed ulcerative colitis (UC) at age 15 years and received treatment with an antimetabolite, which resulted in a drop in her blood counts. She remained red blood cell (RBC) transfusion dependent while on treatment for UC. WES of the proband and daughter revealed a heterozygous variant in RPL35, p.K77N, predicted to be damaging (figure 2C). We evaluated pre-rRNA processing in mononuclear cells from NCI-138 II-2 and III-1 and a healthy individual with wild-type RPL35 to determine if the RPL35 mutation affected ribosome synthesis. The patients with DBA with the p.K77N mutation clearly showed an increase in the 32S pre-rRNA relative to wild-type RPL35, consistent with the known effects of several large subunit RPs on human pre-rRNA processing (figure 3A).24 Figure 3A also shows an increase in 41S and 45S pre-rRNAs and a decrease in 30S pre-rRNA in the p.K77N sample relative to control, consistent with known effects of the yeast ortholog of Rpl35 on early steps in pre-rRNA processing.25
NCI-172 consists of an affected father (II-1, proband) and his son (III-1) (figure 2B). The proband presented with steroid responsive anaemia at age 8 months. The son presented with mild anaemia at birth that resolved, recurred at 1 year of age and responded to oral steroids. Both the proband and his son had abnormally high levels of eADA, which is specific for the diagnosis of DBA.19 They also have neutropaenia that was diagnosed prior to steroid therapy but are able to mount appropriate neutrophilic responses to infection. The proband underwent a bone marrow aspirate and biopsy at 1 year of age that showed neutrophilic hypoplasia with lymphocytic infiltration, marked granulocytopaenia and maturation arrest in the erythroid series (haemoglobin 6 g/dL, reticulocyte count 1%). The proband’s son (III-1) underwent a bone marrow aspirate and biopsy at 1 year of age that showed severe erythroid hypoplasia and granulocytic hypoplasia with peripheral blood showing neutropaenia and macrocytic anaemia (haemoglobin 6 g/dL, absolute neutrophil count 0.3 × 10⁹/L, reticulocyte count 0.2%). There was no history of diarrhoea or greasy stools in the proband or his affected son. WES did not show variants in SBDS, the gene known to cause 95% of Shwachman-Diamond syndrome, nor were pathogenic variants in any other genes identified. We identified a heterozygous p.L51S predicted deleterious variant in RPL18 in the proband and his affected son (figure 2C).
The pre-rRNA processing evaluation in mononuclear cells from the proband compared with wild-type RPL18 showed an increase in the 36S pre-rRNA, consistent with the known effects of a subset of large subunit RP defects on pre-rRNA processing (figure 3B).26 36S pre-rRNA is not a normal intermediate in pre-rRNA processing pathways in human cells, so the amounts of 36S pre-rRNA are negligible for controls and for most RP gene mutations,26 27 and it is only detected at a low level with specific RP gene mutations, as identified in our patient with an RPL18 mutation.
Mosaic deletion patterns in DBA families
SNP array and/or aCGH analyses were used to identify deletions in the genes of interest, listed in online Supplementary table S3. We identified germline mosaic deletions present in both buccal and blood cells in two patients from two different families by SNP array. NCI-51–1 was found to have a large 1.8 Mb heterozygous mosaic gene deletion in chromosome 15 that included RPS17 (figure 4A, online Supplementary table 6A). The other genes included in the deletion are listed in online Supplementary table 6A. The size of this large contiguous deletion and inclusion of multiple genes in this region may explain the proband’s phenotype. She presented with macrocytic anaemia at 6 months of age along with multiple dysmorphic features including short stature, shield chest, webbed neck and scoliosis. Her anaemia was responsive to steroids with only intermittent need for RBC transfusions.
NCI-71–1 had a large 2.5 Mb heterozygous contiguous deletion with mosaic pattern on chromosome 3, inclusive of RPL35A along with a number of other genes (figure 4B, online Supplementary table 6B). This may also explain the severity of this patient’s disease. NCI-71–1 presented with anaemia at the age of 1 year and also had developmental delay, short stature and failure to thrive. Her anaemia was responsive to oral steroid treatment. There were no reports of DBA, anaemia, bone marrow failure or cancer in first-degree relatives of either proband NCI-51–1 or NCI-71–1.
Genomic characterisation and frequency of RP mutations in the NCI DBA cohort
We first performed WES on 35 families whose disease-causing gene was unknown at the time of study enrolment. These included 41 affected individuals and 70 unaffected relatives. Sixteen (46%) of these families had mutations in nine ribosomal genes already described as causative of DBA (online Supplementary table S4). This includes our previous report of pathogenic variants in RPS29 as a cause of DBA in two affected families.14
Eight variants (six non-synonymous, one nonsense, one splice site) and five deletions were discovered in known DBA-associated genes (online Supplementary table 4A,B), three of which were found by aCGH (NCI-76, NCI-312, NCI-418). For all the families in which ribosomal gene deletions were identified, the entire coding sequence was found to be deleted. In families where biospecimens were available, additional family members were tested and these variants were confirmed to segregate with disease. In addition, the variants were shown to produce characteristic pre-rRNA processing defects in mononuclear cells derived from patients (online supplementary figure S1A-D), which is consistent with pathogenicity. One DBA proband had a previously undescribed splice site mutation (c.310-2A>G) in RPL15 that we confirmed was pathogenic because it caused the characteristic pre-rRNA processing defect (online supplementary figure 1D). Another family had a non-synonymous variant (p.T404S) of unknown significance in RPL4; however, in silico programmes predict this variant to be benign. In addition, functional analysis by northern blot did not show that this variant disrupted pre-rRNA processing. With the available data at the time of this report, we do not have sufficient evidence to classify this variant as associated with DBA. Germline mosaic deletion patterns were observed in two families, one each involving RPS17 and RPL35A (online Supplementary table 6A,B and described above). The clinical characteristics of the patients with mutations in established genes are described in online Supplementary table S5. Furthermore, WES, deletion analyses and functional studies of the remaining families led to the discovery of two ribosomal genes (RPL35 and RPL18) with pathogenic variants causative of DBA in four affected patients from two families (figure 2C and described above). Overall, we identified the genetic cause of DBA using WES in 18 of the 35 (51%) mutation-unknown families.
Genetic testing data for DBA-associated pathogenic variants were available for patients from 61 families of the NCI’s DBA cohort, including clinical sequencing done prior to or after enrolment and genomic data on the 35 families who underwent WES and deletion analyses. The frequencies of ribosomal gene mutations in the entire NCI DBA cohort and their specific characteristics are shown in figure 5. Family NCI-29 had been previously published with the deletion in RPS24 that we identified independently.28 We performed functional validation of this deletion by confirming a small subunit pre-rRNA processing defect by northern blot (online supplementary figure 1A).
Overall, pathogenic variants in ribosomal DBA genes were found in 44 of the 61 families (72%), and a total of 71 affected individuals. As expected, RPS19 was the most frequently mutated gene and accounted for 36% of families whose disease-causing gene was known, followed by RPL35A and RPS26, accounting for 14% and 11%, respectively. No patients were identified with a GATA1 mutation in our cohort. Missense mutations were the most common genetic change, observed in 43% of families with known disease mutations. Interestingly, 30% of the variation in disease-causing genes in our cohort was due to a single copy or mosaic gene deletion. Of the 44 families whose gene was identified, we had complete parental testing and inheritance information on 23 (52%) families. Ten of the 23 (43%) had an inherited mutation and 13 (57%) had a de novo change in the causative gene (both parents were negative for the affected child’s disease-associated mutation) (figure 5). At this time, 17 of 61 families tested (28%) do not have a characterised disease-associated mutation.
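The family counts and percentages reported in the Results can be cross-checked with simple arithmetic. The following Python sketch uses only counts stated in the text; the variable names and the `pct` helper are hypothetical and are introduced for illustration.

```python
def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


families_enrolled = 87
families_without_data = 26
families_tested = families_enrolled - families_without_data  # 61

known_before_wes = 26                                    # clinical testing
families_wes = families_tested - known_before_wes        # 35
solved_known_genes = 16                                  # established DBA genes
solved_new_genes = 2                                     # RPL35 and RPL18
solved_by_wes = solved_known_genes + solved_new_genes    # 18

total_solved = known_before_wes + solved_by_wes          # 44
unsolved = families_tested - total_solved                # 17

with_parental_data = 23
inherited, de_novo = 10, 13

print(pct(total_solved, families_tested))   # share of tested families solved
print(pct(solved_by_wes, families_wes))     # share of WES families solved
print(pct(de_novo, with_parental_data))     # de novo share
```

The de novo and inherited proportions are computed over the 23 families with complete parental testing, not over all 44 genetically characterised families.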
We used a combination of WES, targeted aCGH, genome-wide genotyping and clinical genetic testing to comprehensively characterise the genetic causes of DBA in our cohort. This led to the identification of both known and novel ribosomal causes of DBA. Our functional work utilising pre-rRNA processing assays to monitor ribosome assembly showed that germline variants in two RP genes have deleterious effects on protein function. RPL18 encodes the cytoplasmic ribosomal protein L18 that is a part of the L18E family of proteins and a component of the 60S ribosomal subunit. RPL35 encodes ribosomal protein L35, a component of the 60S ribosomal subunit and member of the L29P family of RPs using the newly proposed universal nomenclature for RPs.29 While these discoveries are within the known ribosomal biogenesis pathway causative of DBA, they illustrate the importance of identifying the cause of this complex disease and provide opportunities to advance understanding of the underlying biology.
Novel single copy and mosaic deletions in genes known to cause DBA were identified by aCGH and genome-wide SNP genotyping. Our observation that 30% of patients have a disease-associated deletion is higher than the previously estimated value of up to 20% of DBA.18 30 This may be due to the increased sensitivity of screening for deletions using two methods, which led to the identification of more families with a disease-associated deletion. This may partly explain the observation of RPL35A being the second most commonly mutated gene in our cohort, which is different from prior published studies in the literature. In addition, the demographic characteristics may differ between studies based on design and recruitment, and may account for some of the variation seen in the percentage of DBA attributed to a particular gene. We were also able to detect mosaic deletion patterns. Mosaic deletions involving DBA gene loci have been previously described in two patients,11 but the incidence of germline mosaicism in DBA is currently unknown.30 Notably, both of our patients with germline mosaic deletions have severe clinical DBA phenotypes with dysmorphic features in addition to the characteristic RBC aplasia. It is possible that germline mosaicism occurs more frequently than previously described in DBA. With the advancement of next-generation sequencing and molecular diagnostics, it seems likely that additional patients with germline mosaicism will be identified.
Consistent with the expectation that about 40%–45% of autosomal dominant DBA-associated mutations are inherited and 55%–60% are de novo,15 30 we observed 57% of patients with de novo disease-associated mutations and 43% with inherited mutations in our cohort.
In conclusion, this efficient comprehensive genomic approach was the basis for our discovery of two new causes of DBA, characterisation of ribosomal gene deletions not previously described to be disease-associated and of DBA-associated germline mosaicism. We identified the disease-associated mutations in 51% (18 of 35) of our families without a known genetic cause of DBA. A total of 72% (44 of 61) of our families are now genetically characterised. Considering that pathogenic variants of DBA have been reported in about 55% of patients with DBA,30 our comprehensive approach provided more genomic information than other methods.
An abstract of this study was presented as a poster at the 2016 American Society of Hematology Annual Meeting, San Diego, California, USA.
Acknowledgements The authors thank all of the study participants for their valuable contributions. Lisa Leathwood, RN, Maureen Risch, RN, and Ann Carr, CGC of Westat, provided excellent study support.
Contributors Project design was carried out by SAS and LM. Analyses were performed by LM. Clinical characterisation and sample collection was performed by PK, NG and BPA. Sequencing and validation was performed by BDH, BZ, MW, KJ, MY, JFB and the NCI DCEG Cancer Genomics Research Laboratory. Cell culture assays were performed by SB. CNV analysis was performed by WZ, FXD and SCC. Pre-rRNA processing was performed by SRE. The manuscript was written by LM, PK and SAS and reviewed and approved by all coauthors.
Funding This study was funded, in part, by the intramural research programme of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health and by contracts N02-CP-91026, N02-CP-11019 and HHSN261200655001C with Westat. SCC and FXD acknowledge research support from the Intramural Research Program of National Human Genome Research Institute, National Institutes of Health.
Competing interests None declared.
Provenance and peer review Not commissioned; internally peer reviewed.
Collaborators NCI DCEG Cancer Genomics Research Laboratory including Sara Bass, Laurie Burdett, Salma Chowdhury, Michael Cullen, Casey Dagnall, Rebecca Eggebeen, Herbert Higson, Amy A Hutchinson, Sally Larson, Kerrie Lashley, Hyo Jung Lee, Wen Luo, Michael Malasky, Michelle Manning, Jason Mitchell, Adri O’Neil, David Roberson, Shalabh Suman, Aurelie Vogt, and Kathleen Wyatt. NCI DCEG Cancer Sequencing Working Group including Neil E Caporaso, Stephen J Chanock, Mark H Greene, Lynn R Goldin, Alisa M Goldstein, Allan Hildesheim, Nan Hu, Maria Teresa Landi, Jennifer Loud, Phuong L Mai, Mary L McMaster, Lindsay Morton, Dilys Parry, Melissa Rotunno, Douglas R Stewart, Phil Taylor, Geoffrey S Tobias, Margaret A Tucker, Xiaohong R Yang, and Guoqin Yu.
Correction notice This article has been updated since it published online first. Erroneous symbols appearing in Figure 2 have been deleted and Figure 2 legend has been updated to ensure the consistency of reference to NCI-172.