Background: Hereditary haemorrhagic telangiectasia (HHT) is a genetic disorder present in 1 in 8000 people and associated with arteriovenous malformations. Genetic testing can identify individuals at risk of developing the disease and is a useful diagnostic tool.
Objective: To present a strategy for mutation detection in families clinically diagnosed with HHT.
Methods: An optimised strategy for detecting mutations that predispose to HHT is presented. The strategy includes quantitative multiplex polymerase chain reaction, sequence analysis, RNA analysis, validation of missense mutations by amino acid conservation analysis for the ENG (endoglin) and ACVRL1 (ALK1) genes, and analysis of an ACVRL1 protein structural model. If no causative ENG or ACVRL1 mutation is found, proband samples are referred for sequence analysis of MADH4 (associated with a combined syndrome of juvenile polyposis and HHT).
Results: Data obtained over the past eight years were summarised and 16 novel mutations described. Mutations were identified in 155 of 194 families with a confirmed clinical diagnosis (80% sensitivity). Of 155 mutations identified, 94 were in ENG (61%), 58 in ACVRL1 (37%), and three in MADH4 (2%).
Conclusions: For most missense variants of ENG and ACVRL1 reported to date, study of amino acid conservation showed good concordance between prediction of altered protein function and disease occurrence. The 39 families (20%) yet to be resolved may carry ENG, ACVRL1, or MADH4 mutations too complex or difficult to detect, or mutations in genes yet to be identified.
- HHT, hereditary haemorrhagic telangiectasia
- JPHT, syndrome of juvenile polyposis and hereditary haemorrhagic telangiectasia
- PAVM, pulmonary arteriovenous malformation
- SIFT, “sorting intolerant from tolerant” program
- vascular disease
Hereditary haemorrhagic telangiectasia (HHT) is an autosomal dominant disorder manifested in 1/8000 individuals worldwide.1 Most affected individuals develop epistaxis before the age of 20.2 However, age of onset, incidence, and severity are highly variable3; individuals may not be diagnosed until a life threatening complication presents. Arteriovenous malformations can occur in the pulmonary, cerebral, and hepatic circulation leading to stroke, internal haemorrhage, and severe anaemia.2–5
Two genes are causally related to HHT. Mutations in the 30 kb endoglin (ENG; OMIM 187300) gene, associated with a high prevalence of pulmonary arteriovenous malformations (PAVMs),6 lead to HHT1.7 Mutations in the 15 kb activin receptor-like kinase-1 gene (ACVRL1; ALK-1, OMIM 600376) lead to HHT2,8 which is characterised by a lower frequency of pulmonary and cerebral arteriovenous malformations than HHT1, but which may have a higher incidence of liver involvement.6,9 As in other diseases with a high new mutation rate,10 most families with HHT have a unique mutation, rendering molecular diagnosis labour intensive. In all, 168 ENG and 138 ACVRL1 mutations of all types have been reported.11–13 Two additional genes have been associated with HHT recently. Mutations in the MADH4 tumour suppressor gene have been associated with a combined syndrome of juvenile polyposis and HHT (JPHT; OMIM 175050).14 An unidentified HHT3 gene linked to chromosome 5 is also likely to account for a subset of HHT patients.15
We present a strategy for mutation detection in families clinically diagnosed with HHT. We document 80% test sensitivity in mutation identification. We report 16 novel mutations and seven new polymorphisms. We evaluate the use of evolutionary conservation analysis to predict the effect of missense variants on protein function and disease association.
Further details on the methods used are found in the online supplemental files.
Samples for 291 HHT families were referred from Canada and several other countries. Informed consent was obtained from each family according to institutional guidelines for clinical testing and research studies. Research procedures were approved by the ethics committee of the Research Institute of the Hospital for Sick Children.
Total genomic DNA was extracted from peripheral blood lymphocytes with the Puregene (Gentra) kit, according to the manufacturer’s instructions.
Quantitative multiplex polymerase chain reaction (QM-PCR), with some modifications to previously reported conditions,16,17 was used to screen for changes in exon size and copy number in all 15 exons of ENG and nine coding exons of ACVRL1. The promoter and exons of ENG, and exons 2 to 10 of ACVRL1, were sequenced. Sequencing was often done in duplexes, where two exons were sequenced simultaneously. QM-PCR fragments and sequencing chromatograms were analysed using GeneObjects software (Visible Genetics Inc). Haplotype analysis (by polymorphic markers D12S1677, D12S368, D12S1712, and D12S347) was used to refine analysis for some families.
When no mutation was found, either in ENG or ACVRL1, samples with appropriate consent were referred to the Marchuk laboratory at Duke University for MADH4 sequence analysis. Each MADH4 mutation identified was confirmed in our clinical laboratory.
RNA and protein methods
When required and possible, reverse transcriptase polymerase chain reaction (RT-PCR) analysis was undertaken using a fresh blood sample, gene specific primers, and the inclusion of a puromycin step. Endoglin expression in peripheral blood activated monocytes and umbilical vein endothelial cells was analysed by 35S-methionine labelling and immunoprecipitation using monoclonal antibodies P3D1 and P4A4, as described elsewhere.16–20
Missense variations were analysed to identify disease causing mutations. First, the literature was searched for reports with sufficient evidence to conclude that the variation is causative. Analysis was completed for all other exons of ENG and ACVRL1 to ensure that no other mutation was present. Evolutionary conservation analysis of ENG and ACVRL1 variants was carried out using the SIFT (“sorting intolerant from tolerant”) tool.21–23 To predict the effect of ACVRL1 missense variations on protein structure, several were analysed by molecular modelling, as described previously.24
Summary of mutation analysis
Of 291 families referred for analysis, 24 were excluded because the clinical information was insufficient to confirm a diagnosis, and 73 were excluded because of inadequate DNA. The remaining 194 families were analysed by a combination of QM-PCR, duplex sequencing, long PCR, and RNA sequencing. We identified mutations in 155 of 194 families (table 1), resulting in 80% mutation detection sensitivity. Of all mutations identified, 61% were in ENG (HHT1), 37% in ACVRL1 (HHT2), and 2% in MADH4 (JPHT).
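The cohort arithmetic above can be checked with a short sketch. The counts are those stated in the text and abstract; the variable names and the rounding helper are illustrative only and are not part of the original analysis.

```python
# Cohort counts as stated in the text.
referred = 291
excluded_unclear_diagnosis = 24
excluded_inadequate_dna = 73
analysed = referred - excluded_unclear_diagnosis - excluded_inadequate_dna

# Families with an identified mutation, by gene (from the abstract).
mutations_by_gene = {"ENG": 94, "ACVRL1": 58, "MADH4": 3}
resolved = sum(mutations_by_gene.values())


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


sensitivity = percent(resolved, analysed)
gene_share = {gene: percent(n, resolved) for gene, n in mutations_by_gene.items()}

print(f"analysed={analysed}, resolved={resolved}, sensitivity={sensitivity}%")
print(gene_share)
```

Rounded to whole numbers, this reproduces the 80% sensitivity and the 61%, 37%, and 2% split reported for ENG, ACVRL1, and MADH4.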
All types of mutations were identified in the ENG and ACVRL1 genes. In our series, missense mutations were associated more often with HHT2 (62%) than with HHT1 (27%), while splice site variants were associated more often with HHT1 (13%) than with HHT2 (2%). Nonsense mutations occurred with equal frequency (15%). Small deletions and insertions were more common in HHT1 than in HHT2 (table 1).
Seven ENG and four ACVRL1 mutations (7% overall) were whole exon deletions or duplications identified by QM-PCR and not by sequencing. This study is the first to report whole exon deletions in the ACVRL1 gene (exons 3 to 8, exon 10, exons 9 and 10). QM-PCR also detected intraexonic deletions and insertions (36% of ENG mutations and 12% of ACVRL1 mutations), which were also detectable by sequencing. Many of the mutations included here for sensitivity analysis have been reported before16–20,24–26 and are not discussed in the text. Table 2 summarises novel mutations and polymorphisms found in our cohort.
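As a quick consistency check, the proportion of mutations detectable only by QM-PCR can be recomputed from the counts given above. This is a minimal sketch; the names are illustrative and the per-gene totals (94 ENG, 58 ACVRL1) are taken from the abstract.

```python
# Whole exon deletions or duplications found by QM-PCR but not by sequencing.
whole_exon_events = {"ENG": 7, "ACVRL1": 4}

# Families with a mutation in each gene (from the abstract).
mutations_per_gene = {"ENG": 94, "ACVRL1": 58}
all_mutations = 155  # includes the three MADH4 mutations


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


overall = percent(sum(whole_exon_events.values()), all_mutations)
per_gene = {
    gene: percent(n, mutations_per_gene[gene])
    for gene, n in whole_exon_events.items()
}

print(f"overall: {overall}%")
print(per_gene)
```

Both the overall figure and the per-gene figures round to 7%.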
Novel ENG mutations
Study of 94 HHT1 families revealed 75 unique mutations; nine previously unreported mutations are described briefly and shown in table 2 with the clinical phenotype of the proband. Supplementary figure 1 illustrates the position of these mutations on the mRNA diagrams (the supplementary figures can be viewed on the journal website: http://www.jmedgenet.com/supplemental/).
Three novel mutations affecting ENG splice sites were identified. In family 163, the +5 G to A substitution changed the Shapiro-Senapathy27 splice score from 76.8 to 62.4, predicting missplicing of exon 1. Family history for the proband included a brother and father with PAVMs. In family 520, the −7 C to G substitution in intron 6 activated a cryptic AG acceptor site, shown by RNA analysis to cause an insertion of CATTAG, leading to a premature stop. In family 524, deletion of 186 nucleotides at position 896 of exon 7 removed the 3′ splice site and part of intron 7. This mutation probably causes skipping of exon 7, leading to a frameshift. The proband, whose father had a pulmonary haemorrhage of unknown cause, presented with epistaxis as the only clinical sign.
We identified four novel small ENG insertions leading to frameshift mutations (table 2). In family 508, the proband and mother were each diagnosed with PAVMs. In families 6 and 516, later shown to have a common ancestor, there were three generations of affected individuals.
The proband of family 202 had both pulmonary and cerebral arteriovenous malformations. QM-PCR analysis showed 2.7 copies of exon 4, and 2.9 copies for each of exons 3 and 2, revealing a duplication of ENG exons 2 to 4. Figure 1A illustrates one of the QM-PCR reactions. RNA amplification spanning exons 1 through 6 confirmed the presence of a larger transcript (fig 1B). Sequencing of the transcript confirmed the duplication of exons 2, 3, and 4, leading to a larger protein with in-frame insertion of G23 to Q174. This mutation was also analysed at the protein level (fig 1C). Immunoprecipitation of endoglin from metabolically labelled activated peripheral blood lymphocytes showed that the normal surface glycoprotein (E; 90 kDa monomer) was reduced in the proband sample to an estimated 57 (10)% (mean (SD)) of normal levels. A mutant 110 kDa monomer (band M) was observed, representing a protein with an additional 151 amino acids.
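To illustrate how QM-PCR copy estimates such as these translate into a call, the sketch below rounds each estimate to the nearest whole copy number. The rounding rule and the function name are assumptions made for illustration; they are not the laboratory's published calling criteria. Only the three exon values quoted in the text are used.

```python
def call_copy_state(copies):
    """Classify an estimated copy number for an autosomal exon.

    Assumption for illustration: round to the nearest integer and
    treat 2 as normal, fewer as a deletion, more as a duplication.
    """
    nearest = round(copies)
    if nearest < 2:
        return "deletion"
    if nearest == 2:
        return "normal"
    return "duplication"


# Copy estimates reported for the proband of family 202 (exon: copies).
measured = {2: 2.9, 3: 2.9, 4: 2.7}

duplicated_exons = sorted(
    exon for exon, copies in measured.items()
    if call_copy_state(copies) == "duplication"
)
print(duplicated_exons)
```

Under this rule all three exons are called as duplicated, in line with the exon 2 to 4 duplication that was then confirmed at the RNA level.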
A novel ENG missense mutation was found in family 521: the c.923C→A substitution in exon 7 resulted in a p.A308D conversion. A308 is not highly conserved across species, and this substitution was considered “tolerated” by SIFT. N307 is predicted to be an N-glycosylation site. It is possible that the A308D change could affect the glycosylation of N307 because the negatively charged side chain of aspartic acid could have unfavourable interactions with the negatively charged oligosaccharides. However, the NetNGlyc program (Gupta R, Jung E, Brunak S, in preparation) showed little effect of the A308D substitution (data not shown). Nevertheless, two unaffected family members did not carry this substitution, while three affected members did. The probability that the observed pattern occurs by random chance alone is less than 4%. Furthermore, this variant was not observed in 200 normal alleles sequenced for exon 7, supporting A308D as a disease causing mutation.
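One simple way to arrive at a figure below 4% is sketched here. It assumes that, if the variant were unrelated to disease, each of the five tested relatives would independently have a one in two chance of showing the observed carrier status. The text does not state how the probability was calculated, so this is an illustrative reconstruction, and the function name is hypothetical.

```python
from fractions import Fraction


def chance_cosegregation(affected_carriers, unaffected_noncarriers):
    """Probability that the observed pattern arises by chance alone.

    Assumes each informative relative independently matches the
    pattern with probability 1/2 (a simplification for illustration).
    """
    informative = affected_carriers + unaffected_noncarriers
    return Fraction(1, 2) ** informative


# Family 521: three affected carriers, two unaffected non-carriers.
p = chance_cosegregation(3, 2)
print(p, float(p))
```

Under these assumptions the probability is 1/32, or about 3.1%.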
Novel ACVRL1 mutations
Within the ACVRL1 gene we have identified a total of 43 unique mutations in 58 resolved families. Six of these mutations are novel and are listed in table 2. Detailed descriptions of some of the mutations are listed below.
In family 534, a splice site mutation (C to G at the −3 position of intron 3) decreased the Shapiro-Senapathy27 splice score from 84 to 72. RT-PCR showed two mutant transcripts: c.314_315ins208, which retained all of intron 3, and c.314_336del23, in which use of a cryptic AG site in exon 4 deleted 23 nucleotides (data not shown).
QM-PCR identified the first whole exon deletions to be reported for the ACVRL1 gene. Families 140 and 510 (no known relation) had an identical deletion of exons 3 to 8. Figure 2A shows the QM-PCR of all ACVRL1 coding exons, with a single copy of exons 3 to 8 in the proband. The breakpoints were identified using long PCR with primers spanning the deletion, followed by sequencing, which confirmed an identical direct connection between introns 2 and 8 in the probands of each family (g.5365_9652del4288, fig 2B). Haplotype analysis (fig 2C) suggests that these two families share a common ancestor.
The proband of family 535 had a deletion of ACVRL1 exon 10 and a confirmed diagnosis of HHT. Five family members showed correlation between presence of the deletion and clinical manifestations. The proband of family 544 experienced frequent nosebleeds and had several affected relatives. A deletion of exons 9 and 10 was found. The deletions in both families were confirmed by QM-PCR using several distinct alternate primer sets flanking exons 9 and 10. Additionally, both samples showed a single copy of D12S1677, a short tandem repeat located in intron 9. Both deletions extend 3′ of the ACVRL1 gene and their breakpoints were not identified.
A missense mutation, c.293A→G, was present in the proband of family 542, who had telangiectases and severe liver arteriovenous malformations necessitating a liver transplant. NetNGlyc analysis predicted that the N98S substitution would completely remove a potential N-glycosylation site at N98 and create a new one at residue N96. None of over 300 normal alleles sequenced for exon 3 showed this variant.
In family 514, an A352D substitution changed a highly conserved non-polar hydrophobic residue to a negatively charged hydrophilic one. Testing this mutant on a model of the ALK1 (ACVRL1) protein, based on the three dimensional structure of ALK5,24 suggests that in this variant D352 will form new hydrogen bonds with V353 and A327, with possible steric clashes with I326. As both I326 and A327 are part of the catalytic segment, the substitution is likely to interfere with substrate binding.
Novel polymorphisms in ENG and ACVRL1 genes
We found two additional ENG polymorphic variants, present in up to 2% of the population, and five new ACVRL1 polymorphic variants (table 3). The single base pair substitutions in the coding sequence of ACVRL1 exons 6 and 10 were silent and had a frequency of about 1%. Three intronic polymorphisms, two in intron 5 and one in intron 3, were far from exon boundaries in regions not usually sequenced. The 21-nucleotide deletion in intron 5 was found in five of 10 HHT families, and in five of 10 non-HHT control families.
Characterisation of missense variants
We used an evolutionary conservation tool to analyse all missense variants (both mutations and polymorphisms previously reported and those first described in the current study) in order to validate their potential as disease causing mutations. In all, 110 variants (41 ENG, 69 ACVRL1) were investigated using the SIFT tool.22 A partial alignment of proteins similar to human endoglin is shown in supplementary fig 2 (see the journal website: http://www.jmedgenet.com/supplemental) to illustrate the evolutionary conservation and the position of some variants. Table 4 gives a summary of the results. Reliable SIFT predictions (either affecting or tolerated) were available for 101 variants (92%), while predictions for the remaining nine variants were non-informative. Of 102 mutations reported, 82 (20 ENG and 62 ACVRL1) were predicted to affect the protein function, and 12 to be tolerated. For example, in ENG, SIFT correctly predicts L194P to be a causative mutation because L194 is conserved among all known species of endoglin and betaglycan (TGFBR3) proteins. R197Q was predicted to be benign because Q197 was observed in ENG rat and mouse orthologues. Of the eight known polymorphisms, seven had an informative SIFT analysis: four were predicted to be benign and three to affect protein function (table 4).
There was no informative SIFT prediction available for three substitutions of the initiator methionine of endoglin and for T5M and L8P of the leader peptide because there were no corresponding sequences available for SIFT analysis. However, mutations in the initiation codon lead to null alleles, and are causally related to disease. Variants L107R and V125D were predicted to affect the protein function and variants G331S and C382W to be tolerated; however, their MSC scores were greater than 3.25 and considered to be non-informative SIFT predictions. RT-PCR analysis showed, however, that the mutation associated with G331S (an alteration of the final nucleotide of exon 7) caused exon 7 skipping.
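The way informative and non-informative predictions are separated can be sketched as follows. The 3.25 cutoff on the MSC score comes from the text; the 0.05 cutoff on the SIFT score is the tool's commonly documented default and is an assumption here, as are the function name and the example input values, which are hypothetical and are not the scores obtained for any variant in this study.

```python
MSC_CUTOFF = 3.25         # from the text: above this, non-informative
SIFT_SCORE_CUTOFF = 0.05  # assumed default: below this, "affects function"


def classify_sift(score, msc):
    """Turn a SIFT score and MSC value into one of three labels.

    A missing value (None) stands for a position with no aligned
    sequences, as for the initiator methionine and leader peptide.
    """
    if score is None or msc is None or msc > MSC_CUTOFF:
        return "non-informative"
    if score < SIFT_SCORE_CUTOFF:
        return "affects function"
    return "tolerated"


# Hypothetical inputs chosen only to exercise each branch.
print(classify_sift(0.01, 3.00))  # low score, acceptable MSC
print(classify_sift(0.40, 3.10))  # high score, acceptable MSC
print(classify_sift(0.01, 3.40))  # MSC too high to trust the call
print(classify_sift(None, None))  # no alignment available
```

Checking the MSC value first means that a call of "affects function" or "tolerated" is only reported when the underlying alignment is diverse enough to support it, which is how L107R, V125D, G331S, and C382W end up as non-informative despite having raw predictions.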
We describe a strategy that combines QM-PCR, bidirectional sequencing, RT-PCR, and missense analysis to identify mutations for 155 of 194 families with a confirmed clinical diagnosis of HHT. We previously showed that genetic testing, if applied in a systematic and optimised programme, renders care more effective and less expensive than clinical management alone.28 Test sensitivity for the cost–benefit study was estimated at 75% before our strategy was clinically implemented. An actual test sensitivity of 80% suggests that cost savings from systematic genetic testing for HHT families may be even greater than previously indicated.
QM-PCR is invaluable in detecting whole exon or multiexon deletions and duplications that cannot be identified by sequencing and represent 7% of ENG mutations and 7% of ACVRL1 mutations found in our series. To our knowledge, this is the first report of whole exon or multiexon deletions in any HHT2 families. QM-PCR analysis is also a more efficient means of identifying intraexonic deletions and insertions than sequencing, because several exons are screened simultaneously. Of the mutations identified, 43% of ENG mutations and 19% of ACVRL1 mutations could be detected by QM-PCR.
It is often difficult to distinguish missense mutations from relatively rare polymorphic variants that have not been identified in the general population. We evaluated the use of SIFT analysis to predict variations that affect protein function based on evolutionary conservation. We confirmed that 82 of 102 mutations are predicted to affect protein function. In the case of ACVRL1, there is a very good correlation between putative disease causing mutations and prediction based on evolutionary conservation; SIFT predicted that 62 of 68 missense variants affect the protein function, while the other six are tolerated. In the case of ENG, 20 of 34 missense variants were predicted to affect protein function, while six were tolerated. There were far fewer proteins in the ENG alignment than in ACVRL1. Nine non-informative predictions were observed for ENG; five of these were for variants in the leader peptide, for which very few sequences were available for comparison. For variant G331S, RNA analysis revealed a splice site mutation, which overrides the SIFT prediction. This leaves only three non-informative SIFT predictions for ENG. SIFT predictions are subject to non-trivial false positive and false negative rates and must be used with caution.
Our strategy failed to find mutations for 39 clinically confirmed probands. Some of these patients are likely to have mutations in the putative HHT3 gene, not yet identified but located on chromosome 5.15 A few families might have undetected MADH4 mutations, as this gene was only analysed for a subset of clinically confirmed HHT families for whom no ENG or ACVRL1 mutation was found. New evidence suggests that HHT patients should all be tested for MADH4 mutations if no ENG or ACVRL1 mutation is found, because individuals who carry MADH4 mutations may not present with the classic signs of JPHT (gastrointestinal tract involvement or juvenile polyposis) but as HHT patients.29
We also cannot rule out the possibility that distant mutations, not readily detected by our strategy and perhaps in unidentified regulatory regions, may affect any of the above genes. To confirm the locus involved in disease, linkage analysis requires several consenting individuals from informative families. Such data are not often available. Sequence analysis of the ACVRL1 promoter and exon 1 may increase the detection sensitivity. To date, we have sequenced the ENG promoter for 32 samples without finding any mutations.
It is possible that our cohort includes patients who do not have HHT. Clinical diagnosis of HHT is complicated by large variation in visceral manifestations, even among individuals with the same mutation, and by age dependent manifestations of visible signs. This ambiguity encourages specialists to refer patients for genetic testing who have a suspected diagnosis of HHT, some of whom may not have a genetic predisposition. The number of individuals in the cohort who truly carry a mutation that leads to HHT cannot be determined with certainty; this complicates the calculation of test sensitivity. In our series, no mutation was found for 63 families after complete analysis. For 24 families, clinical diagnosis was deemed to be uncertain, either by the referring physician or because the patient had fewer than three Curaçao criteria.30
A study of HHT patients in the Netherlands reported sensitivity of 90% after sequence analysis alone.31 The difference in sensitivity may result from several factors other than the molecular diagnostic strategy. In the Dutch study, a relatively homogeneous cohort of patients was studied for as long as 30 years by the same team of physicians, using standardised diagnostic criteria. Our cohort of patients was very heterogeneous in terms of location and physicians involved, and we could only eliminate those who did not meet the Curaçao criteria. In addition, founder mutations appeared at a greater rate in the Dutch study than in ours.
Despite these complications, molecular testing for HHT families has several real benefits. Once a familial mutation is identified, relatives at risk can be tested conclusively by one efficient and relatively inexpensive test. Individuals with a mutation are identified for intensive clinical surveillance, while those without a mutation may safely be removed from clinical screening.
ELECTRONIC DATABASE INFORMATION
Electronic URL addresses for the databases and algorithms used in this article are as follows:
HHT Mutation database: http://22.214.171.124/cgi-bin/WebObjects/hht.woa/1/wo/yeJpFl44ZUVy9ScN5xDn90/16.3.11
This research was supported by grants from Heart and Stroke Foundation of Ontario (NA3434), March of Dimes (HHT-FY-02-226), and Canadian Institute of Health Research (POP-62030). SS is supported in part by a CIHR Strategic Training Program Grant – The Samuel Lunenfeld Research Institute Training Program: Applying Genomics to Human Health fellowship.
Published Online First 11 May 2006
Conflicts of interest: none declared