Original article
Allele-specific DNA hypomethylation characterises FSHD1 and FSHD2
  1. Patrizia Calandra1,
  2. Isabella Cascino1,
  3. Richard J L F Lemmers2,
  4. Giuliana Galluzzi1,
  5. Emanuela Teveroni1,3,
  6. Mauro Monforte4,
  7. Giorgio Tasca5,
  8. Enzo Ricci4,
  9. Fabiola Moretti1,
  10. Silvère M van der Maarel2,
  11. Giancarlo Deidda1
  1. 1Institute of Cell Biology and Neurobiology, National Research Council of Italy, Monterotondo (Rome), Italy
  2. 2Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
  3. 3Institute of Pathology, Catholic University School of Medicine, Rome, Italy
  4. 4Institute of Neurology, Catholic University School of Medicine, Rome, Italy
  5. 5Don Carlo Gnocchi Onlus Foundation, Milan, Italy.
  1. Correspondence to Dr Giancarlo Deidda, Institute of Cell Biology and Neurobiology, National Research Council of Italy (CNR), Monterotondo Scalo, 00015, (Rome), Italy; giancarlo.deidda{at}cnr.it

Abstract

Background Facioscapulohumeral muscular dystrophy (FSHD) is associated with an epigenetic defect on 4qter. Two clinically indistinguishable forms of FSHD are known, FSHD1 and FSHD2. FSHD1 is caused by contraction of the highly polymorphic D4Z4 macrosatellite repeat array on chromosome 4q35. FSHD2 is caused by pathogenic mutations of the SMCHD1 gene.

Both genetic defects lead to D4Z4 DNA hypomethylation. In the presence of a polymorphic polyadenylation signal (PAS), DNA hypomethylation leads to inappropriate expression of the D4Z4-encoded DUX4 transcription factor in skeletal muscle. Currently, hypomethylation is not diagnostic per se because of the interference of non-pathogenic arrays and the lack of information about the presence of DUX4-PAS.

Methods We investigated, by bisulfite sequencing, the DNA methylation levels of the region distal to the D4Z4 array selectively in PAS-positive alleles.

Results Comparison of FSHD1, FSHD2 and Control subjects showed a highly significant difference of methylation levels in all CpGs tested. Importantly, in a cohort of 112 samples, one of these CpGs (CpG6) discriminated the affected individuals with a sensitivity of 0.95, supporting this assay's potential for FSHD diagnosis. Moreover, our study showed a relationship between PAS-specific methylation and severity of the disease.

Conclusions These data point to the CpGs distal to the D4Z4 array as a critical region reflecting multiple factors affecting the epigenetics of FSHD. Additionally, methylation analysis of this region allows the establishment of a rapid and sensitive tool for FSHD diagnosis.

  • Clinical genetics
  • Diagnosis
  • Molecular genetics
  • Muscle disease
  • Diagnostics tests

Introduction

Facioscapulohumeral muscular dystrophy (FSHD:MIM#158900) is one of the most prevalent neuromuscular disorders with a prevalence of 1:20000 to 1:8000.1–4 Clinical symptoms range from mild to severe with most affected individuals showing signs by age 20 years. Extreme variability is present in symptom presentation even among members of the same family.1 ,5

Two clinically indistinguishable genetic forms of FSHD are known, FSHD1 and FSHD2. The autosomal dominant FSHD1 accounts for approximately 95% of cases. This form results from a contraction of the highly polymorphic D4Z4 macrosatellite repeat array on chromosome 4q35, where each D4Z4 unit is 3.3 kb in size. In the healthy population, this array contains 11–100 units corresponding to 43–340 kb EcoRI fragments. A highly similar D4Z4 array is located on chromosome 10q26, but contractions on this array normally do not result in FSHD. Chromosome 4-derived and 10-derived D4Z4 arrays can be distinguished by chromosome-specific restriction enzymes.6 ,7 Deletion events that lead to 4-type arrays of 1–10 D4Z4 units are associated with the onset of FSHD1.8 ,9 The D4Z4 unit contains the DUX4 retrogene (DUX4[MIM:606009]) that generates mRNAs stabilised by a polyadenylation signal (PAS) distal to the array.10 ,11 The current model proposes that contraction of the D4Z4 array allows for somatic transcription of the usually repressed DUX4 full-length gene (DUX4-fl), a model further supported by the observation that deletion of the entire D4Z4 array does not cause FSHD.11 ,12 The D4Z4 repeats are normally in a relatively closed chromatin configuration, but following the array contraction they adopt a more relaxed configuration. Consequently, the reduction of DNA methylation and partial loss of repressive histone modifications lead to illegitimate transcription of DUX4.13 ,14 Similar changes in chromatin structure are observed in patients with FSHD2, characterised by normal-sized D4Z4 arrays.15 ,16 In these patients, FSHD2 is most often associated with pathogenic mutations in the structural maintenance of chromosomes flexible hinge-domain containing 1 gene (SMCHD1[MIM: 614982]) on chromosome 18. SMCHD1 encodes a member of the structural maintenance of chromosomes (SMC) protein superfamily involved in chromatin repression of specific genomic regions including the D4Z4 units.4 In FSHD2, the D4Z4 arrays are hypomethylated on 4q and 10q, whereas in FSHD1, hypomethylation is restricted to the contracted alleles.4 ,15–21

Despite D4Z4 hypomethylation being a hallmark of FSHD, only chromatin relaxation on specific 4q haplotypes is associated with the disease.4 ,22 Specifically, 4q alleles have been classified into four groups, 4qA, 4qA-L, 4qB and 4qC, based on large insertion-deletion polymorphisms distal to the D4Z4 array.23 4qA, 4qA-L and 4qB alleles are common in European and Asian populations, while 4qC alleles are rare. These 4q variants have been further classified, based on a multiallelic polymorphism proximal to the D4Z4 array (simple sequence length polymorphism (SSLP)).22 Only 4qA and 4qA-L variants, which contain a functional DUX4-PAS, result in the production of the stabilised DUX4-fl transcript associated with the disease. Conversely, neither the absence of the DUX4-PAS sequence on chromosome 4 nor the lack of transcriptional derepression on PAS-positive haplotypes leads to the disease.11

FSHD diagnosis is routinely based on the identification of 4q arrays <11 units on a 4qA-type or 4qA-L-type chromosome by Southern blot analysis6–8 (FSHD1). Additionally, methylation analysis is sometimes performed in the proximal D4Z4 units, using methylation-sensitive restriction enzymes (eg, FseI), or throughout all D4Z4 units by bisulfite sequencing assays.16 ,18 ,19 Hypomethylation of all D4Z4 arrays prompts a search for SMCHD1 mutations (FSHD2). However, since these assays show the methylation status of 4q and 10q arrays, they are not diagnostic per se due to lack of information about the presence of DUX4 PAS-positive alleles and their methylation status. For this reason, the FSHD diagnostic flow chart considers hypomethylation analysis as a secondary step to distinguish FSHD1 from FSHD2, the latter being characterised by hypomethylation in all 4q and 10q D4Z4 units.4 ,24

Recently, other groups have studied CpG methylation at D4Z4 by bisulfite sequencing on chromosomes 4 and 10 arrays simultaneously18 or specifically at the most distal unit of FSHD-permissive chromosomes.25 The specificity of the latter method towards the DUX4 region avoids coamplification of the non-permissive arrays on 4qB chromosomes or on chromosome 10. However, the specificity of this assay relies on polymorphisms rather than functional sequences (PAS). Moreover, a different sequence is analysed by bisulfite sequencing (BSS) in 4qA and 4qA-L alleles, hampering their comparison.

In this work, we have addressed the possibility of simplifying FSHD diagnosis through analysis of methylation on PAS-positive alleles specifically. Particularly, we studied the DNA methylation in the region distal to the D4Z4 array (pLAM) containing the DUX4-PAS, thereby capturing two critical hallmarks of FSHD in a single assay.

Methods

Subjects

Genomic DNA samples were provided by the Human Genetics Department, Leiden University Medical Center, and the Department of Neuroscience, Catholic University of Rome. Leiden DNA samples were derived from 44 patients with FSHD1, 17 SMCHD1 mutation carriers and 51 Controls. Italian DNA samples were derived from 34 patients with clinically characterised FSHD, 15 Controls and 4 patients with limb-girdle muscular dystrophy 2A (LGMD2A).

Leiden samples are a subset of a previously published cohort.26 All individuals signed informed consent.

DNA methylation analysis

For the bisulfite reaction, 200 ng of genomic DNA was converted using the EZ DNA Methylation-Direct Kit (Zymo Research) following the manufacturer's instructions. Forty nanograms of converted product were amplified using the AccuPrime Taq DNA polymerase system (Life Technology) according to the manufacturer's instructions. PAS-specific PCR was performed in a total volume of 25 μL, using specific oligonucleotides, with the following cycling conditions: 2′ at 94°C, 33×(30″ at 94°C, 30″ at 54°C, 1′ at 68°C). In the same reaction, tp53 was coamplified as a DNA control. The resulting PCR products were then purified using the MinElute PCR purification kit (QIAGEN) and quantified for subsequent sequencing (see below). Amplification of the B allele was performed as follows: 2′ at 94°C, 31×(30″ at 94°C, 30″ at 58°C, 1′ at 68°C). Primers were designed with the MethPrimer software and are shown in online supplementary table S1.27

Sequencing and CpG methylation analysis

The purified sample was sequenced with the FasPAS primer using the BigDye® Terminator v3.1 Cycle Sequencing Kit. The samples were run using an ABI Prism 3100 (Applied Biosystems) capillary electrophoresis-based genetic analyser and analysed with the ABI PRISM DNA sequencing software. To avoid sequence degradation, at the end of the run, 0.2 μL of GeneScan-120 LIZ Size Standard was added in the same reaction well and the samples were run again with the AFLP standard protocol. Data were then analysed with the AFLP-specific analysis module in GeneMapper software 4.0. The C and T peak areas were selected and the percentage of cytosine methylation (Cmet) determined.
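The peak-area calculation described above can be sketched as follows. This is a minimal illustration that assumes Cmet is the C peak area divided by the sum of the C and T peak areas at the same CpG position; the text does not print the formula, and the function name and the peak areas below are hypothetical.

```python
def percent_methylation(c_area: float, t_area: float) -> float:
    """Percentage of methylated cytosine at one CpG position.

    Assumption (not stated explicitly in the article):
    Cmet = C / (C + T) * 100, where C and T are the peak areas of the
    two dye terminators read at that position.
    """
    total = c_area + t_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * c_area / total


# Hypothetical peak areas in arbitrary fluorescence units, for illustration only.
print(percent_methylation(1500.0, 500.0))  # 75.0
```

Because the two dye terminators may not fluoresce equally, a ratio of this kind is only as good as its calibration, which is why the assay is checked against synthetic templates of known methylation.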

Standard control for accuracy of methylation assay

Accuracy of methylation analysis was evaluated by performing PCR on synthetic templates. Predicted products of bisulfite-treated fully methylated and unmethylated templates were synthesised and cloned in pUC57 (PRIMM). For calibration curves, plasmids were linearised with NdeI and used at different ratios as templates for the PAS-specific PCR.

Statistical analyses

All statistics were performed by Analyse-it, Method validation Edition Software (http://analyse-it.com/). Group comparisons were performed by Kruskal-Wallis test with Bonferroni correction. Correlation analyses were performed by Pearson's correlation test.

Results

Design of DUX4 PAS-specific bisulfite PCR

Molecular diagnosis of FSHD requires the characterisation of multiple genetic features. The presence of a DUX4 PAS-positive allele is a prerequisite for the development of the disease (figure 1). We therefore designed a PCR that selectively amplifies DUX4 PAS-positive alleles from sodium bisulfite-treated DNA. During the bisulfite treatment, unmethylated cytosines are converted into uracils; consequently the ‘sense’ and ‘antisense’ strands become non-complementary, thus necessitating the amplification of each strand separately. To design a bisulfite-PCR specific for the 4qA-PAS, we were forced to use the antisense bisulfite converted strand, since the conversion of C to U hampers discrimination of 4qA and 10qA sequences on the sense strand (figure 2). The forward primer is specific for the DUX4-PAS sequence on 4qA (ATTAAA) and discriminates all homologous non-functional 10qA sequences: ATCAAA on the 10qA166 variant and ATTTAA on the infrequent 10qA176T variant. The reverse primer (RevAS) was designed in a region immediately distal to D4Z4 common to 4qA/4qA-L variants on 4q and 10q (figure 2).11 ,23 This design allows the amplification of all DUX4 PAS-positive alleles, producing a 235 bp long PCR product. The amplified product contains 10 CpGs and 8 SNPs, 5 of which (positions 7903, 7946, 7968, 7987, 7993 in ref. seq. FJ439133,11 see online supplementary table S2) are specific for chromosome 4 and can therefore serve as controls to further confirm the reaction specificity.
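The strand choice described above can be checked in silico on the three hexamers quoted in the text. The sketch below is only an illustration: it models the PAS hexamers in isolation, treats every cytosine as unmethylated (none of them lies in a CpG), and writes converted cytosines as T, as they are read after PCR. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq: str, methylated: frozenset = frozenset()) -> str:
    """Turn every C into T unless its 0-based index is listed as methylated."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


# PAS hexamers quoted in the text (sense strand).
pas_variants = {"4qA": "ATTAAA", "10qA166": "ATCAAA", "10qA176T": "ATTTAA"}

sense = {name: bisulfite_convert(seq) for name, seq in pas_variants.items()}
antisense = {
    name: bisulfite_convert(reverse_complement(seq))
    for name, seq in pas_variants.items()
}

print(sense)      # 4qA and 10qA166 both read ATTAAA after conversion
print(antisense)  # TTTAAT, TTTGAT and TTAAAT remain distinct
```

On the sense strand the single C that distinguishes 10qA166 from 4qA is lost on conversion, whereas on the antisense strand the same difference appears as a G, which bisulfite does not alter.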

Figure 1

Overview of the genetic and epigenetic status at D4Z4 in Control, FSHD1 and FSHD2 subjects. (A) Open triangles represent D4Z4 arrays (11–100 units), which are hypermethylated (black dots reflect the average methylation across the whole array) in Control subjects. The D4Z4 macrosatellite length, the presence of PAS sequence and methylation status in FSHD1 and FSHD2 are reported. In FSHD1 the pathogenic mechanism consists of D4Z4 array contraction while FSHD2 is caused by mutations in SMCHD1. D4Z4 arrays have been classified on the basis of the presence of a PAS sequence. (B) PAS-positive 4qA, 4qA-L and 4qC arrays are shown. The box indicates SSLP allelic variants associated with each array: 159, 161, 166, 166H and 168 alleles associated with 4qA; 161 associated with 4qA-L and 166H with 4qC. Chromatin relaxation due to the loss of CpG methylation leading to DUX4 gene expression has been demonstrated for 4qA and 4qA-L. (C) PAS-negative 4qB (open triangles), 10qA and 10qB (black triangles) arrays are shown. The box indicates SSLP alleles associated with these arrays: 163 and 168 associated with 4qB; 166 and 176T with 10qA and 161T with 10qB. FSHD, facioscapulohumeral muscular dystrophy; PAS, polyadenylation signal.

Figure 2

DUX4 PAS-specific bisulfite PCR. (A) Sequence of the antisense bisulfite converted region distal to the D4Z4 array, used as template for PAS-specific PCR. All bisulfite converted non-methylated C's in the template are underscored. The PAS sequence is highlighted. The primers used are marked with arrows. In the forward primer (FasPAS), the bold letters indicate the differences between 4qA and 10qA homologues; the empty square indicates the difference with 10qA176T allele. R indicates A/G nucleotide in the reverse primer (RevAS). Bold italic numbers (1 to 10) indicate the CpG dinucleotide analysed. The grey square indicates a common polymorphism at CpG8. (B) The two distal D4Z4 repeats are represented for the PAS-positive 4qA and 4qA-L arrays. DUX4 exons are numbered. Arrows indicate PAS-specific PCR primers that amplify the region containing 10 CpGs. (C) Bisulfite PAS-specific PCR products obtained from genomic DNA samples with different chromosomes 4 and 10 combinations (indicated at the bottom of the figure). PCR from 12 samples are shown. Subjects b812, 251 and 533 carry different combinations of the PAS-negative alleles: 4qB163, 4qB168, 10qA166 and 10qA176T, (see online supplementary table S1) giving negative results in the PAS-specific PCR. The top panel shows the PAS-specific and tp53 PCR products. Tp53 was used as DNA control; the bottom panel shows B-telomere specific PCR products. PAS, polyadenylation signal.

Methylation analysis in FSHD individuals and Controls

We analysed the D4Z4 methylation of the source DNA by direct sequencing, avoiding cloning procedures that may introduce sampling errors. The specificity of the bisulfite sequencing reaction was confirmed on cloned 4qA and 10qA sequences (data not shown).6 Next, we analysed a panel of 112 previously fully characterised genomic DNA samples (from Leiden University Medical Center LUMC) presenting different allelic combinations.26 This panel is derived from a cohort of Caucasian subjects including 51 Controls, 44 patients with FSHD1 and 17 patients with FSHD2 (SMCHD1 mutation carriers). Two SMCHD1 carriers have the 4qB/4qB genotype and therefore are not affected (Samples 507 and 533). Three SMCHD1 carriers are asymptomatic carriers with age corrected severity score (ACSS)=0 (Samples 471, 480, 482). Similarly, two patients with FSHD1 are asymptomatic carriers with ACSS=0 (Samples 192, 197) (see online supplementary table S3).

PCR on bisulfite-treated DNA samples produced positive results in 99 out of 112 samples, all characterised by the presence of at least one PAS-positive allele: 4qA, 4qA-L and 4qC. Negative PAS-specific PCR in 13 out of 112 samples, carrying different combinations of 4q and 10q DUX4 PAS-negative alleles, demonstrates the specificity of the PCR (figure 2). In the same reaction mixture, primers for tp53 were added as DNA input control. All samples were further characterised by B allele-specific PCR. All PAS-negative samples turned out to be positive for B-specific PCR (figure 2C). Because of the possible presence of the B ‘telomere’ on chromosome 10 (10qB161T), positivity in this PCR does not indicate 4qA/4qB heterozygosity.23 Bisulfite products were further sequenced to evaluate the quality of the reaction. Sequencing confirmed the specificity of the PCR by showing the absence of amplification of non-4qA loci (see online supplementary figure S1).

To obtain a quantitative evaluation of methylation of all 10 CpGs, the sequencing reactions were rerun after the addition of a size-standard while peak areas were analysed (see online supplementary figure S1 and table S4). It is noteworthy that methylation levels of all CpGs are significantly reduced in FSHD1 and FSHD2 compared with Controls.

Methylation levels of CpG4 and CpG10 are also significantly different between FSHD1 and FSHD2 (figure 3). Despite these differences, the methylation values of FSHD1 and FSHD2 largely overlap, hampering the use of CpG4 and CpG10 to distinguish the FSHD types.

Figure 3

Methylation levels of all analysed CpGs in samples from Leiden University Medical Center (LUMC). Percentage of methylation of indicated CpGs in Control (n=40, black bar), FSHD1 (n=44, light grey bar) and FSHD2 (n=15, dark grey bar) subjects are reported. Bar reports the mean value plus SD. Mean reports the average value of all 10 CpGs examined. Methylation levels have been compared by Kruskal-Wallis test with Bonferroni correction. *p<0.05; **p<0.001; ***p<0.0001. Only significant values have been reported. FSHD, facioscapulohumeral muscular dystrophy.

Analysis of methylation levels of the single CpGs and of the mean of all CpGs revealed that CpG6 is the most informative, showing the greatest difference between Control and FSHD samples (figures 3 and 4A).

Figure 4

Comparison of CpG6 and FseI methylation levels. Non-parametrical (notched boxplots) and parametrical (diamond-shapes) analyses of CpG6 (A) and FseI (B) in 40 Controls, and 44 FSHD1 and 15 FSHD2 samples. Comparison of CpG6 methylation levels between 4qA/4qB (C) and 4qA/4qA (D) samples. Non-parametrical and parametrical analyses of CpG6 in Controls, FSHD1 and FSHD2 samples. The boxplots extend from the 25th to the 75th centiles, and the whiskers are at the 2.5th and 97.5th centiles. Diamond shapes represent the mean 95% CI. FSHD, facioscapulohumeral muscular dystrophy.

CpG2 and CpG8 also discriminated well between affected and Control subjects. However, their analysis is not always possible: in some samples, CpG2 is unquantifiable because of peak selection difficulties, whereas in other cases CpG8 is absent because of a C>T polymorphism at position 7886 (GenBank FJ439133).

The ability of CpG6 to distinguish affected subjects from Control subjects is also related to its different distribution in the two groups (figure 4A and online supplementary figure S2). Specifically, CpG6 methylation levels are distributed in a narrow range in Controls whereas they are widely distributed in FSHD. This is in agreement with the variability in the size of the D4Z4 array, in the clinical severity, and in the number of amplified 4qA alleles (A/A and A/B genotype). Accordingly, CpG6 levels show a different distribution between 4qA/4qB and 4qA/4qA FSHD1 subjects due to the contribution of one or two permissive alleles, respectively (figures 4C, D). In fact, the mean CpG6 value in 4qA/4qA subjects is the average of the mean value of Controls and that of 4qA/4qB subjects (compare figures 4D, C). Conversely, mean values of FSHD2 4qA/4qA and 4qA/4qB show no significant difference. Importantly, independently of the FSHD type, the affected individuals are characterised by reduced methylation in comparison to Controls. Inspection of outliers and borderline values reveals a minimal overlap between FSHD subjects and Controls close to the threshold value (72.68%). Interestingly, one of the three FSHD samples falling in the Control range does not show clinical symptoms (ACSS=0), whereas two of the five Control samples falling in the affected range are carriers of short arrays on the ‘non-pathogenic’ 4qA166 haplotype (see online supplementary figure S2).

These data indicate that CpG6 may have a diagnostic potential in FSHD. In comparison, the analysis by methylation-sensitive FseI endonuclease, which averages the D4Z4 methylation levels on chromosomes 4 and 10, shows a wide overlap between Controls and FSHD1 (figure 4B).4 However, FseI efficiently identifies FSHD2 individuals, which are clustered in the lower part of the graph with a methylation level <25%.14 ,26 ,28 Therefore, the combination of the PAS-specific methylation test and other D4Z4 methylation assays (eg, FseI) may allow the identification of all three classes of individuals.15 ,17–19 ,21

To assess the potential use of CpG6 analysis for FSHD diagnosis, we performed a receiver operator characteristic (ROC) curve analysis comparing methylation levels of CpG6 in 40 Controls and 59 affected subjects (FSHD1+FSHD2) (figure 5A). The analysis showed the ability of this assay to detect affected individuals with a sensitivity of 0.95 and a specificity of 0.87 at the cut-off of 72.68% (area under the curve, 0.96, 95% CI 0.92 to 1.00, p<0.0001), strongly supporting the usefulness of this assay for FSHD diagnosis.
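As a worked check of these figures, sensitivity and specificity can be recomputed from the misclassified samples mentioned earlier in this section (three FSHD samples in the Control range, five Control samples in the affected range). The sketch assumes those counts refer to the same 59 affected subjects and 40 Controls used for the ROC analysis; the function name is hypothetical.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Assumed counts: 59 affected subjects, 3 of them above the cut-off;
# 40 Controls, 5 of them below the cut-off.
sens, spec = sensitivity_specificity(tp=59 - 3, fn=3, tn=40 - 5, fp=5)
print(f"{sens:.3f} {spec:.3f}")  # 0.949 0.875
```

Under this assumption the values agree with the reported sensitivity of 0.95 and, allowing for rounding or truncation, with the reported specificity of 0.87.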

Figure 5

(A) ROC curve analysis of CpG6 methylation levels. ROC of FSHD1+FSHD2 (positive cases, n=59) versus Controls (negative cases, n=40). The area under the curve measures the ability of the diagnostic test to correctly identify the status of analysed subjects, area=0.96, 95% CI 0.92 to 1.00, p<0.0001. The resulting best cut-off value is 72.68% methylation. (B) Correlation between CpG6 methylation and repeat array size. Scatter plot shows correlation between the methylation level on 4qA chromosome in 4qA/4qB individuals (22 FSHD1+26 CTRL and 10 FSHD2) and the log2 of the array size (units). Logarithmic transformation of the repeat units shows correlation with the FSHD1+CTRL (Pearson's p<0.0001, r=0.82) and FSHD2 groups (Pearson's p=0.0002, r=0.91). (C) Correlation between CpG6 methylation and ACSS. Regression analysis of CpG6 methylation levels with ACSS in 4qA/4qB subjects, FSHD1+FSHD2 (n=31, r=−0.46, p=0.0089). The values have been calculated with a CI of 95%. ACSS, age corrected severity score; CTRL, Controls; FSHD, facioscapulohumeral muscular dystrophy; ROC, receiver operator characteristic.

To further confirm these data and avoid potential bias in sample selection, the same methylation analysis was performed on 34 Italian patients clinically diagnosed with FSHD and on 15 unrelated Controls from the Catholic University of Rome's Department of Neuroscience (see online supplementary table S5). In this group, 13 Controls turned out to be PAS-positive whereas, as expected, all patients with FSHD were PAS-positive, confirming the role of this sequence in the disease onset.

All CpG methylation levels differ significantly between Controls and FSHD subjects in this group also (see online supplementary figure S3), with CpG6 being again the most informative (p<0.0001). Comparison of methylation levels between the LUMC and Italian samples revealed no significant differences between affected subjects or Controls of the two groups (see online supplementary figure S4). Accordingly, combination of both cohorts did not substantially modify the ROC curve analysis (area 0.96, 95% CI 0.93 to 0.99, p<0.0001, sensitivity 0.91, specificity 0.90 at cut-off threshold 72.68%), confirming the effectiveness of CpG6 analysis for FSHD diagnosis. To further analyse this assay's specificity, we tested the methylation levels of these CpGs in four LGMD2A subjects: all fall in the Control range (78%, 74.3%, 79.6%, 76.7%).

Assessment of accuracy and intra-assay precision of methylation assay

Since methylation values result from the evaluation of fluorescence of two different dye terminators, the accuracy of CpG6 methylation analysis was ascertained by performing PCR on synthetic templates which replicate the two opposite conditions: a fully methylated allele treated with the bisulfite reagent and an allele with all unmethylated CpGs. Analysis of mixtures of the two templates at different ratios showed a high correlation between observed and expected values for CpG6 (see online supplementary figure S5) and all other CpGs (data not shown), indicating that the observed methylation levels are in agreement with the actual values.

We further evaluated intra-assay precision by performing the entire methylation assay (ie, bisulfite conversion and quantitative sequence analysis) on DNA samples of 10 subjects in triplicate. This analysis showed a standard deviation <10% of the mean (see online supplementary figure S6), indicating the good performance of this assay.
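A standard deviation expressed as a percentage of the mean is the coefficient of variation. The sketch below shows that calculation for one triplicate; it assumes the sample (n−1) standard deviation, which the text does not specify, and the three values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values) -> float:
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical CpG6 methylation values (%) from one subject run in triplicate.
triplicate = [70.0, 72.0, 74.0]
cv = coefficient_of_variation(triplicate)
print(round(cv, 2))  # 2.78
```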

Correlation between CpG6 methylation, D4Z4 array size and age-corrected clinical severity score

D4Z4 contraction is associated with demethylation of the 4q locus in FSHD1. Accordingly, D4Z4 methylation correlates with the array size.26 To ascertain whether methylation at the 3′ end of the array is similarly affected by array contraction, the correlation between the D4Z4 repeat array length (in units) and CpG6 methylation levels was evaluated. We took advantage of the possibility to measure the methylation levels selectively on single alleles (4qA-PAS) in 22 FSHD1, 10 FSHD2 and 26 Control subjects with the 4qA/4qB genotype. A highly significant correlation was observed between CpG6 methylation levels and the log2-transformed repeat unit number (Pearson's p<0.0001, r=0.71, r2=0.50, n=58).26 Separate analysis of samples characterised by size-dependent methylation (FSHD1 and Controls) and SMCHD1-dependent methylation (FSHD2) showed even higher correlations than the overall analysis, as indicated by the r values (FSHD1+Controls, Pearson's p<0.0001, r=0.84, r2=0.70, n=48; FSHD2, Pearson's p=0.0002, r=0.91, r2=0.83, n=10) (figure 5B). These results strongly confirm the effect of the number of units on CpG6 methylation levels. Interestingly, these data also point to the additional influence of SMCHD1 mutations. In fact, FSHD2 subjects show reduced methylation levels in comparison to non-SMCHD1 individuals carrying an identical array (average reduction 28%; see FSHD2 vs Controls in figure 4C).
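The correlation with log2-transformed array size can be sketched as below. The data are hypothetical and deliberately idealised (methylation rising by a fixed amount per doubling of the array) to show why the logarithmic transformation matters; they are not values from the cohort, and the Pearson coefficient is computed by hand rather than with the software used in the study.

```python
from math import log2, sqrt


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical 4qA array sizes (D4Z4 units) and CpG6 methylation (%).
units = [2, 4, 8, 16, 32]
cpg6 = [40.0, 50.0, 60.0, 70.0, 80.0]

r_log = pearson_r([log2(u) for u in units], cpg6)
r_raw = pearson_r(units, cpg6)
print(f"log2 units: r={r_log:.2f}, raw units: r={r_raw:.2f}")
```

With data of this shape the log2-transformed sizes give the stronger linear correlation, and the coefficient of determination follows as the square of r.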

Since the fragment size is roughly related to the severity of the disease, CpG6 methylation was further correlated to the ACSS.28 Of note, a linear correlation was observed in FSHD1+FSHD2 with 4qA/4qB genotype (n=31, r=−0.46, p=0.0089) (figure 5C). Conversely, in 4qA/4qA subjects, analysis of CpG6 versus D4Z4 units or ACSS showed no significant correlations, as expected because of the contribution of the second non-pathogenic allele (PAS-positive but hypermethylated).

4qA166 haplotypes methylation analysis

PAS-positive haplotypes have been described as permissive or non-permissive and can be distinguished on the basis of the polymorphic SSLP proximal to the D4Z4 array. To date, the PAS-positive 166 variant on the 4qA haplotype has been described as non-permissive,22 because of its association with short arrays in healthy subjects. Accordingly, in this study four subjects carrying the 4qA166 short arrays (256, 302, 303, 393) have been included in the Control group. Interestingly, CpG6 methylation values of these subjects were very close to or within the normal range (68.8%, 70.2%, 73.4% and 72.8%, respectively), indicating that this haplotype is hypermethylated despite the allele contraction. These data further confirm the ability of this assay to correctly identify the clinical status.

Discussion

DNA hypomethylation at the D4Z4 array on 4q is a hallmark of FSHD. Here, we studied the methylation levels of CpGs in the immediate proximity of the DUX4 polyadenylation sequence. All the analysed CpGs are differentially methylated between Controls and FSHD subjects. At present, DNA methylation status of FSHD individuals is analysed by studying different regions within the D4Z4 units.17–19 ,25 ,26 These analyses measure global methylation levels at D4Z4 arrays on chromosomes 4 and 10 allowing the differentiation between FSHD1 and FSHD2. However, they are generally unable to distinguish the pathogenic array (4q) methylation levels from the non-pathogenic ones (10q). Moreover, additional factors affect the results: the presence of non-pathogenic arrays on chromosome 4 (eg, 4qB), the wide variability of number of units composing each array and ultimately the possible presence of mutations in modifier genes (eg, SMCHD1). Therefore, these methods require the full knowledge of the length of 4q and 10q arrays, and the mutation type affecting SMCHD1 in order to infer methylation levels of each allele.26 Taking into account all these factors, ‘methylation corrected values’ correlate with the array length and clinical variability, suggesting the potential usefulness of methylation data as markers of the disease.26 Recently, Jones et al17 ,25 introduced a bisulfite-based methylation analysis to study the most distal D4Z4 unit on 4qA and 4qA-L alleles. Their study, while being specific for FSHD alleles, selected the permissive alleles on the basis of polymorphisms associated with specific SSLP alleles proximal to the D4Z4 array instead of functional features (PAS). Moreover, their approach requires nested PCR, multiple clone sequencing, and different primer pairs amplifying different CpGs in 4qA and 4qA-L variants. Therefore, using this approach, it can be difficult to compare methylation in subjects carrying different 4qA variants.

In our study, we designed an assay that allows direct measurement of the methylation on functional alleles, avoiding the interference of non-functional ones. The setup of a PCR reaction that amplifies only the PAS-positive alleles and the CpGs close to the distal copy of the functional DUX4 (DUX4-fl) restricts the methylation analysis to the region responsible for the disease. In particular, in the patients carrying 4qA/4qB genotype (approximately 50% of FSHD individuals) this analysis allows the direct measurement of methylation at the 4qA FSHD allele.

Using this approach, we detected highly significant differences between affected and healthy subjects, confirming the D4Z4 hypomethylation in the region distal to the array and proving the usefulness of this region in the study of the disease.

Analysis of a fully characterised cohort consisting of 40 Controls and 59 FSHD subjects showed that CpG6 is able to correctly identify 95% of FSHD subjects. These data suggest that measuring methylation of CpG6 could be a diagnostic tool for FSHD, improving the present diagnostic procedure. Unexpectedly, the mean methylation value of all CpGs is less informative than CpG6 alone, suggesting a possible functional role for this CpG.

Interestingly, our approach revealed methylation levels close to or within the normal range for three subjects carrying the short D4Z4 array on the non-permissive 4qA166 alleles. These data, while confirming the sensitivity of this approach, suggest an intrinsic resistance of this haplotype to demethylation, possibly explaining its lack of pathogenicity.

We were able to assess the methylation levels at the single PAS-permissive allele in individuals characterised by the 4qA/4qB genotype (approximately 50% of the general population). Here, CpG6 methylation showed a strong correlation with the logarithmic transformation of the D4Z4 array size for Controls, FSHD1 and FSHD2 individuals (figure 5). These results indicate that methylation in the regions within and distal to D4Z4 is influenced by the array size. Moreover, FSHD2 analysis clearly shows the additive effect of the repeat array size and of the SMCHD1 mutations on methylation levels. Importantly, we observed a significant correlation with ACSS, despite the limited number of subjects analysed and the intrinsic relative inaccuracy of severity scoring by clinical severity score (CSS). These correlations may make it possible to develop a predictive test for FSHD severity. A larger-scale analysis will allow us to evaluate the reliability of this test.

In 4qA/4qA individuals, CpG6 methylation levels do not show a linear correlation with ACSS, probably because of the interference of the second, long 4qA allele. In fact, methylation levels of these individuals correspond to the mean value between Controls and FSHD1 4qA/4qB subjects (figure 4). Notably, the absence of a linear correlation of CpG6 with D4Z4 array size and ACSS in these samples does not invalidate the diagnostic value of CpG6, since their methylation values still fall in the range of affected subjects.

In summary, we have set up a new approach to analyse methylation of the 3′-untranslated region (UTR) of the DUX4 gene based on direct bisulfite sequencing. Taking a CpG6 methylation level below approximately 73% as the threshold for FSHD, we designed a hypothetical diagram that places methylation analysis at the root of the FSHD diagnostic algorithm, allowing the identification of FSHD1 and FSHD2 subjects (figure 6). These subjects will be further characterised by D4Z4 array sizing, currently the most reliable test to distinguish FSHD1 and FSHD2. Other D4Z4 methylation assays (eg, FseI, DR1), which efficiently detect subjects carrying SMCHD1 mutations, may further simplify the identification of FSHD2 subjects (figure 6). Moreover, the absence of the PAS-specific product in subjects showing an FSHD-like phenotype would indicate the need to search for other FSHD-like disorders (eg, LGMD2A).

Figure 6

Flow chart of epigenetic analysis by PAS-specific PCR. Negative PAS-specific PCR leads to the identification of not-FSHD 4qB/4qB subjects. Positive PAS-specific PCR and subsequent analysis of methylation levels of potentially pathogenic alleles (4qA, 4qA-L) lead to the identification of FSHD (CpG6 methylation levels <73%) or not-FSHD (CpG6 methylation levels >73%, based on the analysis of 146 PAS-positive subjects). A further analysis is required to discriminate FSHD1 from FSHD2 (eg, by the assessment of D4Z4 unit number on 4q and/or D4Z4 methylation analysis). FSHD, facioscapulohumeral muscular dystrophy; PAS, polyadenylation signal.
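The branches of this flow chart can be written as a small decision function. This is a sketch only: the function name and return strings are hypothetical, and the handling of a value of exactly 73% is an assumption, since the caption defines only the cases below and above the cut-off.

```python
from typing import Optional

# Approximate CpG6 cut-off (percent methylation) from the figure 6 caption.
CPG6_THRESHOLD = 73.0


def triage(pas_pcr_positive: bool, cpg6_methylation: Optional[float] = None) -> str:
    """Return the branch of the figure 6 flow chart reached by one sample."""
    if not pas_pcr_positive:
        # No PAS-specific product: the subject carries no permissive allele.
        return "not-FSHD (4qB/4qB)"
    if cpg6_methylation is None:
        raise ValueError("CpG6 methylation is required for PAS-positive samples")
    if cpg6_methylation < CPG6_THRESHOLD:
        # FSHD1 and FSHD2 are separated later by D4Z4 sizing and/or
        # D4Z4 methylation analysis.
        return "FSHD (further analysis needed to separate FSHD1 from FSHD2)"
    # Assumption: a value of exactly 73% is grouped with not-FSHD.
    return "not-FSHD"


print(triage(False))
print(triage(True, 50.0))
print(triage(True, 85.0))
```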

In conclusion, methylation levels of PAS-linked CpGs summarise the effect of the different genetic factors responsible for epigenetic changes at D4Z4 on 4q35, pointing to the 3′-end of the array as a key region for characterising the derepressive events associated with the onset and clinical variability of FSHD.

Acknowledgments

The authors thank Mr Pierluigi Palozzo for technical assistance.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • SMvdM and GD contributed equally.

  • Contributors All authors of this manuscript have directly participated in the execution of the study. PC and IC performed experiments and contributed to writing the manuscript. RJLFL and SMvdM contributed the LUMC sample and contributed to writing the manuscript. ET and GG participated in the interpretation of analyses of methylation data and contributed to writing the manuscript. ER, MM and GT clinically evaluated subjects and collected clinical samples. FM performed statistical analysis and did critical revision for important intellectual content. SMvdM and GD conceived and designed the study and wrote the manuscript. All authors read and approved the final manuscript.

  • Funding This work was financially supported by FSH Society Grant FSHS-82014-01, Spieren voor Spieren and FSHD Italia-Onlus Association.

  • Competing interests None declared.

  • Ethics approval Research Ethics and Bioethics Advisory Board of National Research Council of Italy (CNR).

  • Provenance and peer review Not commissioned; externally peer reviewed.
