C9orf72 repeat expansions are a major cause of familial frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS) worldwide. Sizes of <20 hexanucleotide repeats are observed in controls, while up to thousands associate with disease. The significance of intermediate C9orf72 repeat lengths, however, remains uncertain. We systematically reviewed the role of intermediate C9orf72 alleles in C9orf72-related neurological disorders. We identified 49 studies with adequate available data on normal or intermediate C9orf72 repeat length, involving subjects with FTD, ALS, Parkinson’s disease (PD), atypical parkinsonism, Alzheimer’s disease (AD) and other aetiologies. We found that, overall, normal or intermediate C9orf72 repeat lengths are not associated with higher disease risk across these disorders, but intermediate allele sizes appear to associate more frequently with neuropsychiatric phenotypes. Intermediate sizes were detected in subjects with personal or family history of FTD and/or psychiatric illness, parkinsonism complicated by psychosis and rarely in psychiatric cohorts. Length of the hexanucleotide repeat may be influenced by ethnicity (with Asian controls displaying shorter normal repeat lengths compared with Caucasians) and underlying haplotype, with more patients and controls carrying the ‘risk’ haplotype rs3849942 displaying intermediate alleles. There is some evidence that intermediate alleles display increased methylation levels and affect normal transcriptional activity of the C9orf72 promoter, but the ‘critical’ repeat size required for initiation of neurodegeneration remains unknown and requires further study. In common neurological diseases, intermediate C9orf72 repeats do not influence disease risk but may associate with higher frequency of neuropsychiatric symptoms. This has important clinical relevance as intermediate carriers pose a challenge for genetic counselling.
- intermediate alleles
- clinical genetics
- repeat expansions
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
The hexanucleotide repeat GGGGCC expansion within the first intron of the C9orf72 gene on chromosome 9p21 has been associated with amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD),1 2 accounting for ~40% of familial ALS, 30% of familial FTD and ~8% of sporadic ALS cases in predominantly Caucasian populations.3 Most healthy individuals carry up to 25 repeats, with more than half carrying only two repeats, while pathogenic expansions of hundreds2 to several thousands have been reported in FTD and ALS.1 4 Studies of other repeat expansion disorders have explored the potential effect of intermediate alleles on conferring disease: in Huntington’s disease (HD), carriers of intermediate CAG lengths overlapped with patients with HD on behavioural measures5 and pathological evidence of HD was found in patients with intermediate (range 27–35) CAG repeats.5 Spinocerebellar ataxia 2 patients with apparently sporadic disease have shown intermediate CAG repeats of 22–35 units in the ATXN2 locus.6
In C9orf72-related disease, a repeat length of >30 has typically been defined as pathogenic,2 mainly due to technical limitations of the repeat-primed PCR (rp-PCR) technique. While cell and Drosophila models indicate that expression of >30 repeats is sufficient to cause neurodegeneration,7 it remains unknown how many repeats are truly needed to cause disease. Small repeat sizes of 20–22 were reportedly pathogenic in FTD,8 and the clinical phenotype of ALS cases with 20–30 repeats appears similar to that of cases with >30 repeats.9 Given the conflicting evidence concerning pathogenicity of intermediate repeat numbers, we sought to summarise the current literature and provide insights into the potential role intermediate C9orf72 repeats may play (if any) in modulating clinical disease.
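The working cut-offs described above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that <20 repeats is treated as normal, 20–30 as intermediate and >30 as expanded; the studies reviewed here use differing definitions (for example 7–24 or 20–29 units for the intermediate range), so the thresholds and function names below are hypothetical and are not a diagnostic standard.

```python
# Illustrative cut-offs based on the ranges quoted in the text above.
# These are assumptions for this sketch; individual studies differ.
NORMAL_MAX = 19         # controls typically carry <20 repeats
INTERMEDIATE_MAX = 30   # >30 repeats has typically been defined as pathogenic


def classify_repeat_length(repeats: int) -> str:
    """Return 'normal', 'intermediate' or 'expanded' for one allele."""
    if repeats < 1:
        raise ValueError("repeat count must be a positive integer")
    if repeats <= NORMAL_MAX:
        return "normal"
    if repeats <= INTERMEDIATE_MAX:
        return "intermediate"
    return "expanded"


def classify_genotype(allele_a: int, allele_b: int) -> str:
    """Label a two-allele genotype by its longer allele."""
    return classify_repeat_length(max(allele_a, allele_b))
```

Changing the two constants is enough to reproduce the alternative intermediate ranges used by individual studies.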
Review of the current literature
We conducted a PubMed and Google Scholar search for all relevant full-text studies and articles using search terms including ‘C9orf72’, ‘intermediate alleles’ and ‘repeat size’ in neurological diseases. Only English articles with available data (including online supplementary material) on C9orf72 allele lengths in normal and intermediate repeat ranges were included. Methods used by each study for repeat-length analysis (Southern blotting, rp-PCR and fluorescent fragment length analysis) are listed in the online supplementary table files.
Intermediate C9orf72 repeats do not appear to influence disease risk or clinical characteristics in non-expansion cases and controls
We identified 49 studies on intermediate alleles in neurodegenerative diseases, including 13 studies with FTD-spectrum cases, and 16 studies with ALS cases. Eight studies found no significant difference in mean repeat lengths between expansion-negative FTD cases and controls.10–17 Sporadic FTD cases appear to show similar allele frequency distribution as controls, with the two-unit repeat being the most common, followed by the 5-repeat, 6-repeat, 7-repeat or 8-repeat unit.12–19 Repeat range was higher in a Dutch study of 363 FTD patients where 10% had an FTD-ALS syndrome, with FTD cases showing mean (SD) size of 13.9±14.0 repeats (range 1–64) compared with 9.1±6.8 (range 2–35) in controls. Notably, a proband in an FTD-ALS family carried 26 repeats, with affected family members carrying a repeat range of 8–29.20 Pathologically confirmed corticobasal degeneration (n=18) and progressive supranuclear palsy (PSP) (n=177) cases carried repeat lengths ranging from 2 to 22, with one clinically diagnosed ‘typical’ PSP case with family history of dementia and parkinsonism carrying 27 repeats.21 In sporadic ALS, repeat length and allele frequency distribution appear to be similar between cases and controls.12 13 22–28 This was similarly seen in familial ALS,26 28 29 but a Japanese study reported slightly longer repeat lengths in familial ALS non-expansion samples (5.90±3.91 repeats) carrying the risk ‘A’-allele compared with those without (3.36±2.13 repeats).27 In sporadic ALS, no correlation between lengths of the shortest, longest and sum of both alleles without the repeat expansion with age of onset, survival or region of disease onset was seen.24
Nineteen studies of PD and atypical parkinsonism cases were identified. Nine PD studies reported range of repeat lengths similar between PD cases and controls,14 16 17 30–35 with no correlation between repeat length and family history or age of disease onset.35 No evidence of association between repeat length and age-at-onset or risk of PD, essential tremor (ET) or restless legs syndrome after correction for multiple testing (p<0.0028) was seen,31 and the length of the longest non-expanded repeat allele did not correlate with age at onset (p=0.11) in PD.32 None of 29 LRRK2 mutation carriers screened for C9orf72 had an expanded or intermediate allele (<11 repeats).16 Twenty-three cases of classical PD harbouring larger-than-normal alleles (21–38 repeats) have been reported,14 16 34 36 including two with severe dementia16 and one young-onset ‘typical’ PD,14 suggesting that C9orf72 intermediate repeats may contribute to risk for PD and ET plus parkinsonism.34 However, intermediate allele frequency appeared to be similar to that of controls (2.2% of PD vs 2.9% in controls).30 Using a lower cut-off of <20 for intermediate repeat length, Nuytemans et al reported more intermediate alleles in PD (2%) compared with controls (0.3%), with intermediate repeats significantly associating with increased PD risk (p=0.008) at an OR of 9.6, but with an extremely wide CI due to rarity of the intermediate repeat. 
Thirteen intermediate repeat patients demonstrated a classical PD phenotype responsive to levodopa, with no suggestion of ALS or family history of ALS.34 Overall, the small number of cases reported and the occurrence of intermediate repeats in some controls suggest that C9orf72 intermediate repeats are most likely a susceptibility factor or arise from phenotypic heterogeneity, rather than being causal for PD.34 This is consistent with the absence of intermediate or expanded repeats in an autopsy-confirmed PD data set.37 Overall, only a small number of large expansions (>30 repeats) have been found in PD, suggesting that >30 repeats are not a common cause of PD (0.2% of 3500 tested).37 Finally, a meta-analysis by the Genetic Epidemiology of Parkinson’s disease (GEo-PD) consortium cohort of 7494 patients with PD revealed a small increase in PD risk with an increasing number of C9orf72 repeats, but no robust association was detected. In this large cohort, the 10-unit repeat and pooled alleles of ≥17 repeats showed a suggestive association with increased PD risk, but this did not reach significance after correction for multiple testing.38
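The very wide CI around the OR reported for intermediate repeats in PD follows directly from the rarity of the allele. The Python sketch below uses the standard log (Woolf) method for a 2×2 table to show this. The counts in the usage note are hypothetical, chosen only to mimic a rare allele, and are not taken from any study cited in this review; the function name is likewise an assumption.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf log method).

    All four cells must be positive; with a zero cell a continuity
    correction or an exact method would be needed instead.
    """
    cells = (case_carriers, case_noncarriers,
             control_carriers, control_noncarriers)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cell counts must be > 0")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): dominated by the smallest cell count
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 10 carriers among 500 cases, 1 among 500 controls
or_, lo, hi = odds_ratio_ci(10, 490, 1, 499)
```

With only a single carrier among the hypothetical controls, the term 1/c dominates the standard error, so the upper CI bound is many times larger than the lower bound even though the point estimate is high.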
In nine atypical parkinsonian disease studies, including pathologically confirmed multiple system atrophy (MSA)21 39 and diffuse Lewy body dementia (DLB)40 41 cases, pathogenic expansions were rare, and the repeat range of non-expansion alleles was similar to that of controls.14 17 21 30 33 39 42 Two pathologically diagnosed DLB samples carried 32 repeats and one carried 33 repeats, although no details were provided regarding presence or absence of TDP43 pathology in these patients.41 Intermediate repeats (20–29 units) were significantly more frequent in atypical parkinsonism cases than controls (p<0.034) and were seen in four (4.3%) female patients (with 20, 22, 23 and 28 repeats), three of whom had non-classical atypical parkinsonism.42 PSP patients with intermediate repeat lengths of 26 and 30 were reported, but it is unclear if the repeat expansions and clinical symptoms are related.33
Five studies of Alzheimer’s disease (AD), including one study with 80% of subjects under age 65,43 were included.16 32 43–45 Within the normal and intermediate repeat range, higher repeat numbers did not increase risk for AD,16 32 or interact with apolipoprotein E (APOE) genotype.44 No significant difference in the frequency of short (<7) and intermediate (>7) repeats was seen between AD and mild cognitive impairment (MCI) subjects and controls; MCI cases with intermediate repeats were not at greater risk for AD,45 and intermediate allele sizes had no influence on cerebrospinal fluid core AD biomarkers.45
Intermediate C9orf72 alleles may be associated with a higher frequency of psychiatric symptoms
Four FTD families were reported to carry 20-repeat or 22-repeat alleles associated with the surrogate founder haplotype, segregating consistently in all affected siblings, with unaffected siblings carrying wild-type alleles (two to nine repeats).8 Most of the nine carriers with >20 to >30 repeats had extended periods of psychiatric symptoms and subjective cognitive complaints before clear neurological deterioration. In other studies, a 29-repeat allele was found in a patient from a Dutch family with C9orf72-related FTD-ALS,20 and four patients with limb-onset ALS with intermediate repeats (range 20–22) had lower mean (SD) age at diagnosis compared with patients with <20 repeats (47.6±15.9 vs 62.8±11.2 years).9 Both patients in the latter study with 22 repeats exhibited cognitive and behavioural impairment similar to that seen with larger expansions and had family history of FTD, while two patients with 20 and 21 repeats had family history of dementia and psychiatric illness. In atypical PD or PD with psychosis, intermediate expansions (20–29) were detected in three female cases with atypical parkinsonian syndromes (severe rigid akinetic parkinsonism) and neuropsychiatric symptoms including schizoaffective psychosis and an FTD-like dementia.42 Unfortunately, these cases did not have a pathological diagnosis.
In schizophrenia, the pathogenic repeat expansion was detected in only two patients (0.67%), but the estimated number of repeats in controls appeared lower (2–5 units) than in schizophrenic patients without the expansion (2–30 units).46 Only 4/130 (3%) patients with psychosis from the Northern Finland Birth Cohort had intermediate repeat sizes (17–26 units).47 An Irish psychosis cohort found seven samples with >22 repeats including two schizophrenia cases (with 28 and 27 repeats) and five controls (with 23–26 repeats), but with no overall evidence of association between repeat length and schizophrenia.48 Unfortunately, detailed phenotypic description of these intermediate carriers remains of variable quality across studies, both for motor and non-motor signs. Overall, it remains possible that the pathological cut-off for C9orf72 expansions is disease dependent, modulated by unknown or individual genetic factors, with intermediate allele sizes potentially predisposing to neuropsychiatric sequelae.
Repeat lengths in the normal and intermediate range may be haplotype dependent
The full background 82-SNP ‘founder’ haplotype (r) from which the C9orf72 expansion arose is present in almost all populations studied in the ‘1000 Genome’ database (~15% of European controls).49 Nearly 50% of individuals with non-(r) haplotypes carry up to two copies compared with only 5% of those with the (r) haplotype, who carry eight repeats or more. In Irish psychosis patients, 78% of samples with >6 repeats carried the founder haplotype compared with only 3% with <6 repeats.48 A meta-analysis of genome-wide association study data from five European populations reduced the ‘r’ haplotype to a common 20-SNP ‘risk’ haplotype (rs3849942, allele ‘A’) in significant association with FTD and ALS,50 and controls have shown significantly longer median repeat length (eight units) on the ‘risk’ haplotype compared with the wild-type haplotype (rs3849942 allele ‘G’; 2 units).1 In PD, the rs3849942 allele was confirmed in 100% of all intermediate carriers, with the tagging (T) allele seen in 95% of all individuals with ≥8 repeats, versus 10% in those with <8 repeats.34 Pathologically confirmed PD cases identified the T-allele in 40% of patients, and >90% of T-allele carriers carried >8 repeats.37 Japanese ALS patients carrying at least one rs3849942 allele had significantly longer repeats than those without.27
Possible mechanistic pathways involving intermediate alleles
Intermediate alleles are ‘predisposing’ alleles that lead to large expansions over many subsequent generations
C9orf72 alleles at <20 repeats appear stable between generations, while small expansions (20–150 repeats) are susceptible to unfaithful inheritance4 or somatic instability (figure 1).20 Intermediate repeats may be prone to gaining and losing copies as opposed to longer repeats whose length mostly increases.34 In French control samples, the largest repeat (22 units) changed size twice in the same family with all intergenerational changes occurring from a starting length of >10 repeats on the same rs3849942A ‘risk’ haplotype, suggesting that either repeat length and/or risk haplotype confer instability.4 Increased frequency of short indels in the GC-rich low complexity sequence adjacent to the expanded repeat in C9orf72 carriers suggests that subsequent pathological expansion may be due to replication slippage.51 However, the younger generation in 20 families of index patients with intermediate alleles of up to 22 units did not show longer intermediate repeat lengths or pathological expansions, suggesting that intermediate length might be ‘predisposing’ rather than ‘pre-mutation’ alleles for further stepwise expansion over many generations.51 Large-scale multigenerational cohorts will be required to determine how many repeats constitute such an allele.
Intermediate alleles may display increased methylation but at lower levels than large expansions
Hypermethylation of the CpG island 5′ to the C9orf72 GGGGCC repeat has been associated with expansion in ALS, but methylation changes have not been detected in either normal or intermediate alleles (up to 43 repeats), which questioned the cut-off of 30 repeats for pathogenicity.52 Comparison of methylation levels of homozygous normal intermediate repeat carriers (I/I) (7–24 units) with homozygous normal short repeat carriers (S/S) (2–6 units) in ALS and FTD-ALS demonstrated a slight but significant increase in methylation from I/I samples compared with S/S samples in both patient and control groups, with higher methylation levels in I/I carriers derived mostly from carriers with at least one allele of >8 units. Intermediate wild-type alleles of expansion carriers were also significantly more methylated than normal short wild-type alleles (p<0.0001). Notably, 5′ CpG and GGGGCC methylation was significantly higher in brain than in the blood of patients with normal repeats (p<0.0001).53 Given that the degree of methylation remained very low (<5%) in samples without long expansions versus ~10% in expansion carriers, further work needs to be done to elucidate any epigenetic modulation in intermediate C9orf72 allele carriers and the minimum repeat cut-off required for this to occur.
Intermediate alleles affect normal transcriptional activity of the C9orf72 promoter
In vitro reporter gene expression studies have shown significantly decreased transcriptional activity of the C9orf72 promoter with increasing number of normal and intermediate repeats, with a 24-unit containing promoter showing >50% reduction in transcription compared with 2 units on the wild-type allele.51 In human kidney and neuroblastoma cell lines (HEK293T cells), intermediate repeats (7–24) have shown significantly decreased C9orf72 promoter activity compared with normal short repeats (2–6 units),53 with lower transcriptional activity appearing more prominent in the presence of small deletions flanking the GGGGCC unit. Overall, these findings suggest that increasing repeat length can lead to decreased promoter activity via increased methylation of CpG sequences in larger repeats, with transcriptional silencing of the promoter.53
Minimum repeat length in blood corresponding with large expansions in CNS tissue
The minimum repeat length required for disease is not known due to somatic instability, with variation in length between tissues, even within the same individual, complicating genotype–phenotype studies (table 1).54 Subtle differences in repeat length within the same patient and between monozygotic twins strongly suggest that random, stochastic expansion events may occur during cell division, resulting in somatic and/or germline mosaicism contributing to intraindividual and interindividual repeat variation.55 Limited data show that repeat lengths from ≤15 to 27 in the blood do not show somatic instability.56 57 Postmortem studies involving a wide range of CNS tissue examination in cases with large-normal and intermediate repeat lengths in peripheral blood are needed.
Variation of large-normal and intermediate repeat lengths by ethnicity
In the GEO-PD consortium, Asian controls appeared to carry smaller normal repeat lengths (7–14) compared with Caucasian controls (0–32).38 Caucasian alleles frequently contain up to 43 repeats,4 52 but in Chinese samples, >12 repeats are rarely observed.25 The European haplotype does exist in Asian cohorts25 27 49 but at a lower frequency (<2% in Chinese controls compared with 9% in European controls).25 Data from the 1000-genomes project estimate the frequency of the European founder haplotype in 15% of Europeans compared with only 0.5% in Asians.49 Two Han Chinese ALS expansion carriers had methylation of the upstream CpG island but did not carry the founder haplotype,25 implying that C9orf72 alleles may undergo dynamic mutation independent of the European haplotype. This ‘Chinese haplotype’ was also associated with ≥8 repeats (20% vs 8% in other haplotypes), implying that both the Scandinavian and Chinese haplotypes may confer repeat length instability.25 It remains possible that the haplotype may be genetic background for expansion carriers and not necessarily associated with the expansion, as none of 53 FTD patients recruited in the USA with diverse ethnic backgrounds shared the exact same Scandinavian risk haplotype.11 Multicentre collaborative efforts pooling Asian samples are needed to determine if the lower incidence of pathogenic C9orf72 expansions in multiple Asian cohorts is due to shorter Asian repeat lengths that may be at lower risk for instability.
‘Critical’ repeat size required for initiation of neurodegeneration remains unknown
Toxic gain-of-function by dipeptide repeat proteins (DPRs) and/or RNA foci generated by unconventional repeat-associated non-ATG translation and TDP43 inclusions remains a possible mechanism for C9orf72 pathogenesis.3 Screening of cognitively normal cases from the Queen Square Brain Bank for C9orf72 expansions found one 84-year-old subject with intermediate repeats (20–40) showing TDP43-positive lesions, sense and antisense RNA foci in the frontal cortex and p62-positive inclusions containing all five DPRs, but more sparsely compared with FTD cases with large expansions.58 DPR inclusions and RNA foci were, however, notably absent in two pathologically normal cases with 20 repeats, suggesting that this length may not be long enough to trigger the entire disease cascade necessary for TDP43 pathology. TDP43 formation may also represent an ‘age-related’ phenomenon, and pathological studies have found no expansion or intermediate alleles (20–29 repeats) among LRRK2 G2019S carriers and AD cases who had concomitant TDP43-positive inclusions.16
Effect of homozygosity versus heterozygosity of normal and intermediate alleles
In pathogenic expansion carriers, repeat size of the second allele appears to be within the normal range (2–11) and unlikely to contribute to disease given that no variability in the regions flanking the hexanucleotide repeat was found.16 Homozygous intermediate repeat carriers have been reported more frequently in FTD subjects compared with controls (6.1% vs 4.6%),51 but in PD cases with >20 repeats, all but two of 407 intermediate repeat carriers were heterozygous for the intermediate allele and a low copy number allele, with only one sample homozygous for intermediate repeats. This proband carried three different maximum intermediate repeat alleles in two offspring and herself, developing ET at age 20 with mild parkinsonism at 97 years old.34 As described earlier, higher methylation levels were seen in homozygous intermediate allele carriers compared with homozygous short allele carriers.53
Future directions and conclusions
Further steps for elucidating the uncertain role(s) of intermediate C9orf72 hexanucleotide repeats include (table 2): (1) common centralised genotyping methodology for comparing allele size accurately. The common rp-PCR method cannot reliably distinguish between repeat sizes >30–60 units2 and is inadequate when used without confirmation by Southern blot, as confirmed by a masked study at 14 laboratories showing lack of accuracy when rp-PCR was used in isolation.59 Southern blot analysis, however, is affected by age and somatic heterogeneity, hence for smaller range repeats, sequencing to determine the actual size may be more useful and will also provide the actual DNA composition of the repeat; additionally, Southern blot results should be interpreted in conjunction with recent improved rp-PCR techniques60; (2) deciphering DNA composition within expanded repeats will be important, as the ‘critical repeat size’ may not be key, but rather the exact composition within the expanded repeats, including the presence or absence of stabilising interruptions; (3) developing novel genotyping methods for elucidating the role of other GGGGCC loci and repeat expansions in the human genome as well as other pathogenic genes in modulating the risk of C9orf72-linked disorders; (4) developing a fast comprehensive systematic screen for repeat expansions on a genome-wide level; (5) development of novel cellular and in vivo models given that current ones do not fully recapitulate clinical disease. The wide range of pathogenic GGGGCC repeat sizes and the large size of the repeat region, combined with the fact that it is entirely GC makes the generation of cell models technically difficult. Patient-derived induced pluripotent stem cell neurons harbouring different repeat sizes and transgenic non-human primates can provide additional insights.
Many studies involved retrospectively collected samples or were based on limited clinical information, rendering clinical phenotyping of intermediate allele carriers less accurate. Efforts to ensure detailed clinical and neuropsychological profiling in symptomatic and asymptomatic carriers and healthy controls remain vital; accurate phenotype–genotype correlations within large multigeneration families can also aid in minimising the effect of confounders and genetic background heterogeneity. Finally, in genetic association studies, sample size remains critical. Thus, a global multicentre approach with standardised clinical and neuroimaging protocols and a centralised genotyping laboratory will aid in identifying rare associations and minimising diagnostic errors, as well as in investigating differences according to ethnicity, potentially unravelling new clues concerning the role of intermediate C9orf72 alleles.
Contributors A SL Ng conducted the literature search and drafting of the manuscript. EK Tan revised the manuscript for intellectual content and provided the outline for the draft. A SL Ng and EK Tan drafted and approved the manuscript.
Funding A SL Ng is supported by the Singapore Ministry of Health’s National Medical Research Council (NIG grant) and the SingHealth Foundation Award (PRISM grant). EK Tan is supported by the Singapore Ministry of Health’s National Medical Research Council (STaR and Parkinson’s Disease TCR grants).
Competing interests None declared.
Provenance and peer review Commissioned; externally peer reviewed.