Short report
Reduced penetrance of gene variants causing amyotrophic lateral sclerosis
Andrew G L Douglas1,2,3, Diana Baralle4

  1. Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
  2. Human Development and Health, University of Southampton Faculty of Medicine, Southampton, UK
  3. Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
  4. Human Genetic and Genomics, University of Southampton, Southampton, UK

Correspondence to Dr Andrew G L Douglas, Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, UK; andrew.douglas{at}


Background Amyotrophic lateral sclerosis overlaps aetiologically and genetically with frontotemporal dementia and occurs in both familial and apparently sporadic forms. The most commonly implicated genes are C9orf72, SOD1, TARDBP and FUS. Penetrance of disease-causing variants in these genes is known to be incomplete, but has not been well studied at population level.

Objective We sought to determine the population-level penetrance of pathogenic and likely pathogenic variants in genes commonly causing amyotrophic lateral sclerosis.

Methods Published epidemiological data for amyotrophic lateral sclerosis and frontotemporal dementia were used to calculate expected frequencies of disease-causing variants per gene at population level. Variant data from gnomAD and ClinVar databases were used to ascertain observed numbers of disease-causing variants and to estimate population-level penetrance per gene. Data for C9orf72 were obtained from the published literature.

Results Maximum population penetrance for either amyotrophic lateral sclerosis or frontotemporal dementia was found to be 33% for C9orf72 (95% CI 20.9% to 53.2%), 54% for SOD1 (95% CI 32.7% to 88.6%), 38% for TARDBP (95% CI 21.1% to 69.8%) and 19% for FUS (95% CI 13.0% to 28.4%).

Conclusion Population-level penetrance of amyotrophic lateral sclerosis disease genes is reduced. This finding has implications for the genetic testing and counselling of affected individuals and their unaffected relatives.

  • Motor Neuron Disease
  • Genetic Predisposition to Disease
  • Dementia
  • Genetics, Population

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See:



Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease affecting upper and lower motor neurons within the brain and spinal cord.1 Despite significant advances in the understanding of its molecular pathogenesis, ALS remains an incurable condition with a peak of incidence between 60 and 75 years and a median survival of only 2–4 years after onset.2 3 Between 5% and 10% of cases are reported to be familial, displaying autosomal dominant inheritance with incomplete penetrance.4 5 Pathogenic variants in over 30 different genes have been implicated as causing familial ALS (fALS), with 4 genes in particular (C9orf72 (MIM: 614260), SOD1 (MIM: 147450), TARDBP (MIM: 605078) and FUS (MIM: 137070)) contributing to 55% of European ancestry ALS families.6 7 However, a notably consistent finding has been the identification of pathogenic variants in the same genes contributing to apparently sporadic ALS (sALS), with around 7% of European cases linked to C9orf72, SOD1, TARDBP and FUS.7

ALS has a worldwide incidence of 1.75 per 100 000 and a lifetime risk of 1 in 347 for men and 1 in 436 for women.8 9 It overlaps substantially in its disease spectrum with frontotemporal dementia (FTD), with up to 23% of ALS patients also having FTD and 12.5% of FTD patients having ALS.10 11 A recent large cohort study across 9 European countries has calculated the incidence of FTD to be 2.36 per 100 000, although previous estimates have ranged from 2.7 to 4.1 per 100 000.12 13 Around 10% of FTD cases display clear autosomal dominant inheritance and the most commonly identified genes are C9orf72, MAPT and GRN, which again contribute to both familial FTD (fFTD) and sporadic FTD (sFTD) cases.14 15

The C9orf72 (GGGGCC)n repeat expansion is known to be over-represented within European populations, where it has been identified in 0.15%–0.2% of healthy controls.16 17 Given the relatively well-preserved proportions in contribution of C9orf72, SOD1, TARDBP and FUS to both fALS and sALS, it is reasonable to expect the presence of pathogenic variants in these and other ALS genes to be similarly over-represented within population cohorts and for disease penetrance to be reduced to a similar degree to C9orf72. In this study, we sought to estimate the expected frequencies of ALS-causing pathogenic variants at a population level based on their reported occurrences within familial and sALS cohorts and to compare this to the corresponding observed frequencies within a publicly available genomic database. Using these data, it has been possible to estimate the population-level penetrance of the most common ALS-causing genes.


Figures for ALS and FTD incidence and proportions of familial and sporadic cases caused by variants in specific genes were obtained from the published literature.6 15 Observed allele frequencies and carrier frequencies (pO) of pathogenic and likely pathogenic variants were calculated using the gnomAD v.2.1.1 database with reference to variant classifications provided by the ClinVar database.19 For each gene, the largest allele number from the set of pathogenic or likely pathogenic variants seen in gnomAD was used as the denominator for determining allele frequencies. The ‘non-neuro’ dataset selection was applied to gnomAD variants so as to omit data from individuals ascertained as having a neurological condition in neurological case/control studies. Variants were only counted as disease causing if they were listed as pathogenic or likely pathogenic in ClinVar (variants were disregarded if of uncertain significance or with conflicting interpretations or low-confidence flags).

If ALS genes were fully penetrant, the expected pathogenic variant carrier frequency in the population for a given gene (pE) would equal the lifetime risk of ALS (RA) multiplied by the sum of the proportion of cases that are fALS (FA) multiplied by the reported frequency of variants causing fALS (pFA) added to the proportion of cases that are sALS (1−FA) multiplied by the reported frequency of variants causing sALS (pSA). In order to account for the additional overlap in contribution to FTD of many ALS genes, an identically formulated expression can be added to this equation but using figures for lifetime risk of FTD (RD), proportion of fFTD (FFD) and sFTD (1−FFD) cases and the associated frequencies of causative variants in fFTD and sFTD (pFD and pSD). Thus, the expected frequencies of disease-causing variants can be calculated as follows:

$$p_E = R_A\left[F_A\,p_{FA} + (1 - F_A)\,p_{SA}\right] + R_D\left[F_{FD}\,p_{FD} + (1 - F_{FD})\,p_{SD}\right]$$

Penetrance (K) can then be estimated as the reciprocal of the ratio of observed to expected carrier frequencies, that is, the expected carrier frequency divided by the observed carrier frequency:

$$K = \frac{1}{p_O / p_E} = \frac{p_E}{p_O}$$

In the calculations of penetrance undertaken, lifetime risks of 1 in 386.4 for ALS (combining both male and female incidence) and 1 in 305 for FTD have been used and fALS and fFTD are taken to represent 1 in 10 cases. The lifetime FTD risk has been estimated based on the upper limit incidence of 2.7–4.1 per 100 000, which if applied to an average population lifespan of 80 years would equate to 1 in 305–463 people.
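The lifetime risk figures quoted above can be reproduced with a short Python sketch. It rests on two assumptions that the text does not state explicitly: that the combined ALS risk is the simple mean of the male and female risks, and that FTD lifetime risk is approximated as annual incidence multiplied by lifespan, with no adjustment for age structure or competing mortality. The function names are illustrative only.

```python
def mean_lifetime_risk(risk_men, risk_women):
    """Average two sex-specific lifetime risks (assumes a 50:50 sex ratio)."""
    return (risk_men + risk_women) / 2


def lifetime_risk_from_incidence(incidence_per_100k, lifespan_years):
    """Approximate lifetime risk as annual incidence times lifespan.

    Assumes constant incidence at every age and ignores competing mortality.
    """
    return incidence_per_100k / 100_000 * lifespan_years


# ALS: lifetime risks of 1 in 347 (men) and 1 in 436 (women)
als_risk = mean_lifetime_risk(1 / 347, 1 / 436)

# FTD: incidence range of 2.7 to 4.1 per 100 000 over an 80-year lifespan
ftd_risk_upper = lifetime_risk_from_incidence(4.1, 80)
ftd_risk_lower = lifetime_risk_from_incidence(2.7, 80)

print(f"ALS: 1 in {1 / als_risk:.1f}")  # ALS: 1 in 386.4
print(f"FTD: 1 in {1 / ftd_risk_upper:.0f} to 1 in {1 / ftd_risk_lower:.0f}")  # FTD: 1 in 305 to 1 in 463
```

Under these assumptions the sketch returns the same 1 in 386.4 and 1 in 305–463 figures used in the calculations.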


The results of this study are summarised in table 1 and figure 1. For SOD1, reported in 12% of fALS and 2% of sALS cases, disease-causing variants are expected to occur in 1 in 12 880 people. However, 1 in 6938 people in gnomAD were found to be heterozygous for such variants. Of note, the relatively common p.(Asp91Ala) variant responsible for a recessive form of ALS was excluded on account of its conflicting interpretations of pathogenicity on ClinVar (though no homozygotes were present). This result therefore suggests a maximum penetrance for SOD1 of 54% at population level. For TARDBP and FUS, both reported to account for 4% of fALS, 1% of sALS and 1% of fFTD, disease-causing variants are expected to occur in 1 in 27 084 people. However, such variants are actually observed in at least 1 in 10 406 and 1 in 5202 people, respectively, giving TARDBP an estimated population penetrance of 38% and FUS 19%.
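As a check on the arithmetic, the expected carrier frequencies and penetrance estimates for SOD1, TARDBP and FUS can be recomputed from the inputs given in the text. This is a minimal Python sketch and not the authors' own code; the function and variable names are illustrative, and the observed frequencies are taken directly from the figures quoted above rather than recounted from gnomAD.

```python
R_ALS = 1 / 386.4   # lifetime risk of ALS used in the text
R_FTD = 1 / 305     # lifetime risk of FTD used in the text
FAMILIAL = 0.10     # proportion of cases taken to be familial (ALS and FTD)


def expected_frequency(p_fals, p_sals, p_fftd=0.0, p_sftd=0.0):
    """Expected carrier frequency if the gene were fully penetrant."""
    als_term = R_ALS * (FAMILIAL * p_fals + (1 - FAMILIAL) * p_sals)
    ftd_term = R_FTD * (FAMILIAL * p_fftd + (1 - FAMILIAL) * p_sftd)
    return als_term + ftd_term


# Gene contributions and observed carrier frequencies (1 in N) from the text
genes = {
    "SOD1": {"p_fals": 0.12, "p_sals": 0.02, "p_fftd": 0.0, "observed_one_in": 6938},
    "TARDBP": {"p_fals": 0.04, "p_sals": 0.01, "p_fftd": 0.01, "observed_one_in": 10406},
    "FUS": {"p_fals": 0.04, "p_sals": 0.01, "p_fftd": 0.01, "observed_one_in": 5202},
}

results = {}
for name, g in genes.items():
    p_e = expected_frequency(g["p_fals"], g["p_sals"], g["p_fftd"])
    p_o = 1 / g["observed_one_in"]
    results[name] = (round(1 / p_e), round(100 * p_e / p_o))
    print(f"{name}: expected 1 in {results[name][0]}, penetrance {results[name][1]}%")

# SOD1: expected 1 in 12880, penetrance 54%
# TARDBP: expected 1 in 27084, penetrance 38%
# FUS: expected 1 in 27084, penetrance 19%
```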

Table 1

The expected and observed frequencies of disease-causing variants in ALS-related genes and the maximum estimated disease penetrance at population level

Figure 1

Expected and observed deleterious allele frequencies for genes commonly involved in amyotrophic lateral sclerosis (ALS).

Although gnomAD does not include data on the C9orf72 repeat expansion, a similar calculation can be performed using previously reported figures for its prevalence.16 17 Based on the contribution of C9orf72 in up to 40% of fALS, 7% of sALS, 25% of fFTD and 6% of sFTD, the expected carrier frequency would be 1 in 1903 people. However, its reported prevalence in 5/2585 and 11/7579 healthy controls (totalling 16/10 164 or 1 in 635 people) would suggest a maximum population penetrance of 33%.
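The C9orf72 estimate can be reproduced in the same way, pooling the two control cohorts to obtain the observed carrier frequency. This is a rough Python sketch using only the figures quoted in the text; the variable names are illustrative, and simple pooling of the two cohorts is assumed.

```python
R_ALS = 1 / 386.4   # lifetime risk of ALS used in the text
R_FTD = 1 / 305     # lifetime risk of FTD used in the text
FAMILIAL = 0.10     # proportion of cases taken to be familial

# Expected carrier frequency under full penetrance:
# 40% of fALS, 7% of sALS, 25% of fFTD and 6% of sFTD
p_expected = (
    R_ALS * (FAMILIAL * 0.40 + (1 - FAMILIAL) * 0.07)
    + R_FTD * (FAMILIAL * 0.25 + (1 - FAMILIAL) * 0.06)
)

# Observed carrier frequency: pool the two healthy control cohorts
carriers = 5 + 11
controls = 2585 + 7579
p_observed = carriers / controls

penetrance = p_expected / p_observed

print(f"expected: 1 in {1 / p_expected:.0f}")           # expected: 1 in 1903
print(f"observed: {carriers}/{controls}, 1 in {1 / p_observed:.0f}")  # observed: 16/10164, 1 in 635
print(f"penetrance: {100 * penetrance:.0f}%")           # penetrance: 33%
```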


Individuals predisposed to late-onset neurodegenerative diseases such as ALS are subject to age-related cumulative risks of developing these disorders. Genetically determined forms of ALS will therefore always exhibit some degree of incomplete penetrance owing to confounding variables that may affect the lifespan of at-risk individuals. This can often be seen when assessing an affected individual’s family history, for example, where an intervening relative who is an obligate gene carrier has died at a relatively young age and thus would appear to have been unaffected, potentially leading to underestimation of penetrance. To complicate matters further, there is currently no reliable way to ascertain whether or not a deceased at-risk individual (eg, a carrier of an apparently pathogenic SOD1 variant) would have gone on to develop neurodegeneration. This makes measuring the true penetrance of genes predisposing to ALS problematic.

This study has sought to estimate ALS gene penetrance by reference to published and publicly available information. The calculated penetrance estimates depend heavily on the accuracy of the epidemiological figures employed. A number of additional inherent limitations also apply. First, there are undoubtedly multiple and substantial non-genetic and environmental influences on ALS and we have not sought to address these factors in this purely genetic analysis. Second, this study assumes that the population in gnomAD is an accurate surrogate for the general population. Third, no allowance has been made for different rates of ALS gene variants across different ethnic groups, for which good evidence exists.7 Fourth, the analysis assumes that ClinVar annotations are accurate and does not take into account the substantial numbers of variants present in gnomAD that are not present in ClinVar and which therefore have not been classified. Finally, by performing analysis at the gene level, no consideration has been given to the possible variability in penetrance that might be attributable to specific gene variants.

In defence of the relatively simplistic approach used here, it should be noted that gnomAD v.2.1.1 does indeed have a large preponderance of data from European ancestry individuals.18 The ALS gene contribution figures used have also been largely ascertained with respect to European ancestry cohorts and so these are likely to be applicable.6 We have sought to use generally accepted figures for percentage gene contributions to ALS and FTD but recognise that uncertainty exists, particularly for figures under 1%, and so we have not included values below 1% and similarly have limited the analysis to only the four most commonly involved ALS genes. In addition, we have sought to maximise the expected number of variants (and hence to maximise the calculated gene penetrance) by using a fALS rate of 10% rather than 5% and by including lifetime risks of ALS (1 in 386) and FTD (1 in 305) that are towards the upper limits for what has been reported or calculated according to incidence. Furthermore, by only counting variants classified unambiguously on ClinVar, we believe that we have in fact underestimated the true frequencies of observed pathogenic and likely pathogenic variants, which if included would further increase the observed/expected variant ratios and therefore decrease the penetrance of each gene even further. By limiting our analysis to ALS and FTD, we have precluded the possibility of other forms of neurodegeneration being caused by these gene variants. However, widely accepted figures for such contributions are generally not available and we have therefore chosen to omit these. With regard to variant-level effects, such differences are indeed likely to exist and further research into the reasons for genotype-specific disease variability may prove fruitful in terms of understanding the pathogenic mechanisms underlying genetic forms of ALS and FTD. However, the results of this study would tend to caution against the automatic attribution of penetrance figures to individual variants, since the implication of this work is that penetrance is not an intrinsic characteristic of a given gene or variant but rather is modifiable and dependent on other genetic and non-genetic factors. In any case, the numbers of variants in the gnomAD cohort are too low to allow adequate analysis to this level of detail and this will only start to become possible with reference to genomic data cohorts that include millions of individuals.
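The effect of the familial rate on the estimate can be illustrated with SOD1. The following Python sketch, which is a sensitivity check added for illustration and not part of the reported analysis, recomputes the SOD1 penetrance with a fALS rate of 5% in place of 10%, holding the other inputs from the text fixed.

```python
R_ALS = 1 / 386.4            # lifetime risk of ALS used in the text
P_FALS, P_SALS = 0.12, 0.02  # SOD1 contribution to fALS and sALS
OBSERVED = 1 / 6938          # observed SOD1 carrier frequency in gnomAD


def sod1_penetrance(familial_rate):
    """SOD1 penetrance estimate for a given proportion of familial cases."""
    expected = R_ALS * (familial_rate * P_FALS + (1 - familial_rate) * P_SALS)
    return expected / OBSERVED


k_10 = sod1_penetrance(0.10)  # familial rate used in the study
k_05 = sod1_penetrance(0.05)  # lower familial rate, for comparison

print(f"fALS 10%: {100 * k_10:.0f}%")  # fALS 10%: 54%
print(f"fALS 5%: {100 * k_05:.0f}%")   # fALS 5%: 45%
```

With the lower familial rate, the expected carrier frequency falls and the penetrance estimate drops with it, which is consistent with the 10% figure yielding a maximum estimate.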

In this study, we have identified that pathogenic and likely pathogenic variants in commonly identified genes linked to ALS are over-represented in control populations and therefore display reduced penetrance at the population level. This finding is of relevance to understanding the aetiology and pathogenesis of ALS and FTD and raises the question of what factors, genetic or otherwise, are influencing the penetrance in this way. It also has important clinical implications for those undertaking genetic testing of individuals affected by ALS and FTD and even more so for the predictive genetic testing of unaffected relatives. Now that clinical trials are taking place to target SOD1 and C9orf72 gene carriers, it is becoming critical to understand who will and who will not go on to develop disease and hence who is likely to benefit most and least from such treatments.20 Importantly, it should be noted that the penetrance figures reported here apply only to the population as a whole, not to the individual. It would therefore not be appropriate to counsel affected individuals and their relatives based solely on these figures. Individual carriers of disease-causing variants in these genes are likely to have individualised risk profiles and this should be considered carefully when making clinical decisions about predictive testing. Until such time as the other relevant factors influencing ALS gene penetrance become known and their relative contributions understood, an individual’s risk will likely best be determined by reference to his or her own family history of the disease.

Ethics statements

Patient consent for publication

Ethics approval

This research has been done solely using previously published information and free access to publicly available aggregate genomic data, which is not identifiable.


The authors would like to acknowledge all the contributors to the gnomAD and ClinVar databases, without which this study would not have been possible.



  • Correction notice This article has been corrected since it was published Online First. The funding statement has been amended for clarity.

  • Contributors Conceptualisation, resources and writing—review and editing: AGLD and DB. Data curation, formal analysis, investigation, methodology, project administration, software, validation, visualisation and writing—original draft: AGLD. Funding acquisition and supervision: DB.

  • Funding This research has been funded by a National Institute for Health Research (NIHR) Research Professorship awarded to DB (RP-2016-07-011). This work was supported by the NIHR Biomedical Research Centre, Oxford.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.