Original research
Evidence for polygenic and oligogenic basis of Australian sporadic amyotrophic lateral sclerosis
  1. Emily P McCann1,
  2. Lyndal Henden1,
  3. Jennifer A Fifita1,
  4. Katharine Y Zhang1,
  5. Natalie Grima1,
  6. Denis C Bauer2,
  7. Sandrine Chan Moi Fat1,
  8. Natalie A Twine1,2,
  9. Roger Pamphlett3,4,5,
  10. Matthew C Kiernan4,6,
  11. Dominic B Rowe1,7,
  12. Kelly L Williams1,
  13. Ian P Blair1
  1. Macquarie University Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, Sydney, New South Wales, Australia
  2. Transformational Bioinformatics, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales, Australia
  3. Discipline of Pathology and Department of Neuropathology, The University of Sydney, Sydney, New South Wales, Australia
  4. Brain and Mind Centre, The University of Sydney, Sydney, New South Wales, Australia
  5. Department of Neuropathology, Royal Prince Alfred Hospital, Sydney, New South Wales, Australia
  6. Institute of Clinical Neurosciences, Royal Prince Alfred Hospital, Sydney, New South Wales, Australia
  7. Department of Clinical Medicine, Faculty of Medicine and Health Sciences, Macquarie University, Sydney, New South Wales, Australia
Correspondence to Dr Ian P Blair, Biomedical Sciences, Macquarie University Faculty of Medicine and Health Sciences, Sydney, NSW 2109, Australia; ian.blair{at}mq.edu.au

Abstract

Background Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease with phenotypic and genetic heterogeneity. Approximately 10% of cases are familial, while remaining cases are classified as sporadic. To date, >30 genes and several hundred genetic variants have been implicated in ALS.

Methods Seven hundred and fifty-seven sporadic ALS cases were recruited from Australian neurology clinics. Detailed clinical data and whole genome sequencing (WGS) data were available from 567 and 616 cases, respectively, of which 426 cases had both datasets available. As part of a comprehensive genetic analysis, 853 genetic variants previously reported as ALS-linked mutations or disease-associated alleles were interrogated in sporadic ALS WGS data. Statistical analyses were performed to identify correlation between clinical variables, and between phenotype and the number of ALS-implicated variants carried by an individual. Relatedness between individuals carrying identical variants was assessed using identity-by-descent analysis.

Results Forty-three ALS-implicated variants from 18 genes, including C9orf72, ATXN2, TARDBP, SOD1, SQSTM1 and SETX, were identified in Australian sporadic ALS cases. One-third of cases carried at least one variant and 6.82% carried two or more variants, implicating a potential oligogenic or polygenic basis of ALS. Relatedness was detected between two sporadic ALS cases carrying a SOD1 p.I114T mutation, and among three cases carrying a SQSTM1 p.K238E mutation. Oligogenic/polygenic sporadic ALS cases showed earlier age of onset than those with no reported variant.

Conclusion We confirm phenotypic associations among ALS cases, and highlight the contribution of genetic variation to all forms of ALS.

  • genetics
  • motor neurone disease
  • molecular genetics
  • neurosciences

Introduction

Amyotrophic lateral sclerosis (ALS) is a late-onset fatal neurodegenerative disease characterised by the degeneration of both upper and lower motor neurons.1 Patients develop muscle weakness, wasting, spasticity and eventual paralysis. ALS displays significant phenotypic heterogeneity, even among cases with identical causal gene mutations. Disease onset can occur anywhere from the second to ninth decade of life but most frequently between 50 and 60 years of age.2 Around 70% of cases present with limb onset and 25% with bulbar onset, while rare cases show truncal onset.3 The median disease course is 3 years,2 however survival can be as short as 12 months and may exceed 20 years.2 4 Around 10%–15% of ALS cases are diagnosed with comorbid frontotemporal dementia (FTD), with up to 50% developing some cognitive impairment.5 Approximately 10% of ALS cases have a family history of disease (familial ALS (FALS)) and two-thirds of these cases carry a reported ALS gene mutation.6 7 The remaining 90% of cases have no apparent family history and are classified as sporadic ALS (SALS).

Extensive genetic heterogeneity is apparent among ALS cases. To date, at least 31 genes have been linked or associated with ALS. Heritability studies suggest that 40%–60% of SALS risk may be explained by genetic factors.8–10 A polygenic basis to SALS has been implicated by the co-occurrence of two or more ALS gene variants in a single individual.11–14 These variants likely interact with environmental factors to trigger the development of ALS.15 Alternatively, this may indicate an oligogenic disease model,11 where an ALS mutation of large effect, such as SOD1 or C9orf72, is inherited with another ALS gene variant which may also contribute to phenotype. Furthermore, a multistep hypothesis has recently been postulated to explain the late and apparently sporadic onset of ALS.16 This hypothesis postulates that six ‘steps’ are required to trigger ALS onset, where such steps may include genetic predisposition, environmental exposures or other unknown molecular alterations.16 Within this hypothesis, known genetic mutations may account for multiple steps, to a degree reflective of their penetrance.17

Previously, our laboratory described the genetic and phenotypic heterogeneity of Australian familial ALS.7 Here, we report the extent of phenotypic and genetic variation among Australian SALS cases. To interrogate the genetic architecture of Australian SALS, we first compiled a list of 31 genes, and >850 genetic variants previously reported as: 1) ALS-linked mutations (typically identified from family based studies); 2) functional risk alleles (variants whose functions are thought to increase disease susceptibility) or 3) otherwise associated variants (including variants that may be in linkage disequilibrium with unknown functional risk alleles). Among a large Australian SALS cohort, 35.39% of cases carried a variant previously reported in ALS, implicating 18 genes as contributing to disease in this population. Polygenic inheritance was implicated in 6.82% of SALS cases who harboured more than one ALS-implicated variant. We also showed that the clinical heterogeneity of Australian SALS is similar to that observed in cohorts from other populations.

Methods

Subjects

A total of 757 SALS cases were recruited from the Macquarie University Neurodegenerative Disease Biobank, Australian MND DNA Bank (Royal Prince Alfred Hospital) and Brain and Mind Centre (University of Sydney). All participants provided informed written consent as approved by the human research ethics committees of Macquarie University, Sydney South West Area Health District or The University of Sydney. All participants were of Caucasian descent (as established according to online supplementary file 1), each was clinically diagnosed with definite or probable ALS according to El Escorial criteria,18 and had no known relatives affected by ALS and/or FTD. Of these 757 cases, detailed clinical data were available for 567 individuals, and 616 cases underwent whole genome sequencing (WGS), with a total of 426 having both detailed clinical and WGS data available. Genomic DNA was extracted from peripheral blood using standard protocols. Fifteen of these SALS cases were previously reported to carry an expansion in C9orf72.7 14

Statistical analysis of clinical variables

Clinical records were examined for four phenotypic features: sex, age at disease onset, site of onset (bulbar or spinal) and duration of disease from onset (until death or last known date of survival). All statistical analyses were performed in R (V.3.5.1). Statistical analyses were performed in a pairwise fashion between all four clinical variables to identify significant associations. A χ2 analysis was performed between sex and site of onset, while Welch’s t-tests were performed between age of onset and both sex and site of onset. Kaplan-Meier survival analyses were performed between disease duration and both sex and site of onset. Additionally, a linear regression model was fitted between age of onset and duration (for deceased cases only). Multiple testing was accounted for using a Bonferroni corrected significance threshold of p<0.008, with α=0.05 and 6 comparisons.
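
The Bonferroni threshold quoted above follows directly from the number of pairwise tests among the four clinical variables. The sketch below reproduces that arithmetic in Python for illustration only; the study itself used R, and the variable names here are assumptions rather than the authors' code.

```python
from itertools import combinations

# The four clinical variables examined in the study (names are illustrative).
variables = ["sex", "age_at_onset", "site_of_onset", "disease_duration"]

# Every unordered pair of variables corresponds to one statistical comparison.
comparisons = list(combinations(variables, 2))

alpha = 0.05
bonferroni_threshold = alpha / len(comparisons)

print(len(comparisons))                # number of pairwise comparisons
print(round(bonferroni_threshold, 4))  # threshold, reported as p<0.008 in the text
```

Four variables give six unordered pairs, so the family-wise α of 0.05 is divided by six.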

We also assessed whether the number of ALS-implicated variants carried by an individual influenced their clinical presentation. Multinomial logistic regression analysis was performed for each clinical variable to compare cases carrying two or more ALS-implicated variants with cases carrying only one ALS-implicated variant or no ALS-implicated variants.

Generation of WGS data

DNA samples underwent library preparation using the TruSeq PCR-free library preparation kit (Illumina, V.2.5). Prepared libraries then underwent multiplex 150 bp paired-end sequencing using an Illumina HiSeq X Ten instrument (Kinghorn Centre for Clinical Genomics, Sydney, Australia). Raw sequencing reads were processed using the genome analysis toolkit and the associated best practices.19–21 Detailed methodology can be found in online supplementary file 1.

Survey of reported genetic variants implicated in ALS

A list of 31 ALS genes was established (online supplementary table 1) based on the 28 ALS genes reported by Chia et al,22 with the addition of the recently reported ALS genes TIA1 23 and KIF5A,24 as well as the GPX3-TNIP1 locus.25 A comprehensive literature search was conducted in order to compile an exhaustive reference list of 853 genetic variants previously implicated in ALS (online supplementary file 2), including ALS-linked mutations, functional risk alleles and disease-associated variants. This involved performing a PubMed (https://www.ncbi.nlm.nih.gov/pubmed/) search for each ALS gene name and ‘amyotrophic lateral sclerosis’, with subsequent manual evaluation of all resulting publications. The ALSOD database (http://alsod.iop.kcl.ac.uk/home.aspx; last updated 8 September 2015)26 was also consulted.

Genetic variants were included in this reference list (online supplementary file 2) if they were predicted to alter the amino acid sequence of a protein (ie, non-synonymous/missense, frameshift and non-frameshift insertions/deletions, splicing variants) and were named in the main text of the publication (ie, variants included in an aggregate of variants used for burden testing were not added to the list). Details of genomic location (hg19), the transcript accession number and cDNA and protein changes were recorded for each variant. For variants where these details were not reported in the original publication(s), they were determined using the University of California Santa Cruz Variant Annotation Integrator (http://genome.ucsc.edu/cgi-bin/hgVai) and the associated Human Genome Variation Society variant nomenclature track. Ancillary details pertaining to the ancestry and ALS inheritance mode for cases carrying each variant were also recorded, as was the presence of each variant in unaffected and unrelated control individuals, where available in the original publication(s).

Identifying ALS-implicated variants in SALS cases

WGS data from 616 SALS cases were parsed for 31 ALS genes (online supplementary table 1) using custom UNIX scripts. Filtering using R was subsequently used to identify the previously reported ALS-implicated variants (online supplementary file 2) present within this dataset. R scripts were also used to identify cases that carried each variant and, conversely, which variants were carried by each case. Unless otherwise specified, all minor allele frequency (MAF) values used for comparisons were from the ethnically matched non-neuro (ie, excluding individuals with neurological disorders) non-Finnish European (NFE) subset of the Genome Aggregation Database (gnomAD; n=51 592).27
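
The two lookups described above (cases per variant and variants per case) are inversions of one another. A minimal Python sketch of that step is shown here under stated assumptions: the authors used R scripts that are not reproduced in the article, and the case identifiers and carrier assignments below are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical input: variant label -> IDs of cases carrying it.
# Case IDs and carrier assignments are invented for illustration only.
carriers_by_variant = {
    "SOD1 p.I114T": ["case_001", "case_002"],
    "OPTN p.M98K": ["case_002", "case_003"],
    "TARDBP p.A382T": ["case_004"],
}


def variants_by_case(carriers):
    """Invert a variant -> cases mapping into case -> sorted list of variants."""
    inverted = defaultdict(list)
    for variant, cases in carriers.items():
        for case in cases:
            inverted[case].append(variant)
    return {case: sorted(variants) for case, variants in inverted.items()}


per_case = variants_by_case(carriers_by_variant)

# Cases carrying two or more variants are the candidates for an
# oligogenic or polygenic basis of disease.
multi_variant_cases = sorted(c for c, v in per_case.items() if len(v) >= 2)
print(multi_variant_cases)
```

Keeping both directions of the mapping makes it straightforward to count carriers per variant and variants per carrier.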

C9orf72 GGGGCC and ATXN2 CAG repeat genotyping was performed on WGS BAM files using ExpansionHunter28 29 with default settings. These genotypes were interrogated and combined with the VCF data obtained above using R. Validation was performed for cases with expanded (>30 repeats) or intermediate (23–30 repeats) C9orf72 repeat lengths using repeat primed PCR,30 and those with intermediate ATXN2 repeat lengths (29–39 repeats) using conventional PCR.31 PCR products were analysed by fragment length analysis using an ABI 3730XL DNA Analyser (Applied Biosystems), and size analysis was performed using GeneMarker (V.3.0.1) software (SoftGenetics, Pennsylvania, USA).
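
The repeat-length ranges above can be expressed as a small classification step. The following Python sketch is illustrative only: the function names and the labels for lengths outside the stated ranges are assumptions, and the thresholds are taken from the text rather than from ExpansionHunter itself.

```python
from typing import Optional


def classify_c9orf72(repeats: int) -> str:
    """Classify a C9orf72 GGGGCC repeat count using the ranges given in the text."""
    if repeats > 30:
        return "expanded"
    if repeats >= 23:
        return "intermediate"
    # Label assumed: the text names only the expanded and intermediate ranges.
    return "below intermediate range"


def classify_atxn2(repeats: int) -> str:
    """Classify an ATXN2 CAG repeat count; the text defines only 29-39 repeats."""
    if 29 <= repeats <= 39:
        return "intermediate"
    # Label assumed: lengths outside 29-39 are not categorised in the text.
    return "outside intermediate range"


def validation_method(gene: str, repeats: int) -> Optional[str]:
    """Return the PCR validation the text describes for this genotype, if any."""
    if gene == "C9orf72" and classify_c9orf72(repeats) in ("expanded", "intermediate"):
        return "repeat-primed PCR"
    if gene == "ATXN2" and classify_atxn2(repeats) == "intermediate":
        return "conventional PCR"
    return None


print(validation_method("C9orf72", 25))
print(validation_method("ATXN2", 22))
```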

Association analysis

To determine whether any of the reported ALS-implicated variants were associated with SALS in Australia, allele counts were compared between SALS cases (n=616) and control individuals from the non-neuro NFE subset of gnomAD (n=51 592)27 using Fisher’s exact test in R. A Bonferroni corrected significance threshold of p<5.875×10−5 was applied to account for the 851 nucleotide-level variants tested for association with disease. Control genotyping data for repeat expansions in C9orf72 and ATXN2 were not available; therefore, association testing was not completed for these variants. Where an ALS-implicated variant was not reported in gnomAD, allele counts were determined by inferring the number of gnomAD controls covered at the site of interest, based on the number of individuals with genotypes for variants located within 40 bp. Data from SALS cases (n=4366) and control individuals (n=1832) from Project MinE32 were used for validation purposes.
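
The allele-count comparison and its corrected threshold can be sketched as follows. This is a minimal standard-library Python implementation of a two-sided Fisher's exact test, offered as an illustration rather than the authors' R code. The variant allele counts in the example are hypothetical, and the choice of a two-sided test is an assumption, as the text does not state the alternative hypothesis.

```python
from math import exp, lgamma


def log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denominator = log_choose(row1 + row2, col1)

    def prob(k: int) -> float:
        return exp(log_choose(row1, k) + log_choose(row2, col1 - k) - log_denominator)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    probabilities = [prob(k) for k in range(lowest, highest + 1)]
    # Small relative tolerance so ties with the observed table are included.
    return min(1.0, sum(p for p in probabilities if p <= p_observed * (1 + 1e-7)))


# Bonferroni threshold for the 851 nucleotide-level variants tested.
threshold = 0.05 / 851

# Hypothetical counts: 3 variant alleles among 616 cases (1232 alleles) and
# 5 variant alleles among 51 592 controls (103 184 alleles).
case_alt, control_alt = 3, 5
p_value = fisher_exact_two_sided(
    case_alt, 2 * 616 - case_alt,
    control_alt, 2 * 51592 - control_alt,
)
print(p_value, p_value < threshold)
```

Working with log-gamma values avoids overflow when the control allele count is in the hundreds of thousands.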

Identity-by-descent analysis in cases carrying identical ALS-implicated variants

Identity-by-descent (IBD) analysis was performed to investigate whether SALS cases that carried an identical ALS-implicated variant had inherited the variant from a recent common ancestor. Relatedness analysis was performed on WGS data using TRIBES with default parameter settings.33 Briefly, TRIBES filtered the WGS data according to quality control metrics, phased the resulting genotype calls and performed pairwise IBD inference. IBD segments were then examined to determine if SALS cases that carried an identical ALS-implicated variant were inferred IBD over that locus. Graphical networks were then produced using isoRelate34 to show putatively related cases that carried identical variants inherited from a common ancestor.

Results

Phenotypic variability of Australian SALS was consistent with that of European cohorts

The clinical dataset of 567 SALS cases comprised 61% males and 71% spinal onset cases. The average age of onset was 60 years (SD=12 years, Q1=54 years, median=62 years, Q3=69 years), and the average disease duration was 38 months (SD=19 months, Q1=26 months, median=35 months, Q3=47 months), at which time 62% of cases were deceased.

Statistical analyses were performed to identify any association between clinical variables including sex, age at disease onset, site of disease onset and disease duration (figure 1). A significant association was observed between sex and site of onset (p=0.00015, figure 1A), where females were more likely to develop bulbar onset while more males presented with spinal onset. Females also had a significantly shorter life expectancy than males (p=0.0079, figure 1C). While females tended to have a later age of onset (figure 1B) this result was not statistically significant. There was also evidence of a relationship between age of onset and disease duration (p=0.00092, figure 1D), with shorter disease duration in cases with a later onset. Cases with bulbar onset were more likely to have a later age of onset (p=0.00031, figure 1E) and reduced life expectancy (p<0.0001, figure 1F).

Figure 1

Statistical analyses of clinical variables for 567 sporadic amyotrophic lateral sclerosis (SALS) cases. P values highlighted in red are significant at the Bonferroni-adjusted significance level p<0.008. There was a significant difference in site of onset between males and females (A), where females presented with bulbar onset more often. Females tended to have a later age of onset (B), although this was not significant, and had a significantly shorter life expectancy (C). There was a significant association between age of onset and disease duration (D) and cases with bulbar onset had a later age of onset (E) as well as a shorter life expectancy (F).

More than 30% of Australian SALS cases carried a genetic variant previously implicated in ALS

We generated a comprehensive list of 853 genetic variants from 31 genes that were previously implicated in ALS by peer-reviewed publications between 1993 and February 2019 (online supplementary file 2). This included repeat expansions in C9orf72 and ATXN2, as well as 851 single nucleotide and insertion/deletion variants in 29 additional genes. Of the 853 variants implicated in ALS, 46 variants (from 20 genes) were identified across the cohort of 616 Australian SALS cases (table 1). This included three common population-based (MAF >0.2) intronic SNPs, ELP3 rs2614046 and rs6985069, and GPX3-TNIP1 rs10463311. Given their high frequency in both cases and controls, independent of their potential contribution to disease, they were not considered in any further analyses. After removal of these common SNPs, 43 ALS-implicated variants from 18 different genes were present among 218/616 (35.39%) SALS cases, the majority of which were heterozygous. This included 41 cases that carried a C9orf72 hexanucleotide repeat expansion (15 of which had been previously identified using repeat-primed PCR7 14) and 10 cases that harboured intermediate-sized repeat expansions in ATXN2. Three additional cases carried intermediate-sized hexanucleotide repeat expansions in C9orf72, although these cases were not considered C9orf72 expansion positive. The FALS-linked SOD1 p.I114T and the SOD1 p.D91A mutations were identified in three and one Australian SALS case(s), respectively. Missense TARDBP mutations p.G287S, p.G295S and p.A382T were each identified in a single SALS case and TARDBP p.I383V was present in two SALS cases. The remaining ALS-implicated variants identified in Australian SALS cases had more limited evidence to support their role in ALS and were observed in gnomAD non-neuro NFE controls. The relatively common variants (MAF >0.01), CCNF p.V714M, C21orf2 p.V58L, OPTN p.M98K and FUS c.833–29C>T, were each observed as both heterozygous and homozygous variants, and SPG11 p.S164L was exclusively observed in a homozygous state.

Table 1

ALS-implicated variants identified among 616 Australian SALS cases

NEK1 and SOD1 mutations showed significant association with Australian SALS

Fisher’s exact test determined that two variants, NEK1 p.S1036X (p=1.020×10−5) and SOD1 p.I114T (p=9.871×10−6), were significantly over-represented in Australian SALS cases compared with gnomAD non-neuro NFE controls. A further 14 variants were over-represented among cases with nominally significant p values (5.875×10−5<p<0.05) (table 1, online supplementary table 2). NEK1 p.S1036X showed nominal association with disease (p=6.47×10−4) in the Project MinE case-control cohort, although it did not reach the corrected significance threshold (p<5.875×10−5; table 1, online supplementary table 2).

Some Australian SALS cases who carried identical ALS-implicated variants may be distantly related

Of the 23 ALS-implicated variants that were identified in multiple SALS cases (table 1), IBD segments and an estimated degree of relatedness were determined between 18 pairs of cases over 8 different ALS-implicated variants (online supplementary table 3, online supplementary figure 1). Other than a pair of SOD1 p.I114T positive cases, these IBD segments ranged between 3 and 12 cM, corresponding to degree of relatedness between 8th and 11th degree. Notably, three trios shared IBD segments, one with each of SQSTM1 p.K238E (population MAF=0.003558, average IBD segment 9.08 cM), CCNF p.V714M (population MAF=0.01469, average IBD segment 4.57 cM) and DCTN1 p.T1249I (population MAF=0.004954, average IBD segment 3.49 cM). No IBD segments >3 cM were detected between the 41 SALS cases with a C9orf72 hexanucleotide repeat expansion, nor 10 cases with intermediate-sized ATXN2 expansions, or the two SALS cases who carried TARDBP p.I383V. As part of an extended IBD analysis, the three cases who carried SOD1 p.I114T were linked as sixth degree relatives to existing Australian FALS families, therefore re-classifying these apparently sporadic cases as misclassified FALS.35

Potential oligogenic/polygenic basis of ALS in Australian SALS

A total of 42/616 (6.82%) Australian SALS cases were found to harbour more than one reported ALS-implicated variant (table 2) (after excluding the three common variants found in this cohort). Of these 42 individuals, 38 carried two ALS-implicated variants and 4 carried three variants. The cases who carried additional ALS-implicated variants together with C9orf72 (n=17) or SOD1 (n=3) mutations may represent oligogenic inheritance. Notably, all three individuals who carried the known FALS-linked SOD1 p.I114T mutation carried at least one other ALS-implicated variant. Potential polygenic inheritance is implicated for all other cases who carried multiple disease-associated variants (with population MAF values between 0.000022 and 0.02815). SQSTM1 p.E274D and OPTN p.M98K were identified in 11 and 10 cases, respectively, including in an oligogenic state in 5 cases carrying a C9orf72 expansion.

Table 2

Summary of SALS cases who carried multiple ALS-implicated variants

Significant association between putative polygenic/oligogenic SALS cases and the age at disease onset

Multinomial logistic regression analysis showed that cases with a younger age of onset were more likely to carry multiple ALS-implicated variants (OR 0.964, p=0.025; online supplementary table 4). No other clinical variables were significantly associated with polygenic/oligogenic variation.
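
To make the reported odds ratio concrete, the sketch below shows how an OR below 1 compounds over several years of age at onset. It assumes the OR of 0.964 is expressed per one-year increase in age at onset; the text does not state the unit, so this is an interpretive assumption and the function name is illustrative.

```python
# Reported odds ratio; treating it as "per additional year of age at onset"
# is an assumption, since the text does not state the unit.
odds_ratio_per_year = 0.964


def odds_multiplier(years_later: float, or_per_unit: float = odds_ratio_per_year) -> float:
    """Multiplicative change in odds for onset `years_later` years later."""
    return or_per_unit ** years_later


print(round(odds_multiplier(10), 3))
```

Under that assumption, a case with onset 10 years later would have roughly 0.69 times the odds of carrying multiple ALS-implicated variants.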

Discussion

We sought to characterise the phenotypic and genetic heterogeneity of sporadic ALS by surveying clinical characteristics and the presence of sequence variants previously reported to be pathogenic, or associated with ALS, among a large cohort of Australian SALS cases. We also sought to determine whether cases harbouring multiple ALS-implicated variants exhibited more severe clinical characteristics. We demonstrated a high degree of genetic heterogeneity among Australian SALS, with 43 different variants from 18 ALS genes identified among 35.39% of cases. We showed that 6.82% of Australian SALS potentially have a polygenic or oligogenic underpinning to disease, and that patient clinical phenotype was likely influenced by the presence of more than one ALS-implicated variant. Other significant associations that were found between clinical variables in our Australian cohort were consistent with reported findings and confirm the phenotypic heterogeneity of SALS.36 37 This work highlights the genetic contribution and heterogeneity of SALS and suggests that polygenic and/or oligogenic mechanisms may be at play.

Our literature search revealed that 853 genetic variants have been implicated in ALS over the past 26 years. Of these, 43 variants from 18 genes were identified among 35.39% of Australian SALS. As these sporadic cases have no known family history of disease, this high prevalence of reported ALS-implicated variants may reflect unrecognised false positive variants in the literature. Indeed, the evidence supporting the pathogenicity of each ALS-implicated variant varies significantly. While many have strong support as pathogenic ALS mutations through genetic linkage studies and/or extensive segregation within families, others have less compelling evidence to support their role in ALS pathogenesis. For example, variants identified in single FALS or SALS cases, or very small ALS families, have been reported as pathogenic on the basis that they fall within an established ALS gene despite a lack of supporting segregation data. In addition, candidate variants identified prior to the widespread adoption of high-throughput sequencing technologies were typically screened through <100 control individuals. Now that large-scale control databases are available with next-generation sequencing data, it has become apparent that some reported putative ALS mutations are instead rare population-based variants, such as SOD1 p.N20S (online supplementary file 2). Beyond the variants analysed in the present study, it is important to note that unreported rare and novel genetic variants in the known ALS genes may also contribute to the development and phenotypic presentation of SALS.

Twelve ALS genes were found to harbour more than one unique variant in our cohort. Most notably, this included five variants in the known ALS gene, TARDBP. Four of these TARDBP missense mutations (p.G287S, p.G295S, p.A382T and p.I383V) have also been reported in additional SALS cohorts of various origins (online supplementary file 2) and are extremely rare in the healthy population (MAF <0.00002233), observations that support the pathogenic role of these variants. It is likely that these variants are low penetrance mutations, given their absence from FALS cohorts as established by the literature search conducted here. Alternatively, these may be de novo variants within mutation hot spots. Another possibility is they may be risk alleles, which is supported by our association analyses where all four TARDBP missense mutations showed nominal association with SALS (table 1). IBD analysis failed to detect a relationship between two Australian SALS cases who carried an identical TARDBP mutation (p.I383V) suggesting that it is either an old mutation or arose independently more recently.

One notable variant identified was the homozygous SPG11 p.S164L. The original report of this rare variant also identified a homozygous SALS case,38 although it was suggested this variant may be a benign polymorphism.38 No individuals in the gnomAD non-neuro control cohort (of any ancestry) are homozygous carriers of this variant. It is possible that homozygous SPG11 p.S164L confers increased susceptibility to ALS through loss-of-function.

Association testing identified just two variants that were significantly over-represented among Australian SALS cases when compared with controls. One of these was the FALS causal SOD1 p.I114T variant, although the putative SALS cases that carried this variant were subsequently found to be unrecognised FALS cases.35 The other over-represented variant, NEK1 p.S1036X, was previously reported in a single FALS case, although segregation was not demonstrated in the family.39 Together with our data, this suggests that NEK1 p.S1036X is an ALS susceptibility allele of moderate-to-high effect, with implications for other NEK1 loss-of-function variants.

We sought to determine whether SALS cases that shared the same rare ALS-implicated variants (MAF <0.03 in the general population) were distantly related and potentially represented unrecognised FALS cases. IBD analysis demonstrated that the trio of SALS cases that carried SQSTM1 p.K238E were likely to be distantly related, with shared IBD segments of at least 7.892 cM over SQSTM1 suggesting an estimated ninth degree of relatedness. Indeed, one pair within this trio shared an 11.442 cM IBD segment with relatedness estimated at the eighth degree. This relatedness suggests SQSTM1 p.K238E is a low penetrance FALS mutation. A sixth degree relationship was identified between two SOD1 p.I114T mutation carriers (online supplementary table 3; reported by Henden et al 35). The remaining putative relationships identified were at the eighth degree or more distant, with IBD segments between 3 cM and 8 cM. Given that TRIBES relationship estimates greater than seventh degree are only 13% accurate,33 35 and the shared ALS-implicated variants occur in the general population, it is less likely that these were related individuals. For the 41 SALS cases that carried a C9orf72 repeat expansion and 10 SALS cases with an ATXN2 intermediate expansion, no shared IBD segments >3 cM were identified.

An oligogenic and/or polygenic basis of ALS has been suggested by previous studies where multiple ALS gene variants have been identified in individual FALS cases11–14 and SALS cases,12 37 40 whereby variants may act together to cause ALS, or influence clinical manifestation. Our analysis showed 6.82% of SALS carried multiple ALS-implicated variants, further supporting an oligogenic/polygenic basis to ALS. In most reports of oligogenic ALS cases, one reported variant has been the pathogenic expansion of C9orf72.11–14 37 40 Seventeen of the 42 (40.48%) putative oligogenic cases in our cohort had a C9orf72 expansion. Over 40% of C9orf72-positive SALS cases in our cohort (n=41) had putative oligogenic disease. As described above, many of the ALS-implicated variants investigated here, and identified in a polygenic/oligogenic state, have limited evidence supporting their pathogenicity. Similarly, many variants identified in an oligogenic/polygenic state as part of other studies have limited evidence to support a pathogenic role. Functional studies will be necessary to determine whether particular variants act together to cause or modify the presentation of ALS.

It has been hypothesised that the development of ALS is a multistep process.16 Oligogenic and polygenic factors may underlie steps in the development of ALS and explain the highly variable phenotypic presentation and course of disease, including that seen between cases who carry identical causal mutations. As such, we sought to determine whether cases who harboured multiple ALS-implicated variants exhibited more severe clinical characteristics. We showed that patients with more than one ALS-implicated variant were significantly more likely to develop disease earlier in life than those with no known ALS-implicated variant. This is consistent with the concept that multiple ‘hits’ are required to trigger disease onset. It will be of interest for future studies to assess whether clinical features associate with specific mutant genes or with a specific variant or combination of variants. This will require large cohorts of SALS cases that carry mutations in the same ALS gene or identical ALS gene mutations.

Polygenic risk scores have been proposed as a tool to stratify patients with ALS, particularly SALS. Polygenic risk scores estimate an individual’s risk of developing disease, based on the number of risk variants they carry and the relative disease risk each imposes. Both the phenotypic and genetic heterogeneity of ALS complicate the design of clinical trials, and in turn, the application of therapeutics to the spectrum of ALS cases.2 4 Cohort stratification using polygenic risk scores has the potential to maximise the efficacy of therapeutic trials. While we were unable to calculate polygenic risk scores here due to insufficient sample size, our data, particularly our demonstration that patients carrying multiple ALS-implicated variants are more likely to have earlier disease onset, suggest that polygenic risk scores have potential to be useful in clinical settings and clinical trial design.
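
One common formulation of a polygenic risk score is a sum of risk-allele counts weighted by the log odds ratio of each variant. The Python sketch below illustrates that formulation only. No score was calculated in this study, and the variant names, odds ratios and genotype are hypothetical values invented for the example.

```python
from math import log

# Hypothetical per-variant odds ratios, invented for illustration only.
odds_ratios = {"variant_A": 1.5, "variant_B": 1.2, "variant_C": 2.0}


def polygenic_risk_score(dosages, effect_odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2) weighted by each variant's log OR."""
    return sum(
        count * log(effect_odds_ratios[variant])
        for variant, count in dosages.items()
    )


# Hypothetical individual: heterozygous for variant_A, homozygous for variant_C.
individual = {"variant_A": 1, "variant_B": 0, "variant_C": 2}
score = polygenic_risk_score(individual, odds_ratios)
print(round(score, 3))
```

Weighting by log odds ratio means that variants of larger effect contribute more to the score than those of small effect, matching the description of relative disease risk above.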

We used ExpansionHunter to directly parse NGS data for expansions in both C9orf72 and ATXN2. The accuracy of ExpansionHunter for sizing C9orf72 hexanucleotide expansions was originally reported at over 99.9%,28 which was also reflected by our own validation data. Nevertheless, there remains the possibility of false negative cases. Southern blot analysis was not feasible for precise C9orf72 expansion sizing, due to a lack of sufficient DNA from most cases, typical of historical sample cohorts.

In conclusion, we have explored the heterogeneity of SALS using a comprehensive survey of implicated ALS variants. Our data support a genetic predisposition to SALS with the presence of multiple ALS-implicated variants associated with earlier disease onset. Future studies are likely to continue to uncover novel genetic susceptibility and modifier variants that influence the development, presentation and course of SALS. It is hoped that the phenotypic and genetic characterisation of SALS will aid the design of biomarker studies and therapeutic trials.

Acknowledgments

The authors are grateful for the participation of patients. The authors would like to thank Lorel Adams and Elizabeth Highton-Williamson for their assistance in compiling clinical information and Elisa Cachia and Dr Sarah Furlong for providing patient materials and technical assistance. The authors would also like to thank the Project MinE GWAS Consortium.

Footnotes

  • EPM, LH and JAF are joint first authors.

  • KLW and IPB are joint senior authors.

  • Twitter @emily__mccann, @Dr_KLWilliams

  • Contributors Study concept and design: EPM, JAF, KLW with input from LH and IPB. Acquisition of data: major contribution from EPM, LH, JAF and contributions from KYZ, NG, DCB and SCMF. Analysis (including statistical) and interpretation of the data: EPM, LH, JAF and KLW. Data processing: NAT. Collection of clinical information and samples: RP, MCK and DBR. Study supervision: KLW and IPB. Manuscript preparation: EPM and LH. Critical revision of the manuscript: EPM, LH, JAF, KLW and IPB. All authors read and approved the final manuscript.

  • Funding This work was funded by the Motor Neuron Disease Research Institute of Australia (Grant-in-Aid to KLW, Bill Gole Postdoctoral Fellowship to JAF and PhD top-up scholarship to EPM), National Health and Medical Research Council of Australia (grants 1095215, 1107644 to IPB and fellowship 1092023 to KLW) and Macquarie University (grant to KLW). MCK was supported by the National Health and Medical Research Council of Australia Programme Grant (1132524), Partnership Project (1153439) and Practitioner Fellowship (1156093).

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available on reasonable request. The datasets generated and/or analysed during the current study are not publicly available, as our ethics permission does not cover sharing of data with third parties; however, they may be made available on reasonable request to the corresponding author (IPB).
