Short report
Gene discoveries in autism are biased towards comorbidity with intellectual disability
Matthew Jensen, Corrine Smolen, Santhosh Girirajan

Biochemistry and Molecular Biology, Pennsylvania State University, University Park, Pennsylvania, USA

Correspondence to Dr Santhosh Girirajan, Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA; sxg47{at}psu.edu

Abstract

Background Autism typically presents with highly heterogeneous features, including frequent comorbidity with intellectual disability (ID). The overlap between these phenotypes has confounded the diagnosis and discovery of genetic factors associated with autism. We analysed pathogenic de novo genetic variants in individuals with autism who had either ID or normal cognitive function to determine whether genes associated with autism also contribute towards ID comorbidity.

Methods We analysed 2290 individuals from the Simons Simplex Collection for de novo likely gene-disruptive (LGD) variants and copy-number variants (CNVs), and determined their relevance towards IQ and Social Responsiveness Scale (SRS) measures.

Results Individuals who carried de novo variants in a set of 173 autism-associated genes showed an average 12.8-point decrease in IQ scores (p=5.49×10−6) and 2.8-point increase in SRS scores (p=0.013) compared with individuals without such variants. Furthermore, individuals with high-functioning autism (IQ >100) had lower frequencies of de novo LGD variants (42 of 397 vs 86 of 562, p=0.021) and CNVs (9 of 397 vs 24 of 562, p=0.065) compared with individuals who manifested both autism and ID (IQ <70). Pathogenic variants disrupting autism-associated genes conferred a 4.85-fold increased risk (p=0.011) for comorbid ID, while de novo variants observed in individuals with high-functioning autism disrupted genes with little functional relevance towards neurodevelopment.

Conclusions Pathogenic de novo variants disrupting autism-associated genes contribute towards autism and ID comorbidity, while other genetic factors are likely to be causal for high-functioning autism.

  • autism
  • intellectual disability
  • comorbid features
  • genetic complexity

Introduction

Autism spectrum disorder, which presents in children with social communication difficulties, repetitive behaviour and restricted interests,1 is a highly heterogeneous neurodevelopmental disorder characterised by complex genetic aetiology and strong comorbidity with other developmental disorders.2 For example, approximately 30% of individuals with autism also manifest with intellectual disability (ID),3 defined by IQ scores <70.1 The high degree of co-occurrence of autism with ID has been shown to confound accurate diagnosis of autism. In fact, we recently showed that 69% of individuals diagnosed with ID are likely to be recategorised and diagnosed with autism.4 The diagnostic overlap between autism and ID suggests that de novo gene-disruptive variants and copy-number variants (CNVs) identified in individuals ascertained for autism in large-scale studies could also be confounded by ID comorbidity. Here, using genetic and phenotypic data from 2290 individuals with autism from the Simons Simplex Collection (SSC),5 we show that gene discoveries in autism are biased towards genes that contribute towards both autism and comorbid ID.

Materials and methods

We analysed rare de novo likely gene-disruptive (LGD) variants derived from exome sequencing data,6 7 78 disease-associated CNVs8 derived from microarray data,9 and Full Scale IQ and Social Responsiveness Scale (SRS) T-scores for SSC probands obtained from the Simons Foundation Autism Research Initiative (SFARI).5 We identified 173 genes associated with autism (online supplementary file 1) from multiple database sources, including tier 1 genes (>2 de novo LGD variants) from the Developing Brain Disorders Gene Database,10 genes with >5 non-SSC de novo LGD variants from denovo-db,11 and SFARI Gene tiers 1 and 2 (https://gene.sfari.org/). Clinical case reports were reviewed for a subset of 22 genes that appeared in all three databases (online supplementary file 2). Expected frequencies of de novo variants were calculated from gene-specific probabilities of de novo nonsense and frameshift variants based on a sequence context-dependent model.12 Phenotypic data for mouse knockout models were obtained from the Mouse Genome Informatics database.13 Gene-set enrichment for specific expression in brain regions during development, based on expression data derived from the BrainSpan Atlas,14 was calculated using the Specific Expression Analysis online tool.15 All statistics were calculated using R V.3.4.2 (R Foundation for Statistical Computing, Vienna, Austria).
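
The expected-frequency calculation described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that each gene-specific probability is expressed per allele per generation, so that every individual contributes two chances per gene, and the gene names, probabilities and function names are hypothetical placeholders.

```python
# Illustrative only: the per-gene probabilities below are invented placeholders,
# not values taken from the mutation model cited in the text.


def expected_de_novo_count(gene_probabilities, n_individuals, alleles_per_individual=2):
    """Expected number of de novo LGD variants across a gene set in a cohort.

    Assumes each probability is per allele per generation, so an individual
    contributes `alleles_per_individual` chances per gene.
    """
    total_probability = sum(gene_probabilities.values())
    return alleles_per_individual * n_individuals * total_probability


def fold_enrichment(observed_count, expected_count):
    """Ratio of observed to expected variant counts."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    return observed_count / expected_count


# Hypothetical combined nonsense + frameshift probabilities for three made-up genes.
toy_probabilities = {"GENE_A": 2.0e-6, "GENE_B": 5.0e-6, "GENE_C": 3.0e-6}

expected = expected_de_novo_count(toy_probabilities, n_individuals=1000)
print(f"expected variants in cohort: {expected:.3f}")
print(f"fold enrichment for 2 observed variants: {fold_enrichment(2, expected):.1f}")
```

Keeping the allele multiplier as an explicit parameter makes the assumption visible and easy to change if the probabilities are defined per individual instead.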

Results

We first compared the phenotypes of 288 individuals with de novo LGD variants and 81 individuals with pathogenic CNVs with 1921 individuals without such variants obtained from the SSC cohort. Similar to previous autism studies that identified correlations between de novo variants and IQ scores,12 16–18 we found that individuals with de novo LGD variants (average IQ=77.7, p=0.031, two-tailed Mann-Whitney test) or pathogenic CNVs (average IQ=76.3, p=0.002) had a significant decrease in IQ scores compared with individuals without such variants (average IQ=82.3) (figure 1A). However, no differences in autism severity, measured using SRS T-scores, were observed between groups of individuals with and without pathogenic variants (p=0.104 for de novo LGD variants and p=0.963 for CNVs) (figure 1A). This suggests that pathogenic variants in general contribute to ID independent of autism severity, although this could also be due to an ascertainment bias in the SSC cohort towards individuals with severe autism. We further identified individuals who carried de novo LGD variants in 173 autism-associated genes, defined as genes with recurrent de novo variants reported in multiple databases of sequencing studies (online supplementary file 1). We found that 74 individuals carrying de novo LGD variants in autism-associated genes had decreased IQ (average IQ=69.1, p=5.49×10−6, two-tailed Mann-Whitney test) and increased SRS T-scores (average SRS=82.4, p=0.013) compared with 2216 individuals without de novo LGD variants in these genes (average IQ=81.9, average SRS=79.6), implying that candidate autism genes contribute to both autism and ID phenotypes (figure 1B). To validate this finding, we examined 76 published case reports of affected individuals with pathogenic variants in a subset of 22 autism-associated genes for ID comorbidity (table 1 and online supplementary file 2). 
For example, recent case studies have identified autism co-occurring with ID in 21 individuals with de novo SHANK3 variants,19 19 individuals with NRXN1 variants20 and 18 individuals with TCF20 variants.21 Overall, 460 of 497 (92.6%) individuals with autism described in these studies had ID, emphasising that variants in these genes contribute to severe forms of autism with comorbid ID (table 1). The remaining 37 individuals (7.4%) who manifested autism but not ID each carried variants in genes primarily contributing towards autism and comorbid ID, in particular CHD8 and NRXN1, suggesting that these genes exhibit incomplete penetrance or allelic heterogeneity towards ID phenotypes.
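
The group comparisons above rely on two-tailed Mann-Whitney tests. The sketch below shows one way such a test can be computed with the normal approximation and a tie correction. It is an illustration under stated assumptions: the IQ values are invented toy numbers, not SSC data, the function name is hypothetical, and an exact test or a continuity correction (which statistical packages may apply) is not implemented here.

```python
import math


def mann_whitney_u(sample_x, sample_y):
    """Two-tailed Mann-Whitney U test using the normal approximation.

    Returns (u_x, p_value). Tied values receive average ranks and the
    variance includes the standard tie correction. No continuity
    correction is applied.
    """
    n_x, n_y = len(sample_x), len(sample_y)
    if n_x == 0 or n_y == 0:
        raise ValueError("both samples must be non-empty")

    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in sample_x] + [(v, 1) for v in sample_y])
    n = n_x + n_y

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values that starts at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        tie_size = j - i
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        members_from_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += average_rank * members_from_x
        tie_term += tie_size ** 3 - tie_size
        i = j

    u_x = rank_sum_x - n_x * (n_x + 1) / 2.0
    mean_u = n_x * n_y / 2.0
    variance_u = n_x * n_y / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance_u == 0:
        return u_x, 1.0  # every observation is identical
    z = (u_x - mean_u) / math.sqrt(variance_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal tail area
    return u_x, p_value


# Hypothetical IQ scores for illustration only.
carriers_iq = [62, 70, 71, 85, 90]
non_carriers_iq = [78, 84, 88, 95, 101, 110]

u_statistic, p_value = mann_whitney_u(carriers_iq, non_carriers_iq)
print(f"U = {u_statistic:.1f}, two-tailed p = {p_value:.3f}")
```

With samples this small the normal approximation is rough; it is used here only to keep the example short and dependency-free.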

Figure 1

Phenotypic comparison of individuals with autism from the SSC cohort with and without pathogenic variants. (A) Individuals with pathogenic variants (de novo LGD and CNV) had significantly lower IQ scores than individuals without pathogenic variants, but no change in autism severity (SRS T-score) was observed between the three groups. (B) Individuals with de novo LGD variants in candidate autism genes had lower IQ scores and more severe autism phenotypes (SRS T-score) than individuals without such variants. ‘n’ indicates sample size, p values were derived from two-tailed Mann-Whitney tests, and dotted lines within each violin plot indicate the median and first and third quartiles. CNV, copy-number variant; LGD, likely gene-disruptive; SRS, Social Responsiveness Scale; SSC, Simons Simplex Collection.

Table 1

Individuals carrying variants in autism-associated genes with comorbid intellectual disability (ID)

We next compared genetic data from 397 SSC individuals (17.3% of the SSC cohort) with ‘high-functioning autism’, defined as having severe autism (SRS >75) and average or above-average IQ scores (IQ >100), with 562 individuals (24.5% of the SSC cohort) with both autism and ID (SRS >75 and IQ <70). Individuals with high-functioning autism had a significantly lower (p=0.021, one-tailed Fisher’s exact test) frequency of de novo LGD variants (42 of 397, 10.6%) than individuals with autism and ID (86 of 562, 15.3%). Similarly, individuals with high-functioning autism were less likely (p=0.065) to carry pathogenic CNVs (9 of 397, 2.3%) than individuals with both autism and ID (24 of 562, 4.3%). In fact, de novo LGD variants conferred a 1.53-fold higher likelihood of manifesting ID among individuals with autism (p=0.035, 95% CI 1.03 to 2.26), and pathogenic CNVs similarly conferred a 1.92-fold increased risk for co-occurrence of ID among individuals with autism (p=0.099, 95% CI 0.88 to 4.18). We replicated these observations by analysing an additional combined cohort of 2357 individuals from both the SSC and the Autism Sequencing Collection.22 Here, individuals with both autism and ID had a significantly higher rate (p=3.04×10−6, one-tailed Student’s t-test) of de novo variants in genes intolerant to variation, as measured by the probability of loss-of-function intolerant (pLI) scores >0.9 (70 of 643, 10.8%), than individuals manifesting autism but not ID (114 of 1747, 6.65%). We also found that only 3 of 397 (0.8%) individuals in the SSC cohort with high-functioning autism carried de novo LGD variants in autism-associated genes, including ANK2, HIVEP3 and BAZ2B. This frequency was not significantly different from the expected frequency of de novo variants in the general population (p=0.095, one-tailed Student’s t-test). 
In contrast, 20 of 562 (3.6%) individuals with both autism and ID carried de novo LGD variants in autism-associated genes, such as CHD8, SCN2A and SYNGAP1, representing a 19.2-fold enrichment of de novo variants compared with the expected rate in the general population (p=9.48×10−6). Thus, de novo LGD variants in autism genes conferred a 4.85-fold increased risk (p=0.011, 95% CI 1.43 to 16.42) towards comorbid ID in individuals with autism.
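
The odds ratios and confidence intervals reported above can be checked directly from the stated counts. The sketch below assumes a Wald (Woolf) interval on the log odds ratio and a hypergeometric one-tailed Fisher's exact test. The function names are illustrative, and the source states only that R was used, not which routines. Under these assumptions the three intervals agree with the reported values to two decimal places, and the printed one-tailed p values can be compared with those given in the text.

```python
import math


def odds_ratio_wald(carriers_a, total_a, carriers_b, total_b, z=1.96):
    """Odds ratio of carrying a variant in group A versus group B.

    Returns (estimate, lower, upper), where the interval is a Wald (Woolf)
    interval computed on the log odds ratio.
    """
    a, b = carriers_a, total_a - carriers_a
    c, d = carriers_b, total_b - carriers_b
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    estimate = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * standard_error)
    upper = math.exp(math.log(estimate) + z * standard_error)
    return estimate, lower, upper


def fisher_exact_greater(carriers_a, total_a, carriers_b, total_b):
    """One-tailed Fisher's exact test.

    Probability, with all margins fixed, of seeing at least `carriers_a`
    carriers in group A (hypergeometric upper tail).
    """
    total_carriers = carriers_a + carriers_b
    denominator = math.comb(total_a + total_b, total_carriers)
    largest_possible = min(total_a, total_carriers)
    numerator = sum(
        math.comb(total_a, k) * math.comb(total_b, total_carriers - k)
        for k in range(carriers_a, largest_possible + 1)
    )
    return numerator / denominator


# Counts from the text: (carriers with autism and ID, 562) vs (carriers with
# high-functioning autism, 397).
comparisons = {
    "de novo LGD variants": (86, 562, 42, 397),
    "pathogenic CNVs": (24, 562, 9, 397),
    "LGD variants in autism-associated genes": (20, 562, 3, 397),
}

for label, counts in comparisons.items():
    estimate, lower, upper = odds_ratio_wald(*counts)
    p_value = fisher_exact_greater(*counts)
    print(f"{label}: OR={estimate:.2f} (95% CI {lower:.2f} to {upper:.2f}), one-tailed p={p_value:.3g}")
```

Exact integer arithmetic through `math.comb` avoids overflow and rounding problems in the hypergeometric terms, at the cost of speed for very large cohorts.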

We further sought to determine the biological relevance of the 42 genes with de novo LGD variants identified in individuals with high-functioning autism, and found that these genes in aggregate had less functional relevance towards neurodevelopment than the reported autism-associated genes. For example, genes with de novo LGD variants in individuals with high-functioning autism were less resistant to genetic variation than reported autism-associated genes, as measured by the average Residual Variation Intolerance Score (RVIS) (average score 0.413 vs 0.185, p=4.00×10−4, Mann-Whitney two-tailed test) and pLI percentile (average score 0.498 vs 0.179, p=9.77×10−7) gene metrics23 24 (figure 2A). In fact, while the RVIS and pLI percentiles of the reported autism genes were clustered below the thresholds for pathogenicity (RVIS <20th percentile and pLI <18th percentile, or raw score >0.9), genes disrupted in individuals with high-functioning autism were evenly distributed across the range of percentiles. Additionally, while autism genes were enriched for specific expression in the cortex (p=3.13×10−4, Fisher’s exact test with Benjamini-Hochberg correction) and cerebellum (p=0.020) during early fetal development,15 genes with de novo LGD variants in high-functioning autism individuals were not enriched for any specific expression patterns in the developing brain (figure 2B). Furthermore, mouse models of genes identified in individuals with high-functioning autism were significantly less likely to manifest nervous system (12 of 42 genes, 28.6%, p=4.90×10−3, one-tailed Fisher’s exact test with Benjamini-Hochberg correction) and behavioural/neurological (10 of 42 genes, 23.8%, p=0.037) phenotypes than mouse models of reported autism-associated genes (behaviour/neurological: 93 of 173 genes, 53.8%; nervous system: 96 of 173 genes, 55.5%) (figure 2C). 
These findings suggest that genes with de novo LGD variants in individuals with high-functioning autism are less pathogenic in humans and model organisms, and therefore may not necessarily contribute towards the specific high-functioning autism phenotype.
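
Several of the tests above are reported with Benjamini-Hochberg correction. As a reminder of what that adjustment does, the following sketch computes adjusted p values from a list of raw p values. The function name and the input values are hypothetical and chosen only for illustration; they are not the raw p values from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order.

    Each raw p value is scaled by (number of tests / rank), and the results
    are made monotone by taking a running minimum from the largest rank down.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p values for four phenotype categories.
raw_p_values = [0.01, 0.04, 0.03, 0.005]
for raw, adj in zip(raw_p_values, benjamini_hochberg(raw_p_values)):
    print(f"raw p = {raw:.3f}, adjusted p = {adj:.3f}")
```

Starting the running minimum at 1.0 also caps adjusted values at 1, which matters when many raw p values are large.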

Figure 2

Functional analysis of genes with de novo LGD variants in individuals with high-functioning autism. (A) Genes with de novo LGD variants in individuals with high-functioning autism (SRS >75 and IQ >100) had higher average RVIS (left) and pLI (right) percentile scores than those for reported autism-associated genes. Thick dotted lines across the violin plots indicate thresholds for gene pathogenicity: <20th percentile for RVIS and <18th percentile for pLI (>0.9 raw score). Thin lines within the violin plots indicate the median and first and third quartiles. P values were derived from two-tailed Mann-Whitney tests. (B) Expression of genes with de novo variants in individuals with high-functioning autism and autism-associated genes in the developing human brain. Autism-associated genes were enriched (p<0.1, Fisher’s exact test with Benjamini-Hochberg correction) for specific expression in the cortex and cerebellum during early development, while no enrichment was seen in the genes identified in individuals with high-functioning autism. Hexagon sizes represent the number of genes preferentially expressed in each brain tissue and timepoint, while the colours of the hexagons represent p values for the enrichment of autism genes among each set of preferentially expressed genes. (C) Frequency of phenotypes observed in mouse knockout models for genes with de novo LGD variants in individuals with high-functioning autism compared with reported autism-associated genes. *P<0.05 (one-tailed Fisher’s exact test with Benjamini-Hochberg correction). LGD, likely gene-disruptive; pLI, probability of loss-of-function intolerant score; RVIS, Residual Variation Intolerance Score; SRS, Social Responsiveness Scale.

Discussion

Our results suggest that pathogenic variants such as de novo LGD variants and CNVs contribute to autism phenotypes primarily in individuals with comorbid ID, especially if these variants disrupt a gene previously associated with autism. Several themes regarding the study of high-functioning autism have emerged from these findings. First, the consistently high degree of comorbidity between autism and ID has led to an ascertainment bias towards individuals who manifest both disorders in large-scale sequencing cohorts, as it is difficult to exclude all individuals with comorbid disorders and still have adequate power to identify recurrent variants. Indeed, more than 80% of the SSC cohort had an IQ score below 100, and the average IQ of the cohort (81.5) was 18.5 points below the population average of 100. This bias has contributed towards the identification of genes and CNV regions related to both autism and ID, as evidenced by decreased IQ among carriers of de novo variants in these genes as well as a high incidence of comorbid phenotypes reported in published case studies. Large-scale sequencing studies remain highly valuable for uncovering shared biological mechanisms that could underlie both disorders.25 However, understanding the biology of the core autism phenotypes would require concerted efforts to recruit individuals who specifically manifest high-functioning autism without ID.

Second, individuals with high-functioning autism are less likely to carry de novo LGD variants in candidate autism genes, as each of these candidate genes was primarily associated with autism and comorbid ID. Instead, de novo variants in individuals with high-functioning autism tend to disrupt genes with less functional relevance towards neurodevelopment. These genes likely carry non-recurrent de novo LGD variants that either confer a small effect size towards autism risk on their own or are not associated at all with neurodevelopment. We therefore propose that multiple genomic factors with varying effect sizes, such as missense variants, common variants, variants in regulatory and non-coding regions, or the combinatorial effects of inherited variants, contribute towards autism phenotypes without ID. For example, Schaaf and colleagues26 performed targeted sequencing of 21 candidate autism genes in 339 individuals with high-functioning autism. They found that 2% of individuals carried de novo missense variants in candidate autism genes, such as PTEN and FOXP2, suggesting that allelic variants of differing severity within the same gene might contribute to distinct neurodevelopmental trajectories. Interestingly, the same study also found that 7% of individuals with high-functioning autism carried multiple inherited missense variants in candidate autism genes, potentially contributing to an oligogenic model for high-functioning autism phenotypes. Similarly, common variants have been found to contribute towards increased autism risk in individuals without ID.27 28 For example, Grove and colleagues28 recently reported that the heritability attributed to common variants, including those primarily associated with cognitive ability and educational attainment, was three times lower in individuals with autism and ID compared with those without ID. 
Finally, variants that may not contribute directly towards autism phenotypes themselves, including de novo LGD variants observed in individuals with high-functioning autism, could still be responsible for subtler modification of the severity of autism or ID phenotypes.

Overall, our results emphasise the importance of dissecting phenotypic heterogeneity in family-based sequencing studies of complex diseases, especially those with a high frequency of comorbid disorders. While a larger cohort of individuals recruited specifically for high-functioning autism could identify associations with recurrent genes or different types of variants, these findings should be validated using functional studies to more fully differentiate the genetic causes for high-functioning autism from those for autism with comorbid ID.

Acknowledgments

The authors thank Fereydoun Hormozdiari (UC Davis), Lucilla Pizzo (Penn State) and Vijay Kumar (Penn State) for their helpful discussions and comments on the manuscript. The authors are grateful to all of the families at the participating Simons Simplex Collection (SSC) sites, as well as the principal investigators (A Beaudet, R Bernier, J Constantino, E Cook, E Fombonne, D Geschwind, R Goin-Kochel, E Hanson, D Grice, A Klin, D Ledbetter, C Lord, C Martin, D Martin, R Maxim, J Miles, O Ousley, K Pelphrey, B Peterson, J Piggot, C Saulnier, M State, W Stone, J Sutcliffe, C Walsh, Z Warren, E Wijsman). The authors appreciate obtaining access to phenotypic data on the Simons Foundation Autism Research Initiative (SFARI) Base.

References

Footnotes

  • Twitter @girirajan16

  • Contributors MJ and SG conceptualised the study, and MJ and CS analysed the data. MJ and SG wrote the manuscript with input from all authors.

  • Funding This work was supported by NIH R01-GM121907, SFARI Pilot Grant (#399894) and resources from the Huck Institutes of the Life Sciences to SG, and NIH T32-GM102057 to MJ.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval As the data analysed in the study were de-identified, the study was exempt from IRB review. The study conformed to the Helsinki Declaration.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Approved researchers can obtain the SSC data sets described in this study by applying at https://www.base.sfari.org.
