Introduction

Autism is a neuropsychiatric disorder characterized by impaired social interactions and communication, as well as restricted, repetitive and/or stereotyped behaviors. The term autism spectrum disorders (ASDs) is now commonly used to designate this group of highly heterogeneous and complex developmental diseases. Autism is associated with intellectual disability (ID) in 75% of patients and with epilepsy or EEG abnormalities in 15–25%. ASDs are more common in male patients than in female patients, with a ratio of 4:1 that can reach 10:1 in forms with normal cognitive abilities.1

As indicated by the increased risk of recurrence in families and concordance in twin pairs, ASDs are largely genetically determined.1, 2 Recent studies have demonstrated that ASDs can be caused by rare, highly penetrant point mutations, deletions, duplications and larger chromosomal abnormalities that can either arise de novo or be inherited.3, 4, 5, 6, 7, 8, 9 Known monogenic disorders account for 2–5% of syndromic cases; fragile X syndrome is usually the most common cause, followed by PTEN macrocephaly syndrome and tuberous sclerosis, each accounting for <1% of individuals with ASD.1, 10 Large copy number variants (CNVs) are found in 5–10% of autistic patients, especially in those with syndromic ASDs.5, 11, 12 Several CNVs have been repeatedly identified in individuals with ASD, such as deletions or duplications on chromosomes 15q11–q13, 16p11.2, 7q11.23 and 22q13.4, 5, 13 Private CNVs in individuals with autism often disrupt genes encoding synaptic proteins, such as neuroligins, neurexins and SHANK proteins, or neuronal cell-adhesion proteins, which have important roles in neurodevelopment and/or neurotransmission.3, 14, 15, 16, 17 However, most CNVs and variants in genes involved in autism so far are also found in other neurodevelopmental disorders such as ID without autistic features and/or schizophrenia.12, 18 Recent studies point to polygenic or oligogenic inheritance.6, 17, 19 Indeed, most of the abnormalities identified have been associated with highly variable phenotypes and seem insufficient to cause ASDs on their own. It is probable that genetic interactions (epistasis) between rare variants have an important role in the etiology of ASDs.

In this study, we screened 194 subjects with ASDs using SNP microarrays to assess genomic causes and establish genetic syndrome diagnoses. Three heterozygous CNVs, a de novo triplication of chromosome 15q11–q12 of paternal origin, a distal deletion on chromosome 9p24 and a de novo 3q29 deletion, and one homozygous deletion on chromosome 1p31.1 encompassing PTGER3 were considered to probably cause the disorder in four individuals. In addition, we screened candidate genes located in inherited deletions to unmask autosomal recessive variants in three patients, and found a rare missense variant in DOCK10 associated with an inherited deletion on chromosome 2q in one individual.

Materials and methods

Cohort description

This study first included 200 subjects with ASDs (168 males and 32 females) who were recruited in the ‘Centre de Référence déficiences intellectuelles de causes rares’, the ‘Centre Diagnostic Autisme’, Pitié-Salpêtrière Hospital (Paris, France) or the ‘Fondation Lejeuné’ over a period of 3 years (2009–2011). Index cases were assessed with the Autism Diagnostic Interview-Revised (ADI-R): 167 index cases (83.5%) had autism with ID and 33 (16.5%) had Asperger syndrome or high-functioning autism based on the DSM IV-TR criteria; 85% (170/200) of the patients were sporadic cases. Preliminary systematic screening included karyotyping, screening for fragile X syndrome and targeted metabolic or genetic analyses, including fluorescence in situ hybridization (FISH) analysis of regions involved in Williams–Beuren and Smith–Magenis syndromes and screening of the RAI1 gene for selected patients presenting compatible clinical features. These analyses allowed a clinical diagnosis in six males, one with fragile X syndrome, two with a supernumerary chromosome 15 of maternal origin, one with a 7q11.23 deletion, one with a 17p11.2 deletion and one with a RAI1 mutation. These six individuals were excluded from further analyses. CNV analysis was therefore performed on 194 subjects. Informed written consent was obtained from each individual or his/her parents before blood sampling. All experiments were performed in accordance with French guidelines and legislation.

High-density SNP arrays

Genomic DNA was extracted from blood cells using standard phenol–chloroform or saline procedures. Index cases and the parents of individuals with possibly pathogenic CNVs were genotyped using either 370CNV-Quad (n=20), 660W-Quad (n=27) or cytoSNP-12 (n=147) microarrays (Illumina, San Diego, CA, USA). Automated Illumina microarray experiments were performed according to the manufacturer’s specifications. Image acquisition was performed using an iScan System (Illumina). Image analysis and automated CNV calling were performed using GenomeStudio v.2011.1 and CNVPartition v.3.1.6 with the default confidence threshold of 35. We also used a size threshold of 20 kb for individuals analyzed with 660W-Quad microarrays, as CNVs <20 kb detected by these arrays proved to be false positives. Identified CNVs were systematically compared with those present in the Database of Genomic Variants (DGV), excluding BAC-based studies, using an in-house bioinformatics pipeline, to assess their frequency in control populations. Only CNVs identical (breakpoints and copy number) to or totally included in those described in the DGV were considered; microrearrangements partially overlapping CNVs described in the DGV or with a discordant copy number were considered to be novel. CNVs with a minor allele frequency ≥1% in at least one study comprising ≥30 controls were excluded from further analysis. CNVs encompassing coding regions, and with a frequency <1%, were considered possibly deleterious and were searched for in the parents of the patients when DNA was available (Figure 1). CNV frequencies were compared with the Mann–Whitney test or the one-sided Fisher’s exact test. Candidate genes present in the CNVs were compared with those present in AutDB (http://www.mindspec.org/autdb.html).20
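The selection rules described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the in-house pipeline: the class and function names are hypothetical, DGV entries are simplified to a single copy number and frequency per study, and calls are assumed to be dropped when their confidence falls below the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cnv:
    chrom: str
    start: int           # bp
    end: int             # bp
    copy_number: int
    confidence: float    # CNV-calling confidence score
    overlaps_exon: bool  # True if the CNV encompasses coding sequence


@dataclass(frozen=True)
class DgvEntry:          # simplified, hypothetical view of a DGV record
    chrom: str
    start: int
    end: int
    copy_number: int
    frequency: float     # frequency in the reporting study (0..1)
    n_controls: int      # number of controls in that study


def passes_call_qc(cnv, array, min_conf=35.0, min_size_660w=20_000):
    """Confidence threshold for all arrays; 20 kb size floor for 660W-Quad only."""
    if cnv.confidence < min_conf:
        return False
    if array == "660W-Quad" and (cnv.end - cnv.start) < min_size_660w:
        return False
    return True


def matches_dgv(cnv, entry):
    """Identical to, or totally included in, a DGV CNV with the same copy number."""
    return (cnv.chrom == entry.chrom
            and cnv.copy_number == entry.copy_number
            and entry.start <= cnv.start
            and cnv.end <= entry.end)


def is_common(cnv, dgv):
    """Frequency >= 1% in at least one study comprising >= 30 controls."""
    return any(matches_dgv(cnv, e) and e.n_controls >= 30 and e.frequency >= 0.01
               for e in dgv)


def is_possibly_deleterious(cnv, array, dgv):
    return passes_call_qc(cnv, array) and cnv.overlaps_exon and not is_common(cnv, dgv)
```

A CNV that only partially overlaps a DGV entry fails `matches_dgv` and is therefore treated as novel, in line with the rule stated in the text.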

Figure 1

Strategy used for the selection of rare, possibly deleterious CNVs. The criteria included the presence of coding regions in the CNV, a minor allele frequency (MAF) <1% in the DGV, a de novo occurrence of the CNV, a previous association with ASDs, the recurrence of rare CNVs in our cohort and the presence of genes of interest (ie expressed in the brain and with a function related to that of genes previously involved in ASDs). The CNVs highlighted in this study are indicated in italics for each subcategory.

Analysis of candidate genes by Sanger sequencing

Specific primer pairs (available on request) were designed to screen the coding regions of DOCK10, DOCK8, NYAP2, SMARCA2 and ATP2C2 in a research context. Forward and reverse sequence reactions were performed with the Big Dye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Foster City, CA, USA). G50-purified sequence products were run on an ABI 3730 automated sequencer (Applied Biosystems) and data were analyzed with Seqscape v.2.6 software (Applied Biosystems). Missense variants were assessed in silico for possible pathogenicity using PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), SIFT (http://sift.bii.a-star.edu.sg/) and SNPs&GO (http://snps-and-go.biocomp.unibio.it/).

Microsatellite marker analysis

Three microsatellite markers in the 15q11–q13 region (D15S128, D15S122 and D15S822) were used to follow the transmission of the alleles, as already described.13

Fluorescence in situ hybridization

To confirm array results, we performed FISH analyses on peripheral blood lymphocytes with the subtelomere 9p, UBE3A probes (Cytocell, Cambridge, UK) and RP1-196F4 in 3q29, according to the supplier’s recommendations.

Methylation analysis

Whole methylation analysis was performed by Integragen SA (Evry, France) for the subject with the 15q11–q12 triplication (8082), a previously reported patient with Angelman syndrome presenting with autistic features,13 and two young male controls. Bisulfite conversion of DNA was carried out to convert unmethylated cytosine nucleotides to uracil. Converted DNA was hybridized on Infinium HumanMethylation450 BeadChips (Illumina), according to the manufacturer’s specifications. Microarrays were scanned using an Illumina BeadArray Reader, and the data were analyzed using GenomeStudio (Illumina).

Results

A total of 130 CNVs encompassing at least one exon and absent or rare (minor allele frequency <1%) in the DGV were found in 94 subjects with ASD (Supplementary Table S1). These CNVs included 1 triplication, 82 duplications, 46 heterozygous deletions and 1 homozygous deletion. The size of the CNVs ranged from 5 to 4480 kb, with a median of 138 kb. DNA from both parents of 48 patients was available for investigation of the transmission of the CNV. Three CNVs, one triplication of 15q11–q12, one deletion on chromosome 3q29 and one duplication of 16p11.2, were absent from both parents, demonstrating that they occurred de novo in the index case (Table 1 and Supplementary Clinical Data). Seventy-four CNVs (26 deletions and 48 duplications) were inherited from an asymptomatic parent. A homozygous deletion on chromosome 1p31.1, encompassing the PTGER3 gene, was found in one patient; this deletion was heterozygous in both parents, who were both from the Comoro Islands.

Table 1 Summary of CNV of interest reported in this study and main clinical features of the patients

Three CNVs, all confirmed by FISH analysis, were considered to be the probable cause of ASD, based on their large size, absence from healthy individuals, de novo occurrence and/or presence in previously reported syndromes associated with ID and ASD: a de novo triplication of the 15q11–q12 region in a male subject (family 772), a deletion of the 9p24 region in a female (family 15) and a deletion of 3q29 in a male (family 1315) (Figure 2 and Supplementary Figure S3). The de novo 15q11–q12 triplication spanned 4.48 Mb and included the Angelman/Prader–Willi critical region between breakpoints BP2 and BP3. Analysis of microsatellite markers in the critical region revealed that the triplication was of paternal origin and resulted from an interchromosomal recombination since it included two different paternal alleles. FISH analysis indicated that the three paternal copies were located on the same chromosome at the 15q11–q12 locus. Methylation analysis of genes located within the triplicated region confirmed increased methylation values for UBE3A and decreased methylation values for SNRPN. Interestingly, this analysis also revealed abnormal methylation values for non-imprinted genes in the region including OCA2 and GABRG3. The de novo deletion on chromosome 3q29, found in an individual with high-functioning autism, spanned 1.6 Mb and encompassed 22 genes. The deletion on chromosome 9p24 spanned 3.8 Mb and encompassed 13 genes, including SMARCA2, which encodes a protein belonging to the FMRP complex, and DOCK8. Only the DNA of the mother, who did not have the deletion, was available.

Figure 2

Identification of three pathogenic CNVs: 15q11q12 triplication (family 772, patient 8082), 9p24 deletion (family 15, patient 8378) and 3q29 deletion (family 1315, patient 433). (a) SNP array profiles of the patient with the triplication (8082): the Y axis indicates the log R (above) and the B allele frequency (below); the X axis indicates the position on chromosome 15. The red line (log R ratio profile) corresponds to the median smoothing series (GenomeStudio). (b) Confirmation of the triplication by FISH analysis with a probe specific to UBE3A (in red) on peripheral blood cell metaphases and nucleus. The arrows point to the triplication. (c) Details of the genes included in the triplication. (d) Microsatellite (D15S822) analysis showing two alleles inherited from the father in the proband. (e) Methylation indexes of genes included in the triplicated region showing abnormal methylation of UBE3A and SNRPN in patient 8082 and comparison with a patient with Angelman syndrome. (f) SNP array profiles of the patient with the 9p24 deletion: the Y axis indicates the log R (above) and the B allele frequency (below); the X axis indicates the position on chromosome 9. (g) Confirmation of the deletion (shown by the arrow) by FISH analysis with a probe specific to the 9p subtelomeric region (green) on peripheral blood cell metaphases. (h) Details of the genes included in the deletion. (i) SNP array profiles of the patient with the 3q29 deletion: the Y axis indicates the log R (above) and the B allele frequency (below); the X axis indicates the position on chromosome 3. (j) Confirmation of the deletion (shown by the arrow) by FISH analysis with a probe specific to the 3q29 region (RP1-196F4, green) on peripheral blood cell metaphases and nuclei. (k) Details of the genes included in the deletion.

The duplication of the pericentromeric 16p11.2 region, found in four subjects out of 194 (2%), was the most recurrent CNV (Supplementary Figure S4). Parental DNA was available for three patients: the duplication was inherited from the father in two patients (families 845 and 1122) and occurred de novo in one (family 885). Seven CNVs common to at least two patients or five different CNVs altering a common gene were found in 25 additional cases. These CNVs included a duplication of APOO on chromosome X in two patients, a deletion of C2ORF63 in two patients, a deletion and a duplication of CTNNA3 in one patient each, which were all absent from the DGV, and a duplication encompassing the DNAH5 and TRIO genes on chromosome 5, found in two patients. Furthermore, rare CNVs, such as maternally inherited deletions on chromosome X that alter DMD, ASMT and DDX53 (upstream of PTCHD1), or CNVs encompassing PARK2, RB1CC1, ICA1 and NXPH1, were previously reported in ASD.18, 20

Twenty-nine patients had more than one rare, potentially pathogenic CNV. Two of these individuals, in whom inheritance could be assessed, had inherited from each parent a CNV absent from the DGV, suggesting that the combination of the CNVs could be pathogenic. Interestingly, multiple rare CNVs were observed in 28 individuals with ASD and ID (17%), but in only one individual with Asperger syndrome out of 33 (3%, P=0.025; Table 2). This result suggests that combinations of CNVs are pathogenic because of additive or epistatic effects and that a burden of rare CNVs more frequently leads to low-functioning autism.
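The comparison above can be approximated with a one-sided Fisher's exact test. The sketch below is an illustration only, not the authors' analysis: the function name is hypothetical, and the ASD+ID denominator of 161 is an inference (167 initial ASD+ID cases minus the six excluded subjects, since all 33 Asperger/high-functioning cases remain in the comparison). The exact counts are in Table 2, which is not reproduced here.

```python
from math import comb


def fisher_one_sided_less(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X <= a), where X follows the hypergeometric distribution
    obtained by fixing the row and column totals.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of "positive" subjects
    n = a + b + c + d     # total number of subjects
    lowest = max(0, col1 - (n - row1))
    total = comb(n, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(lowest, a + 1))
    return tail / total


# Rows: HF-ASD/Asperger (33 subjects), ASD+ID (161 subjects, inferred).
# Columns: more than one rare CNV, at most one rare CNV.
p = fisher_one_sided_less(1, 32, 28, 133)
print(f"one-sided P = {p:.3f}")
```

Under these assumed denominators the result is of the same order as the reported P=0.025; a small difference would be expected if the counts in Table 2 differ from those inferred here.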

Table 2 Comparison of the number of CNVs identified in patients with ASD+ID and HF-ASD

Hemizygous deletions inherited from asymptomatic parents could unmask autosomal recessive variants located on the other allele.21 In three patients with deletions encompassing one or more candidate genes, selected on the basis of their expression in the brain and their function, the coding regions of these genes were further analyzed. No pathogenic variants were detected by sequencing of SMARCA2 and DOCK8 on the non-deleted allele of the girl with the 9p24 deletion. Similarly, sequencing of ATP2C2, a gene in which variants have been reported to modulate linguistic abilities,22 detected only non-pathogenic variants in an adopted female with Asperger syndrome, who had an 86-kb deletion on chromosome 16 containing only ATP2C2 and KIAA1609 (family 725). In contrast, sequencing of DOCK10 and NYAP2 in a male patient with a 1.9-Mb deletion on chromosome 2q inherited from his asymptomatic mother, which contained only these two genes (family 625), identified a rare missense variant in the hemizygous state in DOCK10 (c.6460G>A/p.Asp2154Asn, rs111356042, minor allele frequency=0.0004 in the European population and 0.0369 in the African-American population). It was predicted by PolyPhen-2 to be possibly damaging. The patient, who was the only affected child of the couple, had both the deletion and the missense variant, unlike his siblings, suggesting that the association of the two variants could lead to an autosomal recessive disorder (Figure 3).
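The segregation argument for family 625 can be written as a simple consistency check: under a recessive model in which the deletion and the missense variant lie on different alleles, only affected individuals should carry both. The sketch below is illustrative. The `Member` class and function name are hypothetical, and the individual genotypes of the mother (assumed not to carry the variant) and of the two siblings are placeholders, since the text states only that the siblings did not carry both alterations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    affected: bool
    has_deletion: bool   # carries the 2q deletion encompassing DOCK10
    has_variant: bool    # carries c.6460G>A on the non-deleted allele


def consistent_with_recessive(members):
    """True if exactly the affected members carry both alterations."""
    return all((m.has_deletion and m.has_variant) == m.affected for m in members)


family_625 = [
    Member("mother", affected=False, has_deletion=True, has_variant=False),   # variant status assumed
    Member("father", affected=False, has_deletion=False, has_variant=True),
    Member("proband", affected=True, has_deletion=True, has_variant=True),
    Member("brother", affected=False, has_deletion=True, has_variant=False),  # placeholder genotype
    Member("sister", affected=False, has_deletion=False, has_variant=True),   # placeholder genotype
]

print(consistent_with_recessive(family_625))
```

Such a check shows only that a pedigree is compatible with recessive inheritance. With a single small family it cannot establish causality.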

Figure 3

Identification of a deletion encompassing DOCK10 associated with the c.6460G>A/p.Asp2154Asn variant in a male patient. (a) SNP array profiles of the patient with the deletion and details of the genes included in the deletion: the Y axis indicates the log R (above) and the B allele frequency (below); the X axis indicates the position on chromosome 2. (b) Pedigree of family 625 and segregation analysis of the deletion and the c.6460G>A/p.Asp2154Asn variant. (c) Sequence electropherograms showing the c.6460G>A/p.Asp2154Asn variant in the hemizygous state in the proband of family 625 and in the heterozygous state in his father. (d) Alignment of the region flanking c.6460G>A/p.Asp2154Asn in orthologous proteins showing the conservation of the altered amino acid.

Although our analysis first focused on CNVs containing coding regions, we also detected six rare CNVs, in six patients, that were either intronic or intergenic but located within or close to genes encoding contactins or neuroligins: an intronic duplication in NLGN1 was detected in three patients, a duplication between CNTN4 and CNTN6 was present in three patients and intronic deletions in CNTN4 and CNTN5 were identified in two unrelated individuals. Furthermore, five intronic CNVs absent from databases and located in genes involved in neurotransmission or development of the nervous system (AGBL4, LRRTM4, UNC5C, CADM1, GPHN) were identified in five individuals (Supplementary Table S2). These variants could contribute to ASDs by altering regulatory regions of the genes, although this hypothesis is difficult to prove in the absence of expression data.

Discussion

Many studies have recently demonstrated the important contribution of rare CNVs to the genetic puzzle of ASDs. However, the clinical significance of CNVs in individuals with ASD has been more rarely investigated.23, 24 In this study, we report the results of a diagnostic analysis using Illumina SNP microarrays in 194 subjects with ASDs who had been recruited prospectively. The cohort initially included 200 patients, but six individuals were excluded from the CNV analysis because a diagnosis was established by preliminary screening. CNVs encompassing at least one exon and rare in the DGV were found in 94 subjects with ASD (48%). Yet, this analysis allowed us to make a probable diagnosis in only 3 of the 194 patients (1.5%). However, if we consider the six patients with Smith–Magenis syndrome (n=2), Williams–Beuren syndrome (n=1), supernumerary chromosome 15 (n=2) and fragile X syndrome (n=1) previously detected by FISH analysis, karyotyping, RAI1 sequencing and Southern blotting, and excluded from the microarray analysis, the number of unambiguous diagnoses reaches 4.5% (9/200).
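The yields quoted in this paragraph follow directly from the counts given in the text. A minimal check, in which the helper name `pct` is ours:

```python
def pct(numerator, denominator, digits=1):
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)


carriers = pct(94, 194, 0)        # subjects with at least one rare exonic CNV
array_yield = pct(3, 194)         # probable diagnoses from the microarray analysis
overall_yield = pct(3 + 6, 200)   # adding the six diagnoses made before the arrays

print(carriers, array_yield, overall_yield)
```

The overall figure uses the initial cohort of 200 as the denominator because the six subjects diagnosed by preliminary screening were never analyzed on the arrays.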

Our results confirm the frequent implication of copy number gains in the 15q11–q12 region in ASD, found in 3 out of 200 patients (1.5%). The male subject with the 15q11–q12 triplication of paternal origin had a very mild phenotype compared to other patients with triplications of this region, including relatively good preservation of language and acquisition of reading and writing. In contrast, the two patients with a supernumerary chromosome 15 (detected before the microarray analysis), which was of maternal origin in both, had severe ID and epilepsy in addition to autism. Duplications and triplications of this imprinted region in ASD individuals are usually of maternal origin. However, duplications of paternal origin have occasionally been reported.13, 25 Our findings indicate that 15q11–q12 duplications and triplications of paternal origin might be less penetrant or associated with a milder phenotype than those of maternal origin. This suggests that dysregulation of both imprinted genes (including UBE3A) and non-imprinted genes in the 15q11–q13 region contributes to the development of ASDs. Interestingly, both imprinted (UBE3A and SNRPN) and non-imprinted genes (OCA2 and GABRG3) located in the triplicated region had abnormal methylation values in the patient with the 15q11–q12 triplication. Epigenetic dysregulations affecting the genes encoding GABAA receptors in the brains of individuals with 15q11–q13 copy number abnormalities have been reported previously.26

The second structural anomaly that was considered probably causative is a deletion of the 9p24 region. Deletions encompassing this region, which are usually de novo but larger in size (Supplementary Figure S3), are known to cause the 9p deletion syndrome, a disorder characterized by ID, trigonocephaly, facial dysmorphism and sex reversal in males. Interestingly, autism was reported in 4 out of 100 patients with this syndrome,27 suggesting that this region contains genes contributing to autism in certain conditions. Moreover, deletions on chromosome 9p24.3 encompassing C9orf66, CBWD1, DOCK8 and FOXD4 were identified recently in two individuals with ASD (frequency estimated at 0.05%).5 The deletion found in our female patient encompassed 13 genes, including DOCK8 and SMARCA2. It is likely that this deletion occurred de novo in the proband, who had severe language impairment and West syndrome in addition to autism, as did most deletions encompassing this region reported in the literature, although this could not be formally demonstrated because the father’s DNA was unavailable. SMARCA2 encodes a protein that belongs to the FMRP complex, which is deficient in fragile X syndrome, and is involved in a large network of interactions with other ASD-related proteins such as TSC1 and NLGN3.28, 29 To test the hypothesis that additional variants in SMARCA2 and/or DOCK8 contribute to autism, we sequenced the exons of these two genes in the patient with the 9p deletion, but no rare, possibly pathogenic variants were identified. Recently, heterozygous nonsynonymous mutations or partial deletions in SMARCA2 were identified as the cause of Nicolaides–Baraitser syndrome, an autosomal dominant condition with syndromic ID.30

The third probably causal abnormality was a de novo deletion on chromosome 3q29. The 3q29 microdeletion syndrome has been recently described as a new syndrome, probably caused by nonallelic homologous recombination.31 It is usually associated with dysmorphism, mild-to-moderate ID, variable congenital malformations and, in 27% of cases, ASD.32 This CNV was also recently found in individuals with schizophrenia.33 Our patient with this deletion did not have the classical dysmorphism and had normal intelligence. Only two patients with normal intelligence and a 3q29 deletion were previously described.34, 35 Among the 22 genes included in the 1.6 Mb deletion, several genes including FBXO45, DLG1, BDH1 or PAK2 (involved in brain development or synaptic transmission) and NCBP2 (involved in RNA processes such as splicing, translation regulation, nonsense-mediated mRNA decay, RNA-mediated gene silencing by microRNAs and mRNA export) represent candidate genes possibly contributing to ASD.

Unlike these three microrearrangements, the contribution of the identified CNVs to ASD remained unclear in most cases, such as the 16p11.2 duplications identified in four individuals. Deletions and duplications involving a pericentromeric region of chromosome 16 that spans 600 kb and contains 28–35 genes are recurrent and reciprocal because of highly homologous segmental duplications. Although the duplication was repeatedly associated with autism and schizophrenia,5, 36, 37, 38 it is characterized by great phenotypic variability, ranging from normal phenotypes to severe congenital malformations, and does not segregate perfectly with autism in multiplex families. These observations suggest that the 16p11.2 duplication is associated with autism with low penetrance or that other unidentified factors interact with the duplication to determine its phenotypic expression.

CNVs were inherited from asymptomatic parents in 96% of the cases in which inheritance could be assessed. Inherited CNVs possibly have a role in the etiology of ASDs in combination with other genetic factors in the deletion/duplication or elsewhere in the genome. To identify recessive mutations located on the undeleted allele, we selected three patients with rare deletions encompassing a small number of candidate genes selected on the basis of their function and expression in the brain. This approach allowed us to detect a rare, possibly deleterious missense variant in DOCK10 in trans with a whole gene deletion. This variant and the deletion were associated in the index case but not in his healthy brother and sister, which is compatible with autosomal recessive inheritance. CNVs involving DOCK10 are extremely rare; they were not found in the DGV or in 8329 controls, and were present in only 2 out of 15 767 patients.38 CNVs encompassing other members of the DOCK (Dedicator of CytoKinesis) family, such as DOCK4 and DOCK8 (located in the 9p24 deletion), have been reported in ASDs.5, 39 However, additional cases of ASD with mutations in this gene need to be identified to confirm that autosomal recessive mutations in DOCK10 can cause ASD.

Another possible case of autosomal recessive ASD was a female with a homozygous deletion on chromosome 1p31.1, encompassing alternatively spliced exons of the PTGER3 gene. This homozygous deletion spanned 43 kb and was located in a larger (15 Mb) homozygous region. PTGER3 encodes EP3, one of four receptors for prostaglandin E2, which has many biological functions, including inhibition of gastric acid secretion, inhibition of sodium and water reabsorption in the kidney, uterine contractions, fever in response to exogenous and endogenous stimuli, and modulation of neurotransmitter release in central and peripheral neurons.40 Mice lacking Ep3 have a mild, complex phenotype, including an impaired febrile response,41 an impaired response of adrenocorticotropic hormone to bacterial endotoxin,42 exaggerated allergic inflammation,43 obesity, increased motor activity44 and increased survival after bacterial infection.45 PTGER3 is expressed as at least nine splice variants that have identical ligand binding properties but interact with different second messengers.46 The homozygous PTGER3 deletion identified in our patient alters six of the nine isoforms (Supplementary Figure S5). This patient, who was 5 years old when last examined, had typical autism with a mild delay. She was very agitated, tall and not obese; her history of infection and fever was unremarkable.

Altogether, our results show that genetic syndromes associated with genomic deletions or duplications are a rare cause of ASDs. The search for CNVs in individuals with ASD should be pursued, however, as it can lead to a diagnosis and appropriate genetic counseling in a small number of cases. Combining CNV analysis with exome sequencing to systematically search for autosomal recessive causes of ASD could increase the proportion of positive diagnoses. However, oligogenic or multihit models, in which the association of rare inherited CNVs with point mutations contributes to the disorder, are probable in many cases of autism. Two of our subjects support this hypothesis; each had two novel, private CNVs inherited from asymptomatic parents, suggesting that neither CNV alone was sufficient to cause ASD but that the combination of the two rearrangements was deleterious. The observation of multiple rare CNVs in 17% of the subjects with ASD and ID, but in only one subject with Asperger syndrome (3%), also suggests that the effects of CNVs are additive or that some combinations are pathogenic. Two recent studies further support the theory of the ‘two-hit model’ in individuals with ASDs.19, 47 The next challenge in ASDs is therefore to understand the interactions between rare CNVs and other rare variants in individual patients and to take into account environmental interactions that may also modulate the risk of developing ASD.48