Introduction

Developmental dyslexia, or reading disability, is one of the most common complex neurobehavioral disorders of children and adults with prevalence rates ranging from 5 to 17.5 percent (Shaywitz and Shaywitz 2003). It is characterized by an impairment of reading ability despite normal intelligence and adequate educational opportunity. Estimates of the heritability of quantitative measures of dyslexia range from 44 to 75%, suggesting a significant genetic component (DeFries et al. 1987).

The first dyslexia locus, DYX1 (MIM:127700), was identified on chromosome 15 by linkage analysis of nine multi-generation pedigrees with 21 polymorphic protein markers and chromosomal heteromorphisms (Smith et al. 1983), but linkage could not be confirmed in a follow-up study of the same families with five short tandem repeat markers from 15q (Rabin et al. 1993). Similarly, a linkage study of 14 Danish dyslexia families with markers from 15q found negative LOD scores (Bisgaard et al. 1987). However, three subsequent independent studies from the US, Germany, and the UK characterized and confirmed a dyslexia locus on 15q (Grigorenko et al. 1997; Morris et al. 2000; Schulte-Korne et al. 1998).

Taipale et al. (2003) recently localized a candidate gene for DYX1 in subjects from Finland. They reported that EKN1 (NCBI ID: gi18677736), which they named DYX1C1 (MIM:608706), was disrupted by a translocation, t(2;15)(q11;q21), that co-segregated with dyslexia in a two-generation family (Nopola-Hemmi et al. 2000). In supportive case-control studies, they also found association in 55 cases compared to 113 controls at two sequence changes, −3G→A and 1249G→T (p=0.006 and p=0.02, respectively). However, the dyslexic cases came from only 20 families, and some of the controls were related to each other and to the cases, raising questions about the validity of a case-control comparison in which the individuals are not independent. In a replication sample of 54 independent cases and 82 controls, the −3G→A polymorphism showed a significant but weaker association (p=0.02), and results with the 1249G→T polymorphism were not significant (p=0.1). A small TDT-association analysis of nine informative trios gave significant results for the two markers together as a haplotype (p=0.025). In characterizing the EKN1 protein, they found that it was rapidly up-regulated and translocated following brain ischemia.

To assess the contribution of EKN1 to dyslexia in an independently ascertained sample, we studied 150 nuclear families from a Colorado dyslexia twin cohort for the two reported polymorphisms by quantitative transmission disequilibrium test (QTDT) analysis (Allison 1997).

Subjects and methods

Dyslexia family samples

The Colorado twin dyslexia cohort was ascertained through the Colorado Learning Disabilities Research Center (CLDRC) (DeFries et al. 1987). Predominantly white middle-class families were ascertained from school districts in the state of Colorado, where at least one sibling had a school history of reading problems. Subjects included members of monozygotic (MZ) twin pairs (in which case, only one member of the MZ twin pair was used), dizygotic (DZ) twin pairs, and nontwin siblings. Subjects for whom English was a second language were not included in the sample. Subjects with evidence of serious neurological, emotional, or uncorrected sensory deficits were also excluded.

Genotyping for the variants reported by Taipale et al. (2003) was performed on a subset of these twin families, consisting of 522 subjects (parents and siblings) from 150 nuclear families. Genotype data were available from both parents for 100 families. The only selection criterion for this subset was that DNA had been isolated from blood rather than buccal samples. The average age of the 221 offspring was 11.55 years, ranging from 8.02 to 18.53 years.

This study was approved by the human research committees of the University of Colorado at Boulder, the University of Nebraska Medical Center, the University of Denver, and Yale University.

Phenotyping

Subjects were brought to the CLDRC for an extensive battery of psychometric tests, which consisted of many cognitive, language, and reading tasks, including the intelligence quotient (Wechsler 1974, 1981) and the Peabody Individual Achievement Test (PIAT) (Dunn and Markwardt 1970). Quantitative-trait data were provided for the following nine individual phenotypes and three composites: orthographic coding, orthographic choice, homonym choice, phonological decoding, phonemic awareness, phoneme transposition, phoneme deletion, timed word recognition, standardized PIAT word recognition, orthographic coding composite, phonemic awareness composite, and discriminant score. These psychometric tasks have been described in detail elsewhere (DeFries and Fulker 1985; DeFries et al. 1997; Gayán et al. 1999; Olson et al. 1989). The population average for each task was estimated from the large twin database available at the CLDRC. After age regression and standardization, the phenotypic data for each of the reading tasks formed a continuous distribution of quantitative z scores, which were used in the analyses.
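The age regression and standardization step can be illustrated with a minimal sketch. It assumes a simple linear regression of raw score on age fitted in a reference sample, followed by division of each subject's residual by the reference residual standard deviation. The function names, the linear form of the age adjustment, and the use of a separate reference sample are assumptions made for illustration only; the procedure actually used at the CLDRC is described in the cited references and may differ.

```python
from statistics import mean, pstdev


def fit_age_regression(ages, scores):
    """Ordinary least squares fit of score = intercept + slope * age."""
    age_mean, score_mean = mean(ages), mean(scores)
    sxx = sum((a - age_mean) ** 2 for a in ages)
    sxy = sum((a - age_mean) * (s - score_mean) for a, s in zip(ages, scores))
    slope = sxy / sxx
    intercept = score_mean - slope * age_mean
    return intercept, slope


def age_adjusted_z(ages, scores, ref_ages, ref_scores):
    """Return z scores: age-regressed residuals scaled by the reference SD.

    The regression line and the residual standard deviation are both
    estimated from the reference sample (hypothetical stand-in for a
    population database), then applied to the subjects of interest.
    """
    intercept, slope = fit_age_regression(ref_ages, ref_scores)
    ref_resid = [s - (intercept + slope * a) for a, s in zip(ref_ages, ref_scores)]
    resid_sd = pstdev(ref_resid)
    return [(s - (intercept + slope * a)) / resid_sd for a, s in zip(ages, scores)]
```

With this convention, a subject who scores exactly at the age-expected level of the reference sample receives z = 0, and negative z scores indicate performance below the age-expected level.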

Whole genome amplification

DNA was extracted from blood samples obtained from parents and offspring. For the association studies in the nuclear families, 50 ng of genomic DNA was subjected to whole genome amplification by multiple strand displacement amplification as described (Dean et al. 2002). The quality of amplified samples was tested with two single nucleotide polymorphism (SNP) assays with a combined overall success rate of 83% for both markers. Sequence detection was performed on unamplified samples.

Genotyping

Genotyping was performed with TaqMan Assay-by-Design (Applied Biosystems, Foster City, CA) and with pyrosequencing (Biotage AB, Uppsala, Sweden). Assay-by-Design probes were used to genotype the 1249G→T polymorphism. Polymerase chain reactions (PCR) were performed in a 384-well format in a total reaction volume of 2 μl with 1.6 ng of template DNA. PCR conditions were: denaturation for 10 min at 95°C, followed by 40 cycles of 15 s at 95°C and 1 min at 60°C. Fluorescent signals were converted to genotypes by the sequence detection system software package (SDS, Applied Biosystems, Foster City, CA).

The −3G→A (rs3743205) and the consecutive −2G→A polymorphisms were genotyped via pyrosequencing using two PCR amplification primers (forward PCR primer: ATTAACCCTCACTAAAGGGACAAGCAGGCGCAAGAAGCAACCAG, reverse PCR primer: TTCTAATACGACTCACTATAGGGAGACACGCCTTTGAGGGGCAGAGACAG) and an extension primer (CTAACCTGAAGAGGCATT). A 20 μl PCR reaction contained 10 ng of genomic DNA, 0.4 units of Hotstart Taq polymerase (Qiagen), 4 pmol of forward PCR primer, 0.4 pmol of reverse PCR primer, 3.6 pmol of biotinylated T3 primer, 2.5 mM MgCl2, and 200 μM of dNTPs. Thermal cycling conditions were 15 min at 95°C, followed by 45 cycles (30 s at 95°C, 45 s at 56°C, 60 s at 72°C), 5 min at 72°C, and a hold at 4°C. Upon completion of PCR, the biotinylated PCR product from the entire reaction was purified by binding to streptavidin–sepharose (Amersham) using the filter prep tool according to the standard protocol provided by Biotage. The resulting single-stranded template was annealed with the extension primer for 2 min at 80°C, cooled to room temperature, and sequenced in a PSQ96MA Pyrosequencing instrument. The PSQ96MA software (version 2.0.2) automatically scored the quality of each reaction and assigned genotypes.

The GAS package version 2 was used to test for Mendelian transmission of alleles. We used Simwalk2 (version 2.6.0, Sobel and Lange 1996) to estimate identity-by-descent probabilities and QTDT (Center for Statistical Genetics, University of Michigan, Ann Arbor) to model association and transmission disequilibrium (Abecasis et al. 2000). Transmission disequilibrium testing compares how often an informative parent (one heterozygous at that particular marker) transmits a given allele to affected offspring with how often that allele is not transmitted. Because transmission disequilibrium detects association only in the presence of genetic linkage, it is a robust approach if there is known or suspected population stratification. Variance components models of association and permutations for exact p-values were performed with the -wega -m1000 -1 parameters in the QTDT program (Abecasis et al. 2000), which specify environmental and polygenic components and an additive major gene effect under the null hypothesis.
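The transmission comparison described above can be sketched for the simplest qualitative case. The example below counts transmissions and non-transmissions of one allele from heterozygous parents in parent-offspring trios and computes the classic McNemar-type TDT statistic, (b − c)²/(b + c), with an asymptotic 1-df p-value. It is an illustration only: it is not the variance-components quantitative model implemented in QTDT, and the function name, the two-character genotype encoding, and the restriction to biallelic markers are assumptions introduced here.

```python
import math


def tdt(trios, allele):
    """Classic TDT for a biallelic marker (illustrative sketch, not QTDT).

    trios:  iterable of (father, mother, child) genotypes, each a
            two-character string such as "GA" (hypothetical encoding).
    allele: the allele whose transmission is counted, e.g. "A".
    Returns (transmitted, untransmitted, chi2, p_value).
    """
    transmitted = untransmitted = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(1 for p in parents if p[0] != p[1])
        if n_het == 0:
            continue  # no heterozygous parent: trio is uninformative
        # Copies of the allele necessarily contributed by homozygous parents.
        hom_copies = sum(1 for p in parents if p[0] == p[1] == allele)
        # Remaining copies in the child must come from heterozygous parents.
        k = child.count(allele) - hom_copies
        if not 0 <= k <= n_het:
            raise ValueError("Mendelian inconsistency in trio")
        transmitted += k
        untransmitted += n_het - k
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return transmitted, untransmitted, chi2, p_value
```

Because only heterozygous parents contribute to the counts, differences in allele frequency between subpopulations do not by themselves produce a signal, which is the robustness to stratification noted above. The permutation-based p-values reported by QTDT need not match this asymptotic approximation.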

Results

When we simultaneously modeled orthogonal association and variance component linkage within QTDT, neither 1249T nor −3A yielded a significant p-value for any of the quantitative dyslexia phenotypes. The highest χ2 was 1.692 between −3A and phoneme transposition (35/226 probands, p=0.096). When we tested for total association without modeling transmission disequilibrium (-at), the highest χ2 was 2.92 between 1249G→T and discriminant score (151 probands, p=0.0874). When we tested for variance component linkage without association in QTDT, the highest χ2 was 1.993 for both 1249T and −3A with phonemic awareness composite (165 probands, p=0.1580).

Discussion

In a TDT study of our Colorado dyslexia twin cohort, we could not confirm the previously reported association with the two most highly associated SNPs described for EKN1. Differences between the published results from Finland and these from Colorado may be explained by heterogeneity. Subjects in the studies reported here are likely mixtures of several European populations (population heterogeneity), whereas subjects in the Finland sample may be more homogeneous and perhaps represent a genetic isolate with a strong founder effect. Populations with multiple founders may have different alleles in disequilibrium with causal mutations and different mutations of the causal gene (allelic heterogeneity). This is supported by the findings in a recent study from Canada in which a novel SNP marker in intron 4 of EKN1, rs1162981, was reported in transmission disequilibrium with dyslexia (p=0.018, 58 informative transmissions) (Wigg et al. 2004), but −3A and 1249T were not. The intronic SNP in the Canada study was not in the region typed in our study.

Taken together, the results reported by Taipale et al. (2003) and Wigg et al. (2004) indicate that polymorphisms of EKN1 are associated with dyslexia, but our results demonstrate that the polymorphisms reported by Taipale et al. (2003) are not causal. It is possible that other untyped polymorphisms in EKN1 may be in association with reading disability phenotypes in our sample, but even if this were true, the results would not establish that EKN1 is the causal gene. Variants in EKN1 could be in linkage disequilibrium with a mutation in a nearby gene. Several nearby genes might be reasonable candidates, such as NEDD4 (NM_006154) and PIGB (NM_004855).

In addition, there could be more than one gene in the 15q region that contributes to DYX1. Studies from Germany and the USA mapped linkage peaks at D15S143, 8 Mb from EKN1 (Grigorenko et al. 1997; Schulte-Korne et al. 1998). Studies from the UK mapped association peaks at D15S994/D15S214/D15S146, 16 Mb from EKN1 (Morris et al. 2000), which were recently confirmed by TDT-association in a study from Italy (Marino et al. 2004). Moreover, EKN1 is 7 Mb away from another translocation breakpoint identified in a second dyslexia family from Finland (Nopola-Hemmi et al. 2000). This ambiguity concerning the location of EKN1 relative to published DYX1 genetic mapping studies suggests there could be heterogeneity due to other genes in this region. This could explain our negative association and the similar findings recently described in two reports from the UK (Cope et al. 2005; Scerri et al. 2004).

Electronic-database information

Biotage AB, http://www.pyrosequencing.com
GAS (Genetic Analysis System, Alan Young, Oxford University), http://users.ox.ac.uk/~ayoung/gas.html
Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
QTDT (Center for Statistical Genetics, University of Michigan, Ann Arbor), http://www.sph.umich.edu/csg/abecasis/QTDT