This article has a correction. Please see:

The fragile X prevalence paradox
  1. Paul J Hagerman
  1. Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis 95616, USA
  1. Professor P J Hagerman, Department of Biochemistry and Molecular Medicine, University of California, Davis, School of Medicine, 4303 Tupper Hall, One Shields Ave, Davis, California 95616 USA; pjhagerman{at}

Although fragile X syndrome (FXS; OMIM 300624) is generally regarded as the most common inherited form of cognitive impairment,1–3 there is little consensus as to its prevalence in the general population or as to sex-specific differences in prevalence. Estimates of FXS prevalence (∼1/4000–1/8000) that are based on population projections from cohorts of children with special education needs (SEN) generally underestimate the extent of clinical involvement (for a comprehensive summary, see Song et al4), as many individuals affected by the behavioural, emotional and/or learning disabilities of FXS have IQs in the normal or borderline range.5 6 Such individuals may not be included in cohorts that use cognitive impairment as an inclusion criterion, a problem that is particularly marked for girls, the majority of whom have IQs within the normal range.7

A second difficulty with population studies is the tendency to conflate disease prevalence (projections from SEN cohorts) with allele/carrier frequencies within the general population, as prevalence estimates will only approach carrier frequencies for complete penetrance. Thus, seemingly paradoxical results among studies of prevalence may reflect the effects of selection bias, conflation and differing defining criteria for FXS across studies. Aspects of this “paradox” are exemplified by two important screening studies of Israeli women. Using the same cut-off point (55 CGG repeats) for premutation (PM) carriers, Toledano-Alhadef et al8 found a higher frequency of carriers (127/14 334 = 1/113) than did Pesso et al9 (62/9459 = 1/152), despite the fact that Pesso et al reported a higher frequency for full mutation (FM) alleles (4 FM carriers; 1/2365) than did Toledano-Alhadef et al (3 FM carriers; 1/4778). Although the numbers of FM alleles in the two studies are too small to attach significance to the difference in frequencies, the trend is consistent with the exclusion of individuals with a family history of learning difficulties by Toledano-Alhadef et al,8 whereas such individuals were not excluded by Pesso et al.9 Samples approaching 50 000 would be needed to establish significance of the difference between the observed frequencies. Interestingly, there was no significant difference between the two studies for the allele distributions within the PM ranges. Indeed, Toledano-Alhadef et al8 observed a greater percentage of PM alleles of >70 CGG repeats than did Pesso et al.9
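The remark that the FM counts are too small to attach significance to the difference can be checked with a short calculation. The sketch below applies a two-sided Fisher exact test to the two FM counts quoted above (4 of 9459 and 3 of 14 334). The choice of test and the function name `fisher_two_sided` are assumptions made for illustration; the article does not state which test, if any, was used.

```python
from math import comb


def fisher_two_sided(a, n1, b, n2):
    """Two-sided Fisher exact p-value for a/n1 versus b/n2 (illustrative helper)."""
    k_total = a + b          # total FM carriers across both studies
    n_total = n1 + n2        # total women screened across both studies
    denom = comb(n_total, k_total)

    def prob(k):
        # Hypergeometric probability that k of the carriers fall in sample 1
        return comb(n1, k) * comb(n2, k_total - k) / denom

    p_obs = prob(a)
    # Sum the probabilities of all splits no more likely than the observed one
    return sum(prob(k) for k in range(k_total + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# FM carriers: Pesso et al (4/9459) versus Toledano-Alhadef et al (3/14 334)
p_fm = fisher_two_sided(4, 9459, 3, 14334)
print(f"two-sided p-value for the FM difference: {p_fm:.2f}")
```

With only seven FM carriers in total, the p-value is far above conventional significance thresholds, in line with the statement that much larger samples would be needed.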

A generally unexploited feature of the genetics of FXS is that there is a defined (albeit uncharacterised) relationship between the frequencies of PM alleles (which are relatively easy to determine in an unbiased manner) and the frequencies of FM alleles in a given population. Thus, it is possible to use the frequencies of PM alleles to define expectations for the number of FM alleles. As noted above and in Song et al,4 the best population estimates of PM carrier frequencies have been obtained through large-scale screening of normal pregnant or preconception women (eg, 62 of 9459 subjects;9 127 of 14 334 subjects8). Although these frequencies are themselves subject to a number of caveats, including exclusion of individuals with known family history of cognitive impairment and possible regional founder effects, they can be used as a starting point for providing expectations for the frequency of male and female individuals who harbour FM alleles and of male individuals with PM alleles. The other advantage of using the frequency estimates for female (PM) carriers is that there are published estimates for the CGG repeat-dependence of both the probability ($p_i$; $i$ = number of CGG repeats) of PM to FM transmission10 and the relative frequency ($f_i$) of PM alleles.11 These two quantities can be used to estimate an aggregate probability, S, of the transmission (loss) of a PM allele to a FM allele. S, in turn, can be used to estimate the remaining frequencies expected within the same population. S is obtained by summing the product of the fraction ($f_i$) of premutation alleles of $i$ CGG repeats and the probability ($p_i$) of a PM to FM transition for the same allele size ($i$); thus,

$$S = \sum_{i=55}^{100} f_i \, p_i$$

Strictly speaking, these are lower-bound estimates for FM allele frequency, as the less common FM to FM transmissions are not counted.

Relative frequency ($f_i$) was estimated by a cubic fit to the allele data in table 3 of Jacquemont et al,11 correcting for the range of CGG repeat sizes in each grouping, yielding

[Equation not recovered: cubic polynomial in i giving the relative frequency of PM alleles]

where i is the number of CGG repeats (55⩽i⩽100). The corresponding transition probability was estimated by fitting the data in table 1 of Nolin et al,10 yielding

[Equation not recovered: cubic polynomial in i giving the PM to FM transition probability]

Both cubic fits were truncated to 100 CGG repeats, as the product function is essentially zero beyond that range.
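The summation that defines S can be sketched in a few lines. The function below takes any two callables for the relative frequency and the transition probability and sums their product over 55 to 100 CGG repeats. The names `transmission_fraction`, `toy_f` and `toy_p` are hypothetical, and the two toy curves are placeholder shapes only; they are not the published cubic fits, whose coefficients are not reproduced here. Normalising the frequencies to sum to one and clamping probabilities to the range 0 to 1 are also assumptions of this sketch.

```python
def transmission_fraction(f, p, lo=55, hi=100):
    """Aggregate PM-to-FM transmission fraction: sum over i of f_i * p_i.

    f(i): relative frequency of PM alleles with i CGG repeats (any scale).
    p(i): probability that an allele of i repeats expands to a full mutation.
    """
    sizes = range(lo, hi + 1)
    # Assumption: negative fitted values are treated as zero frequency
    weights = [max(f(i), 0.0) for i in sizes]
    total = sum(weights)
    if total <= 0:
        raise ValueError("relative frequencies must have a positive sum")
    # Assumption: fitted probabilities are clamped to the interval [0, 1]
    probs = [min(max(p(i), 0.0), 1.0) for i in sizes]
    return sum(w / total * q for w, q in zip(weights, probs))


# Placeholder shapes for illustration only (NOT the fits from refs 10 and 11)
def toy_f(i):
    return (101 - i) ** 3


def toy_p(i):
    return ((i - 55) / 45) ** 2


S_toy = transmission_fraction(toy_f, toy_p)
print(f"S for the toy curves: {S_toy:.3f}")
```

Substituting the actual fitted polynomials for `toy_f` and `toy_p` would be needed to obtain the value of S used in the article.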

Using an aggregate value (189/23 793; 1/126) for the frequency of PM carriers from the studies of Pesso et al9 and Toledano-Alhadef et al,8 the expected (average) frequency for FM males and females is 1/126 × 0.5 × S = 1/2355 and the expected frequency for PM males is 1/126 × 0.5 × (1 − S) = 1/282. Remarkably, the predicted value for females with the FM allele is very close to the observed value (1/2365) from the study of Pesso et al,9 which did include women with a history in the extended family of learning difficulty or developmental problems. Clearly, given the uncertainties inherent in the analysis, as well as the use of the aggregate allele frequency, such agreement must be considered fortuitous. The lower value (1/4778) observed by Toledano-Alhadef et al8 probably reflects the results of excluding any family history of learning difficulty. An important prediction of the higher frequency estimate (1/282) for PM alleles in males is that as many as ∼1/3000 males aged >50 years in the general population may have the carrier-specific neurodegenerative disorder, fragile X-associated tremor/ataxia syndrome (FXTAS).12 13
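The arithmetic behind these expectations can be reproduced as follows. This is a minimal sketch that uses the rounded carrier frequency of 1/126 and the value S = 0.107 quoted elsewhere in the article; the function name `expected_frequencies` is introduced here for illustration only.

```python
PM_FEMALE = 1 / 126   # aggregate PM carrier frequency in women (rounded, as in the text)
S = 0.107             # aggregate PM-to-FM transmission fraction quoted in the article


def expected_frequencies(pm_female, s):
    """Return (FM frequency per sex, PM frequency in males)."""
    # Each allele goes to a son or a daughter with probability 0.5;
    # a fraction s of transmitted PM alleles expands to a full mutation.
    fm_each_sex = pm_female * 0.5 * s
    # The remaining fraction (1 - s) is passed on as a premutation.
    pm_male = pm_female * 0.5 * (1 - s)
    return fm_each_sex, pm_male


fm, pm_male = expected_frequencies(PM_FEMALE, S)
print(f"FM (males or females): 1/{1 / fm:.0f}")   # 1/2355
print(f"PM males: 1/{1 / pm_male:.0f}")           # 1/282
```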

Implicit in the foregoing analysis is the assumption that, whereas the allele frequencies will probably depend on the specific population being studied, the transmission fraction, S, is likely to be a more general function of the genetic mechanisms underlying CGG-repeat expansion. A test of this assumption (constancy of S across populations where absolute carrier frequencies may vary widely) is important to our understanding of the factors involved in dynamic repeat instability. In this regard, a second pair of screening studies in eastern Canada14 15 found lower PM allele frequencies for both female (41/10 624 = 1/259)14 and male (13/10 572 = 1/813) subjects.15 However, the allele frequency for males (1/580) predicted from the female allele frequency and S (0.107) is within the 95% confidence limits of the observed frequency for males. These results suggest that founder effects may contribute to the approximately two-fold difference in absolute allele frequencies between the Canadian and Israeli populations.
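The comparison for the Canadian data can be sketched numerically. The code below predicts the male PM frequency from the female count and S, then checks it against a 95% interval for the observed male count. The Wilson score interval is used here as one reasonable choice; the article does not say how its confidence limits were computed, so the interval method and the helper name `wilson_interval` are assumptions.

```python
from math import sqrt


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z = 1.96 for ~95%)."""
    p_hat = k / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


S = 0.107                          # transmission fraction quoted in the article
pm_female = 41 / 10624             # observed female PM carrier frequency
predicted_male = pm_female * 0.5 * (1 - S)

low, high = wilson_interval(13, 10572)   # observed male PM carriers
print(f"predicted male PM frequency: 1/{1 / predicted_male:.0f}")
print(f"observed 95% interval: 1/{1 / high:.0f} to 1/{1 / low:.0f}")
print("prediction within interval:", low < predicted_male < high)
```

Under these assumptions the predicted value of about 1/580 falls inside the interval for the observed 13/10 572, consistent with the statement above.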

It is hoped that the current comments will help to frame the discussion of both prevalence and frequency estimates for the fragile X family of disorders and to provide further impetus for larger scale screening of unbiased populations (eg newborn screening). To help frame this discussion, several points should be considered:

  1. As noted above, an as yet untested prediction of the dynamic instability of the FMR1 gene is that, whereas population founder effects will result in large variation in allele frequencies, the transmission fraction, S, should be relatively constant, reflecting underlying genetic mechanisms for expansion.

  2. The frequency of PM alleles in a given population is likely to represent the most robust measure of variation (eg, founder effects) across populations, and estimates of this quantity should be the starting point for discussions of corresponding frequencies in males and in individuals (males and females) harbouring FM alleles.

  3. The transmission model, based on the observation that PM to FM transmission is strictly matrilineal, predicts equal numbers of FM alleles in males and females. Screens for FM alleles in at-risk populations that yield substantial frequency differences for males and females are likely to possess significant selection bias.

  4. As a corollary to (3), prevalence values for fragile X syndrome that are substantially lower than the corresponding frequency estimates for FM alleles should always raise the possibility that some criteria for clinical involvement (eg lowered IQ) are too restrictive. FXS also includes individuals within the normal/borderline IQ range who have learning deficits and emotional and/or behavioural difficulties. Starting from the vantage point of allele status, as obtained through newborn screening, a much better foundation would be laid for defining the true nature and variation of the phenotypic spectrum of FXS.

  5. Direct, unbiased estimates of the frequency of FM alleles in a given population will require the screening of samples of at least 50 000 individuals. Such efforts will require new tools for high-throughput genotyping that are capable of direct detection of FM alleles in both males and females.16

In summary, despite numerous estimates of the frequencies of FM (FMR1) alleles, or prevalence of FXS, direct (general population) frequency estimates are still lacking, particularly for males. In lieu of such studies, perhaps the best current estimate for the frequency of the FM in females is ∼1/2500, based on direct screening9 and on projection from the frequency of PM alleles (see above). This frequency estimate should also apply to male carriers of FM alleles, a prediction that remains to be tested by direct population screening. This prediction is consistent with the lower-bound estimate (∼1/3600)7 from a male SEN cohort within the USA. It is therefore recommended that the oft-quoted 1/4000–1/6000 figures for FXS prevalence be abandoned in favour of an approximate frequency of ∼1/2500 for individuals (male and female) with the FM allele. This frequency should be only slightly higher than the prevalence of FXS if the full spectrum of involvement is considered. Of course, these numbers are expected to display geographical variation due to founder effects.


I thank the National Fragile X Foundation for their support. This work was also supported by the National Center for Research Resources (UL1 RR024922) and the National Institute on Aging (RL1 AG032119). I also thank Drs F Tassone, R Hagerman and D Nguyen for their helpful comments and discussion regarding this work.



  • Competing interests: None.

Linked Articles

  • Correction
    BMJ Publishing Group Ltd