Background: The thrifty genotype hypothesis proposes that genetic susceptibility to type 2 diabetes results from the positive selection of “thrifty” alleles in the past. A corollary of this hypothesis is that genetic variants protecting against the development of diabetes are “unthrifty” and thus subject to negative selection during human evolution.
Methods: It was assessed whether age estimates of the diabetes protective PPARG Ala12 allele indicate effects of natural selection. Based on published data from four populations, the date of origin of the diabetes protective PPARG Ala12 variant was estimated using both allele frequency and linkage disequilibrium (LD) with the C1431T single nucleotide polymorphism in exon 6 of the PPARG gene.
Results: The best LD based estimate of the age of the Ala12 allele gave an average of ∼32 000 years with a maximum upper bound of ∼58 000 years. Assuming a population with a growth rate of r = 0.01 per generation, the frequency based estimate of the age of the Ala12 variant gave an average of ∼27 000 years with a maximum upper bound of ∼42 000 years.
Discussion: The similarity of both time estimates is consistent with selective equivalence of the diabetes protective PPARG Ala12 allele and the diabetes susceptible PPARG Pro12 allele.
Abbreviations:
- 95% CI, 95% confidence interval
- LD, linkage disequilibrium
- SNP, single nucleotide polymorphism
Keywords:
- allele age
- linkage disequilibrium
- thrifty gene
According to the thrifty genotype hypothesis, widespread genetic susceptibility to type 2 diabetes is the result of the positive selection of “thrifty” alleles in the past. A corollary of this hypothesis is that genetic variants protecting against the development of diabetes are “unthrifty” and thus subject to negative selection during human evolution. It was assessed whether age estimates of the diabetes protective PPARG Ala12 allele indicate effects of natural selection.
Based on published data from four populations (Scottish, Chinese, Malays, and Indians), the date of origin of the diabetes protective PPARG Ala12 variant was estimated using both allele frequency and linkage disequilibrium (LD) with the C1431T single nucleotide polymorphism (SNP) in exon 6 of the PPARG gene.
Taking geographical distribution into account, the best LD based estimate of the age of the Ala12 allele gave an average of ∼32 000 years with a maximum upper bound of ∼58 000 years. Assuming a population with a growth rate of r = 0.01 per generation, the frequency based estimate of the age of the Ala12 variant gave an average of ∼27 000 years with a maximum upper bound of ∼42 000 years.
The similarity of both time estimates is consistent with selective equivalence of the diabetes protective PPARG Ala12 allele and the diabetes susceptible PPARG Pro12 allele. Evidence of positive selection was found in the Indian sample, suggesting that the selective value of the Ala12 allele may be modulated by the neighbouring T1431 variant.
The thrifty genotype hypothesis1 states that susceptibility to type 2 diabetes is in part due to adaptation to ancient conditions of life during the course of human evolution. It is postulated that our hunter-gatherer ancestors faced times of famine, so that genetic variants allowing conservation of glucose and efficient storage of fat as an energy reserve would offer a selective advantage to their carriers. Under the plentiful food supply of modern affluent societies, these “thrifty genotypes” became deleterious, leading to the development of type 2 diabetes. A corollary of this hypothesis is that genetic variants that afford protection from disease are mutations that cause a “loss of thriftiness”,2 for example inefficient fat storage or renal loss of glucose. Another consequence of the thrifty genotype hypothesis, not yet recognised, is that these “unthrifty alleles” must have suffered negative selection during human evolution.
Because of its multiple roles in adipogenesis, energy homeostasis, and insulin sensitivity, the nuclear receptor PPARG is thought to be a major regulator of the “thrifty gene response”.3 The Ala12 variant of the PPARG gene has been associated with decreased receptor activity, lower body mass index, and higher insulin sensitivity compared to the wild type Pro12,4 and it has been shown to be protective against type 2 diabetes in different ethnic groups.5–7 These results suggest that the Ala12 variant could be an “unthrifty allele” of the PPARG gene and therefore exposed to negative selection in the past.
The Ala12 allele of the PPARG gene has a wide geographical distribution. Table 1 shows a non-exhaustive list of the frequency of this allele in different populations. European populations have the highest frequency of the Ala12 allele and East Asian populations the lowest. With the available data it is unclear whether this genetic variant is present in Africa. The low frequency of the allele in African-Americans does not rule out the possibility of recent gene flow from non-African groups. Age estimates of the Ala12 allele may shed some light on its origin and, more importantly, they can be used to test the hypothesis of neutrality of this allele.
Age estimates can be obtained using both linkage disequilibrium (LD) and allele frequency. By using haplotypic information in two linked microsatellite loci, Stephens et al8 estimated that a 32 base pair (bp) deletion in the CCR5 locus that confers resistance to infection by HIV occurred about 700 years ago. However, the high frequency of this deletion in Europe (∼10%) suggested an age greater than 100 000 years under constant population size. Because the pattern of LD suggested a much younger age than that indicated by the allele frequency, this large difference in age estimates was interpreted to be the result of positive selection of the deletion in Europeans.8 This same approach can be applied to the Ala12 allele of the PPARG gene. In particular, if the Ala12 allele is an “unthrifty allele” that suffered negative selection in the past, then the pattern of LD should indicate an older age than is consistent with its frequency.
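The comparison logic described above can be sketched as a small decision rule. This is only an illustration: the function name `selection_signal`, the use of non-overlapping intervals as the criterion, and the example intervals are assumptions made here, not part of the published analysis.

```python
def selection_signal(ld_interval, freq_interval):
    """Compare an LD based and a frequency based age interval (years).

    Each argument is a (lower, upper) tuple. The rule is a simplification:
    a signal is reported only when the two intervals do not overlap.
    """
    ld_low, ld_high = ld_interval
    freq_low, freq_high = freq_interval
    if ld_high < freq_low:
        # LD suggests a younger allele than its frequency would allow
        # under neutrality: the allele rose in frequency quickly.
        return "positive selection"
    if ld_low > freq_high:
        # LD suggests an older allele than its frequency implies:
        # the allele was held at low frequency.
        return "negative selection"
    return "consistent with neutrality"


# Hypothetical intervals, loosely in the spirit of the CCR5 example
print(selection_signal((500, 1000), (100_000, 150_000)))
```

Under this rule, an “unthrifty allele” that suffered negative selection would show an LD based interval lying entirely above the frequency based interval.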
The C1431T SNP is located in exon 6 of the PPARG gene separated by 82 kb from the Pro12Ala polymorphism. Tai et al6 showed the haplotypic frequencies of both variants in Chinese, Malays, and Indians, and Doney et al7 reported these frequencies in a Scottish cohort. Thus, it is possible to estimate the age of the Ala12 allele using both LD and allele frequency.
Allele age based on linkage disequilibrium
By using a first order approximation of the likelihood9 and the single linked C1431T marker locus, a maximum likelihood estimate of the age (in generations) of the Ala12 allele is
where θ is the recombination rate per generation between both polymorphisms, k1 is the number of chromosomes carrying both the T1431 and the Ala12 alleles, k2 is the number of chromosomes having both the C1431 and the Ala12 alleles, p1 is the frequency of the T1431 allele on Pro12 chromosomes, and p2 is the frequency of the C1431 allele on Pro12 chromosomes. Because in all the surveyed populations the Ala12 allele is in high linkage disequilibrium with the T1431 variant (D′>0.500), it is assumed that the mutation originating the Ala12 allele first arose in a chromosome carrying the T1431 allele. Three different recombination rates were used: θ = 0.0008, which corresponds to an average rate of 1 cM/Mb over the 82 kb separating the two polymorphisms; θ = 0.0016, which is double the average rate; and θ = 0.0004, which is one half of the average rate. The 95% confidence intervals (95% CI) were estimated based on the composite likelihood given by Guo and Xiong9 (see equation 12 in their paper).
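As a hedged illustration of how these quantities combine, suppose the ancestral association decays exponentially, so that an Ala12 chromosome still carries T1431 with probability e^(−θt) + (1 − e^(−θt))p1. Equating this to the observed fraction k1/(k1 + k2) gives t = (1/θ) ln[(k1 + k2)p2 / (k1p2 − k2p1)]. This expression is a reconstruction from the symbol definitions under that assumed decay model and may differ in form from the expression of Guo and Xiong9. The function names and the haplotype counts below are hypothetical.

```python
import math


def theta_from_distance(distance_bp, cm_per_mb=1.0):
    """Recombination fraction for a physical distance, assuming a
    uniform rate (1 cM corresponds to a fraction of 0.01)."""
    return distance_bp / 1e6 * cm_per_mb / 100.0


def ld_allele_age(k1, k2, p1, p2, theta):
    """Allele age in generations under the assumed single-marker
    decay model: t = ln[(k1 + k2) p2 / (k1 p2 - k2 p1)] / theta."""
    denominator = k1 * p2 - k2 * p1
    if denominator <= 0:
        # Association already at or below the background level:
        # the model gives no finite age.
        raise ValueError("no residual association with the marker")
    return math.log((k1 + k2) * p2 / denominator) / theta


# 82 kb at 1 cM/Mb gives 0.00082, i.e. roughly 0.0008
theta = theta_from_distance(82_000)

# Hypothetical counts: 66 of 100 Ala12 chromosomes carry T1431,
# and T1431 has frequency 0.1 on Pro12 chromosomes.
generations = ld_allele_age(k1=66, k2=34, p1=0.1, p2=0.9, theta=0.0008)
print(round(generations), "generations")
```

Because θ enters only as a divisor, doubling the assumed recombination rate halves the estimated age, which is why the three values of θ give such different estimates.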
Allele age based on frequency
Slatkin10 found that a maximum likelihood estimate of age as a function of allele frequency in a growing population is equal to
where r is the population growth rate per generation, N0 is the current size of the population, and x is the frequency of the Ala12 allele. 95% CI can be estimated using the following cumulative probability distribution of allele age12:
where n is the sample size and τ(t) is a scaled time that takes into account the changes in population size at t generations in the past. A growth rate of r∼0.02 per generation for the last 10 000 years has been estimated for European populations.12 Data from Biraben13 show historic growth rates in the range of 0.01–0.04 for Asian populations. Therefore, to estimate the age of the Ala12 allele, values of r = 0.01, r = 0.02, and r = 0.04 were used.
Results and implications
Table 2 shows the age estimates in years of the Ala12 allele. These time estimates were calculated assuming a generational length of 25 years.
Except for the results from the Indian sample, all the allele age estimates based on linkage disequilibrium are consistent with the age estimates assuming neutrality of the Ala12 allele in a growing population. As expected, time estimates greatly depend on the assumptions used. However, the presence of the Ala12 allele in Europeans and Asians indicates that this allele first arose, at the latest, about the time of the split between both population groups ∼43 000 years ago,14 a fact that agrees with the estimated allele ages assuming either a recombination rate of θ = 0.0004 (average age ∼32 000 years, maximum upper bound of ∼58 000 years) or a population growth rate of r = 0.01 (average age ∼27 000 years, maximum upper bound of ∼42 000 years). If the true recombination rate between the Pro12Ala and the C1431T SNPs is ∼0.0004, then the hypothesis of neutrality can be rejected only by assuming a fast growing population (r = 0.04). In this last scenario, the LD pattern indicates an older age of the Ala12 variant than the time estimates based on allele frequency, therefore suggesting negative selection against the Ala12 allele.
Although the estimated ages of the Ala12 allele are consistent with the hypothesis of neutrality, other explanations must be considered. First, it is possible that selection on the Ala12 allele was not sufficiently strong to produce evidence using available data; the use of just one linked locus in the present analysis may be insufficient. Information regarding both more linked loci and the nucleotide sequence of the PPARG gene in several populations would be helpful to settle this question concerning selection of the Ala12 allele. Also, it may be argued that there was insufficient time for the Ala12 allele to undergo negative selection given the fact that agriculture originated ∼12 000 years ago during the Neolithic period. However, if the present age estimates are correct and based on its geographical distribution, the Ala12 allele may be at least ∼43 000 years old, which places its origin more than 1000 human generations before the dawn of agriculture and provides enough time for natural selection. In addition, a consequence of the agricultural revolution was that Neolithic humans depended mostly on just one or a few plants, and often suffered from starvation due to crop failure,15 an event also common during historical times. For example, Japan endured more than 200 famines after the sixth century and France had ∼75 crop failures between 500 and 1500 AD.15 Therefore, selection pressure did not disappear following the Neolithic agricultural revolution.
The lack of evidence of negative selection of the Ala12 allele could also be due to fitness modulation by neighbouring polymorphisms. In particular, the T1431 variant has been associated with phenotypes opposite to those of the Ala12 allele, such as higher body mass index16 and greater total carotid plaque volume.17 Also, it has been shown that the diabetes protective effect of the Ala12 allele is weakened by the presence of the T1431 allele.7 These observations raise the hypothesis that the fitness of the Ala12 allele depends on the presence of the T1431 variant. Chromosomes carrying both Ala12 and T1431 alleles could be “thrifty haplotypes” that have undergone positive selection, and chromosomes with both Ala12 and C1431 variants could be “unthrifty haplotypes” that have experienced negative selection.
The last hypothesis may explain in part the anomalous results from the Indian sample. The LD based age estimates are consistently younger than estimates based on allele frequency and therefore suggest positive selection of the Ala12 allele. The Indian sample showed the highest LD (D′ = 0.799) of the four samples analysed. Around 83% of the chromosomes carrying the Ala12 allele also carried the T1431 variant in the Indian sample as compared to ∼66% in the other three samples. This means that in the Indian population Ala12−T1431 “thrifty haplotypes” are in the majority, indicating positive selection of the Ala12 allele in this particular sample.
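For readers unfamiliar with the D′ statistic quoted above, the following sketch computes it from haplotype and allele frequencies using the standard normalisation of the disequilibrium coefficient D by its maximum attainable value. The function name and the input frequencies are hypothetical; they are not the values reported by Tai et al6 or Doney et al.7

```python
def d_prime(p_ab, p_a, p_b):
    """Normalised linkage disequilibrium D' between two biallelic loci.

    p_ab: frequency of the haplotype carrying both alleles
          (here Ala12 together with T1431)
    p_a:  frequency of the first allele (Ala12)
    p_b:  frequency of the second allele (T1431)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d / d_max if d_max > 0 else 0.0


# Hypothetical population: Ala12 at 10%, T1431 at 20%, and 83% of
# Ala12 chromosomes also carrying T1431 (haplotype frequency 0.083).
print(d_prime(p_ab=0.083, p_a=0.10, p_b=0.20))
```

With these assumed inputs, a higher share of Ala12 chromosomes carrying T1431 translates directly into a higher D′, which is the pattern described for the Indian sample.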
Overall, time estimates based on LD are similar to estimates based on allele frequency. The geographical distribution of the Ala12 allele in European and Asian populations is consistent with a recombination rate with the C1431T SNP of θ∼0.0004 and a population growth rate of r∼0.01. If these age estimates are correct, they indicate that the mutation giving rise to the Ala12 variant occurred after the genetic split between African and non-African populations, which is why this allele may be absent in Africans. Because there are no published frequencies in Africans, it is unclear whether the Ala12 allele is present in African populations. Hara et al5 found just one heterozygous individual out of 46 non-diabetic African-American women with polycystic ovary syndrome, implying a frequency of ∼1% of the Ala12 allele in that sample (1 of 92 chromosomes). Since African-Americans have an average of ∼25% of Caucasian genes,18 the presence of the Ala12 allele in African-Americans may be due to recent gene flow from US Caucasians.
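The ∼1% figure follows from simple allele counting in diploid individuals, as the short sketch below shows. The function name is hypothetical; the counts are those quoted above from Hara et al.5

```python
def allele_frequency(n_heterozygotes, n_homozygotes, n_individuals):
    """Frequency of an allele from genotype counts in a diploid sample.

    Each heterozygote contributes one copy, each homozygote two,
    out of 2 * n_individuals chromosomes in total.
    """
    copies = n_heterozygotes + 2 * n_homozygotes
    return copies / (2 * n_individuals)


# One Ala12 heterozygote among 46 women: 1 copy in 92 chromosomes
print(f"{allele_frequency(1, 0, 46):.2%}")
```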
The thrifty genotype hypothesis1 was first postulated to explain the widespread susceptibility to type 2 diabetes in modern human populations. A corollary of this hypothesis is that genetic variants conferring resistance to disease are “unthrifty alleles”2 subject to natural selection in the past. By examining published data on the Ala12 allele of the PPARG gene, it was not possible to reject the hypothesis of neutrality of this genetic variant. Under a reasonable range of population scenarios, available data are compatible with neutrality of the Ala12 variant. However, it cannot be ruled out that the Ala12 allele has undergone natural selection in the past, with its selective value depending on neighbouring polymorphisms. It is likely that the T1431 variant modulates the selective value of the Ala12 allele. Evidence of positive selection of the Ala12 variant in the Indian sample supports this hypothesis, and indicates the difficulty of assigning selective values to single genetic variants. Information about more linked loci and about the nucleotide sequence of the PPARG gene in several human populations, as well as from other genes, will be required to elucidate the evolutionary history of genetic variants associated with a predisposition to type 2 diabetes.
Competing interests: none declared