Since its biochemical characterisation in 1991,1 and its genetic identification in 1995,2 the 677C>T allele (T allele) of the 5,10-methylenetetrahydrofolate reductase (MTHFR) gene has been a focus of increasing interest from researchers worldwide. The expanding spectrum of common conditions linked with the 677C>T allele now includes certain adverse birth outcomes (including birth defects), pregnancy complications, cancers, adult cardiovascular diseases, and psychiatric disorders.3–8 Although several of these associations remain unconfirmed or controversial,4 their scope is such that it becomes of interest to explore the geographical and ethnic distribution of the allele and associated genotypes.9 Accurate information on such distribution can contribute to studies of gene-disease associations (by providing reference population data) and population genetics (by highlighting geographical and ethnic variations suggestive of evolutionary pressures),10 as well as help to evaluate health impact (by allowing estimates of population attributable fraction).
Current population data, however, show gaps, and for some ethnic groups or large geographical areas (for example, China) few data are available.3 Our aim was to supplement the available data by collecting a large and diverse sample of newborns from different geographical areas and ethnic groups, and to examine international variations in the distribution of the 677C>T allele. We present findings relating to more than 7000 newborns from 16 areas around the world.
MATERIALS AND METHODS
The study was conducted under the auspices of the International Clearinghouse for Birth Defect Monitoring Systems (ICBDMS) and was coordinated through its head office, the International Center on Birth Defects (ICBD).
Participating programmes, in consultation with the coordinating group, identified a population sampling approach that would be simple yet minimise sampling bias with respect to the MTHFR genotype. We made an explicit attempt to sample the newborn population systematically. Details of each programme’s approach are listed below, and further information is available on request.
Generally, programmes chose one of two approaches. The first approach used regional newborn screening programmes as the source of samples. Typically, such an approach used a geographically defined birth population. In Atlanta, for example, researchers visited the Georgia newborn screening programme on different days over several weeks and selected one day's collection of blood spots received by the laboratory from children whose mothers resided in one of five counties in Atlanta. Discussions with the director of the newborn screening programme indicated that day to day variability in the flow of specimens from birth hospitals to the state laboratory was negligible. The second approach relied on systematic sampling directly from birth hospitals that were part of an established network. In Spain, for example, staff collected specimens from 15 consecutive newborns at each hospital participating in the ECEMC monitoring programme, which includes birth hospitals from across Spain. Details for specific areas are summarised below.
Our objectives were to characterise the geographical and ethnic distribution of the 677C>T allele (T allele) of the MTHFR gene and its associated genotypes among newborns around the world, using newborn screening programs and birth hospitals. The participants were 7130 newborns of different ethnicities from 16 areas in Europe, Asia, the Americas, the Middle East, and Australia.
The distribution of the allele showed marked ethnic and geographical variation. The homozygous TT genotype was particularly common in northern China (20%), southern Italy (26%), and Mexico (32%). There was also some evidence for geographical gradients in Europe (north to south increase) and China (north to south decrease). The TT genotype frequency was low among newborns of African ancestry, intermediate among newborns of European origin, and high among newborns of American Hispanic ancestry. Areas at the extremes of the frequency distribution showed deviations from Hardy-Weinberg expectations (Helsinki, Finland, southern Italy, and southern China).
This study, the largest to date, suggests the presence of selective pressures leading to marked geographical and ethnic variation in the frequency of the 677C>T allele. Geneticists can benefit from these reference data when examining links between the 677C>T allele and health outcomes in diverse populations.
Australia, New South Wales
Specimens were obtained from the New South Wales newborn screening programme, by selecting 100 consecutive newborn screening cards on each of five consecutive days, excluding repeat specimens, for a total of 500 specimens. All maternity units in the state of New South Wales send their specimens to the programme. Specimens consisted of blood remaining after routine newborn screening tests had been completed.
Canada, Alberta
Specimens were taken from consecutive newborns from the provincial newborn screening programme in Alberta. Specimens consisted of the remaining blood spots used by the newborn screening programme. The first 100 specimens of the month were collected each month for four months.
China, northern and southern
Umbilical cord blood samples were collected from newborns from major hospitals in 12 cities in China from March to November 1998. One hundred consecutive samples were requested from each hospital. The hospitals were in cities from southern China (Wuhan, Nanjing, Guangzhou, and Chengdu) and northern China (Yanbian, Urumchi, Changchun, Jinan, Xi’an, Shenyang, Beijing, and Jilin). For homogeneity, only newborns of Han ethnicity were included in the study.
Finland, Helsinki
Specimens originated from newborns in the major maternity hospital in Finland, at the Helsinki University Hospital. Sampling was restricted to babies whose parents were both Finns. Specimens consisted of the remainder of umbilical blood specimens for hypothyroidism screening. The latter are collected for every newborn in Finland.
France, Strasbourg
Specimens were taken from consecutive newborns from newborn screening centres in Département du Bas-Rhin, whose births are also covered by the Strasbourg Birth Defects Registry. Specimens consisted of the remainder of blood spots used by the newborn screening programme.
Hungary
Specimens were taken from consecutive newborns. Specimens were collected from the remainder of the blood spots from the two newborn screening centres that operate in Hungary. For twin pairs, only one of the pair, selected at random, was included.
Israel, Tel Aviv
Specimens were taken from consecutive newborns from one major university hospital in Tel Aviv and consisted of blood spots.
Italy, Campania
Specimens were taken from consecutive newborns at three hospitals in Campania (two in Avellino, one in Benevento). Specimens consisted of the remainder of blood spots used by the newborn screening programme.
Italy, Sicily
Specimens were taken from consecutive newborns from the newborn screening programme in south east Sicily. Specimens consisted of the remainder of blood spots used by the newborn screening programme.
Italy, Veneto
Specimens were taken from consecutive newborns at one hospital outside the town of Vicenza. The hospital was chosen because it is an area hospital with 1200 births per year that has good obstetric care but does not select for high risk pregnancies. Specimens consisted of the remainder of blood spots used by the newborn screening programme.
Mexico
Specimens were randomly selected from blood spots from newborns born in hospitals that are part of the RYVEMCE birth defect monitoring network in Mexico. Samples were obtained from the remainder of the blood spot specimens collected for hypothyroidism screening. Selection was stratified to include equal numbers of males and females in the final sample.
The Netherlands, northern region
Specimens were randomly chosen from newborns whose mothers resided in the northern Netherlands. Specimens consisted of the remainder of blood spots used by the newborn screening programme.
Spain
Specimens were taken from consecutive newborns in 67 hospitals of the National Health Service throughout Spain. Essentially all babies in Spain are born in such hospitals. These hospitals are part of ECEMC (Spanish Collaborative Study of Congenital Malformations), which monitors one quarter of all births in Spain. Each hospital contributed specimens for 15 consecutive newborn infants during three selected months. Specimens were collected at the same time as the blood spots for the newborn screening programme.
Russia, Moscow
Specimens were selected from the neonatal screening programme that collects and banks specimens from 54 maternity hospitals in the Moscow area. All selected babies were apparently free from congenital anomalies.
USA, Atlanta (Georgia)
Specimens were chosen from newborns whose mothers resided in one of the five counties in metropolitan Atlanta, as ascertained from information on the newborn screening card. Staff visited the newborn screening programme four times over two months. At each visit, researchers selected at random one day's collection of specimens and took samples from the blood spots left over from newborn screening.
Sample determination and data collection
We determined that a sample size of approximately 400 to 500 specimens per area would provide a reasonably precise estimate (plus or minus 3%) for a genotype with 10% frequency. Such a frequency is within the range reported for the homozygous 677C>T (TT) genotype in many European countries and among North Americans of European descent, and is intermediate between the lower frequencies reported in some populations of African descent, and the higher frequencies reported in specimens from Mexico, Italy, and Hispanics in the USA.3 Although 400 to 500 was the targeted number of specimens per area, smaller sample sizes were accepted, recognising that such samples would provide less precise estimates. For each sample, researchers collected information on sex and race/ethnicity. Ethnicity was determined from the blood spot card (for example, in Atlanta), from maternal interview (for example, Italy, Veneto, Spain), or from the last name or birth place of the parent (for example, Australia). Not all programmes collected all variables.
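The precision target quoted above can be checked with a short calculation. The sketch below assumes the simple normal-approximation (Wald) half-width for a proportion at the 95% level; the text does not say which formula was used for planning, so the function names and the choice of formula are illustrative only.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion p
    estimated from n specimens (assumed formula, not stated in the text)."""
    return z * math.sqrt(p * (1 - p) / n)

def required_n(p: float, half_width: float, z: float = 1.96) -> int:
    """Smallest n whose normal-approximation half-width is at most half_width."""
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)

# Genotype frequency of 10%, as in the planning assumption above.
for n in (400, 500):
    print(n, round(margin_of_error(0.10, n), 4))

print(required_n(0.10, 0.03))
```

Under this approximation, 400 to 500 specimens give a half-width a little under 3 percentage points for a 10% genotype, which is in line with the stated target; smaller samples widen the interval in proportion to one over the square root of n.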
Human subject protection
Local review boards approved the study. In most cases specimens were anonymised before testing. In all cases, personal identifiers were removed before data were provided to ICBD for epidemiological analysis.
Laboratory methods
Genomic DNA was isolated from blood spots collected on filter paper. The presence of the C>T change within the MTHFR gene creates a HinfI restriction site that can be detected by restriction enzyme digestion followed by electrophoresis. Amplification of the MTHFR gene by the polymerase chain reaction and detection of the T allele were performed using protocols based on the method of Frosst et al.2
Five programmes (USA, China, Israel, Mexico, The Netherlands) tested their own specimens. All other specimens were tested at a single laboratory (Naples, Italy). The laboratories participated in proficiency testing to ensure inter-laboratory consistency. The proficiency testing consisted of the preparation of punches from 12 blood spots in one laboratory (CDC) with a mix of genotypes (CC, CT, and TT). These genotypes were confirmed by sequencing. Punches from these blood spots were then sent to the other laboratories, which performed DNA extraction followed by genotype assay. The laboratories were blinded to the genotype of these specimens as well as to the relative proportion of the genotypes. Results from each laboratory were sent back to CDC for evaluation. With the exception of one sample, which could not be amplified by four of the six laboratories, results showed complete agreement across laboratories.
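To make the logic of the HinfI digestion described above concrete, the following sketch maps a set of observed band sizes to a genotype call. The fragment sizes (a 198 bp undigested product and a 175 bp digested product, with a small 23 bp fragment that is usually not scored) are those commonly quoted for this type of assay and are an assumption here; the text gives no fragment sizes, and the function name and tolerance are hypothetical.

```python
# Assumed fragment sizes (not given in the text): the C allele lacks the
# HinfI site and stays uncut; the T allele is cut into a long and a short piece.
UNCUT_BP = 198  # assumed size of the undigested PCR product
CUT_BP = 175    # assumed size of the larger digestion fragment

def call_genotype(band_sizes_bp, tolerance_bp=5):
    """Return 'CC', 'CT', 'TT', or None (no usable bands) from gel band sizes."""
    def has_band(target):
        return any(abs(size - target) <= tolerance_bp for size in band_sizes_bp)

    uncut = has_band(UNCUT_BP)
    cut = has_band(CUT_BP)
    if uncut and cut:
        return "CT"   # one allele cut, one uncut
    if uncut:
        return "CC"   # neither allele carries the HinfI site
    if cut:
        return "TT"   # both alleles cut
    return None       # failed amplification or unreadable lane

print(call_genotype([198]))       # uncut product only
print(call_genotype([198, 175]))  # both products present
print(call_genotype([175, 23]))   # cut product only
```

A lane with no recognisable band returns `None`, which corresponds to a specimen that could not be amplified rather than to a genotype.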
Statistical analysis
We computed the confidence intervals for proportions using the Wilson score method without continuity correction.11,12 Deviation from Hardy-Weinberg equilibrium was tested by chi-square analysis. In addition to allele frequencies, we present genotype frequencies for the homozygous TT genotype (two 677C>T alleles), the heterozygous CT genotype (one 677C>T allele), and the homozygous CC genotype (no 677C>T alleles).
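As an illustration of the interval method named above, here is a minimal sketch of the Wilson score interval without continuity correction, together with the T allele frequency computed from genotype counts. The counts used in the example are hypothetical and are not taken from the study tables.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval (no continuity correction) for successes/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

def t_allele_frequency(n_cc: int, n_ct: int, n_tt: int) -> float:
    """T allele frequency: each CT carries one T allele, each TT carries two."""
    total_alleles = 2 * (n_cc + n_ct + n_tt)
    return (n_ct + 2 * n_tt) / total_alleles

# Hypothetical area: 500 newborns, of whom 50 are TT homozygotes.
low, high = wilson_interval(50, 500)
print(f"TT prevalence 10.0%, 95% CI {low:.1%} to {high:.1%}")

# Hypothetical genotype counts CC=245, CT=210, TT=45.
print(t_allele_frequency(245, 210, 45))
```

Unlike the simple normal approximation, the Wilson interval is not symmetric around the observed proportion and stays within 0 to 1, which matters for the low genotype frequencies seen in some groups.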
RESULTS
We present data on 7130 newborns from 16 areas in the Americas, Europe, Russia, China, and Australia. Amplification rates for blood spots by geographical source were as follows: Italy, Sicily 89%; Italy, Campania 76%; Italy, Veneto 73%; Spain 69%; France, Strasbourg 78%; Finland, Helsinki 95%; Hungary 95%; Russia, Moscow 95%; Australia, New South Wales 79%; Canada, Alberta 77%; USA, Atlanta 96%. By comparison, a large Irish population based study using newborn blood spots successfully genotyped 85% of collected samples.13
Prevalence by geographical area and ethnicity
The allele and genotype distribution, by area and ethnicity, is presented in table 1. The prevalence of the homozygous TT genotype (two 677C>T alleles) is visually summarised in fig 1 for those groups with at least 50 subjects tested.
The distribution of the 677C>T allele showed regional and ethnic variations. For example, the prevalence of the homozygous TT genotype was 10–12% in several areas in Europe (for example, Spain, France, and Hungary). However, the prevalence appeared to be lower (4% and 6%, respectively) in Finland, Helsinki and the northern Netherlands, whereas in some areas in southern Europe it was much higher (26% and 20% in Campania and Sicily, respectively). In the Americas, the frequency of the homozygous TT genotype was higher in Mexico (32%), intermediate in Atlanta (11% among whites), and somewhat lower in Alberta (6%). In Australia, TT prevalence was 7.5% among whites.
Genotype varied by ethnicity as well as by geographical location. For example, TT homozygosity was more common among newborns from Mexico or those born in Atlanta of Hispanic origin, intermediate among newborns of European ancestry (for example, in Europe and North America), and lower among newborns of African ancestry (for example, in Atlanta and Veneto, Italy). However, a range of genotype frequencies was evident even within broad ethnic groups. For example, TT homozygosity among whites ranged from as low as 6% in Alberta (Canada), to 7.5% in New South Wales (Australia), to 11% in Atlanta (USA), to the high values already noted for Italy. For other ethnic and racial groups, such estimates are more unstable because of the smaller number of specimens, but it is worth noting an apparently low frequency of TT homozygosity among newborns of Asian origin from Australia and Atlanta.
The observed distribution of the three genotypes (CC, CT, TT) in most areas was similar to that expected under Hardy-Weinberg equilibrium (table 2). This was true also among males and females separately (data not shown). We found a relative excess of TT homozygotes in Campania, Italy, and an excess of heterozygotes in Helsinki, Finland and southern China (p<0.05).
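The comparison with Hardy-Weinberg expectations can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The implementation below is an assumed, minimal version of such a test; the genotype counts in the example are hypothetical and do not come from table 2.

```python
import math

def hardy_weinberg_test(n_cc: int, n_ct: int, n_tt: int):
    """Chi-square test (1 df) of observed genotype counts against
    Hardy-Weinberg expectations. Returns (expected counts, chi2, p value)."""
    n = n_cc + n_ct + n_tt
    q = (n_ct + 2 * n_tt) / (2 * n)  # T allele frequency
    p = 1 - q                        # C allele frequency
    if q <= 0 or q >= 1:
        raise ValueError("test is undefined when only one allele is observed")
    observed = {"CC": n_cc, "CT": n_ct, "TT": n_tt}
    expected = {"CC": n * p * p, "CT": 2 * n * p * q, "TT": n * q * q}
    chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value

# Hypothetical sample with more heterozygotes than expected.
expected, chi2, p_value = hardy_weinberg_test(40, 60, 0)
print(expected, round(chi2, 2), p_value < 0.05)
```

An excess of heterozygotes and an excess of homozygotes both inflate the same statistic, so the direction of a deviation has to be read from the observed and expected counts rather than from the p value.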
DISCUSSION
We documented distinctive geographical and racial/ethnic variation in the prevalence of the 677C>T allele of the MTHFR gene among a large international sample of newborns. The several fold variation in the prevalence of the TT homozygous genotype across the study areas (fig 1) was also consistent, in some areas, with the presence of geographical gradients. In Europe, for example, the prevalence of the TT genotype increased in a roughly southerly direction, from low values in the north (4–7% in Finland, Helsinki, northern Netherlands, and Russia), to intermediate values (8–10%) in France and Hungary, to higher values in southern Europe (12–15% in Spain and northern Italy), peaking in southern Italy (20–26% in Campania and Sicily). In North America, the frequency of TT homozygotes increased from western Canada (Alberta) to south eastern United States (Atlanta) and peaked in Mexico.
Ethnic variation was apparent among and within geographical areas. In metropolitan Atlanta, for example, TT homozygosity was common among newborns of Hispanic origin (15%), intermediate among those of European origin (11%), and low among African-American newborns (3%). These data are consistent with the high prevalence of TT homozygosity among newborns from Mexico in this study and with published data from the population based sample of babies of Mexican ancestry from California.14 The low prevalence among US blacks is similar to that reported in pooled estimates of five studies on US blacks and three studies from sub-Saharan Africa3 as well as in later studies from South Africa and Zimbabwe.15,16 The intermediate prevalence among whites in Atlanta is consistent with similar rates observed in several European areas in this and several other studies.3 However, more detailed comparisons are difficult because of the misclassification and imprecision of such ethnic labels.
Of note is the finding in Australia of a lower prevalence of the TT genotype among whites (7.5%) compared to previous reports.17 Also, we noted a relatively low prevalence of the TT genotype (5.8%) among the random sample of white newborns in Alberta (Canada), compared to the frequency (11%) reported in a previous study from Quebec (Canada).18 The latter study differed from ours in that newborns were enrolled from a single university hospital in Montreal and were selected, by design, so that their birth weights were at or above the 10th centile.18
The high frequency of TT homozygotes observed in this study among newborns from Mexico, northern China, and southern Italy was notable. These findings confirm and extend those previously reported from Mexico19 and southern Italy.20 Why such high rates of TT homozygosity occur in these regions is unclear, given the apparently limited ethnic, genetic, or environmental commonalities among such areas. Researchers have suggested the possibility of heterozygote advantage with respect to the risk for neural tube defects.21 However, such a hypothesis remains unconfirmed. Nevertheless, further exploration of gene-gene and gene-environment interaction might help to identify the evolutionary pressures favouring a high prevalence of this gene variant in certain areas and ethnic groups.
The impact of such geographical and ethnic variation on the distribution of disease in the population is unclear. For example, one would predict high rates of neural tube defects, whose risk appears to be increased nearly two-fold in the presence of 677C>T homozygosity,3 in those geographical areas or ethnic groups with a high frequency of this genotype. The evidence supporting such relations is mixed. For example, the data are consistent for Mexico and northern China, which not only have a very high frequency of the TT genotype but also high rates of neural tube defect.22,23 Furthermore, within China, rates of neural tube defect are higher in the north (where the TT homozygous genotype is more common) than in the south.23 In the United States, the rates of neural tube defects historically have been higher among Hispanics, intermediate among non-Hispanic whites, and lower among African-Americans, a trend that follows the relative frequency of the TT homozygous genotype.
There are, however, notable exceptions. In southern Italy, for example, the TT genotype is common, but the rate of neural tube defects is not particularly high.22 Nevertheless, such exceptions are not entirely unexpected, because environmental and nutritional factors are likely to modulate considerably the genetic risk for neural tube defects. In fact, these exceptions might prove particularly valuable when investigating the aetiological heterogeneity and the role of interactions in the occurrence of neural tube defects.
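As a rough illustration of how genotype frequency could translate into population impact, the sketch below applies the standard population attributable fraction formula, PAF = p(RR - 1) / (1 + p(RR - 1)), to TT prevalences of the order reported here. It assumes a relative risk of exactly 2 for TT homozygotes compared with all other genotypes, taken loosely from the "nearly two-fold" figure above, and ignores the environmental and nutritional modifiers just discussed; the resulting numbers are illustrative only and are not estimates from this study.

```python
def attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Population attributable fraction for an exposure (here, the TT
    genotype) with the given prevalence and relative risk."""
    if not 0 <= prevalence <= 1:
        raise ValueError("prevalence must be between 0 and 1")
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)

ASSUMED_RR = 2.0  # assumption: TT versus all other genotypes

# TT prevalences of the order reported for Helsinki, Atlanta whites, and Mexico.
for label, tt_prevalence in [("4%", 0.04), ("11%", 0.11), ("32%", 0.32)]:
    paf = attributable_fraction(tt_prevalence, ASSUMED_RR)
    print(f"TT prevalence {label}: attributable fraction {paf:.1%}")
```

Because the formula is driven by prevalence as much as by relative risk, the same genotype effect would account for a much larger share of cases in a high frequency population than in a low frequency one, provided the relative risk were constant across populations.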
Similar analyses are possible with respect to other outcomes. For example, recent meta-analyses showed associations of the TT genotype with ischaemic heart disease, deep venous thrombosis, and perhaps stroke.24 Like neural tube defects, these health outcomes are subject to interacting risk factors and therefore the relation between genotype and outcome at a group level is likely to be complex. Nevertheless, researchers seeking to understand such relations might find data such as these on the geographical and ethnic variation of the 677C>T allele helpful.
On a population level, the genotype distribution associated with the T allele was generally consistent with Hardy-Weinberg expectations. However, a few significant deviations did occur, mostly at the ends of the frequency spectrum. An excess of TT homozygotes was observed in southern Italy (Campania), where the allele was common, whereas the reverse was observed in Finland (Helsinki) where the allele was uncommon. Though these two deviations from Hardy-Weinberg expectations could be the result of chance and multiple statistical testing, they might also suggest the presence of local selective pressures.
In interpreting the findings of this study, one should consider its strengths and limitations. Although we attempted to draw unbiased, systematic samples of newborns from defined populations, sampling strategies varied across areas, and one cannot be certain that the efforts were always entirely successful. We provide details on sampling procedures as guidance to readers who wish to use part or all of these data. Dealing with race and ethnicity was also a difficult but inescapable challenge. Classifications based on self report and particularly on the birth place of the parent or last name are unsatisfactory to varying degrees. Thus, our data (table 1) are either stratified into two groups (the main ethnic group and all other groups combined) or presented only for the major ethnic group (for example, Han Chinese). While this approach does not solve the difficulties entirely, it decreases the misclassification inherent in defining the many smaller ethnic groups that coexist in many areas. Other limitations of this study include the lack of coverage from many areas of the world, including most of Africa, the Middle East, Latin America, and the Indian subcontinent.25
Another challenge of this study was addressing measurement error in genotyping. One might speculate, for example, that deviations from Hardy-Weinberg equilibrium may be the result of genotyping errors. However, inter-laboratory consistency and quality control measures showed remarkable agreement among laboratories. In addition, the same laboratory that assayed samples from areas showing deviations from Hardy-Weinberg equilibrium also assayed the samples from many areas not showing such deviations, suggesting no systematic laboratory error.
A strength of the study was the ability to systematically assemble relatively large samples of newborns using explicit sampling protocols. Measures were also taken to ensure the reliability and comparability of genotypic data across laboratories, including quality control protocols that involved blind retesting of results and exchange of specimens.
Data from studies such as these can serve several purposes. Geneticists could find them useful when evaluating the distribution of genetic variation in human populations and its role in genetic susceptibility to disease. For example, population data might help geneticists reassess controversial associations such as that between MTHFR genotypes and risk for Down syndrome,26–29 for which the evidence favouring the association was largely derived from comparisons with convenience samples of controls. As discussed previously, these genetic data can help to interpret prevalence gradients of disease, such as the well known geographical gradients of neural tube defect occurrence. Similarly, the large body of data on other outcomes, such as other birth defects, pregnancy complications, certain cancers, adult cardiovascular disease, and certain psychiatric disorders,3–8 could be called upon to interpret the prevalence gradients noted in this and other studies. Our data are offered as a contribution to such investigation.
Population data on the 677C>T variant might also help population and public health geneticists assess the potential impact of preventive measures based on environmental modifications. For example, some adverse biochemical effects of the thermolabile enzyme coded by the T allele, such as the increase in total plasma homocysteine, appear to be reversible by increasing the consumption of the B vitamin folic acid.30 If the effect of folic acid varies by genotype, then the overall impact in the population of fortification or supplementation programmes might vary predictably once the genotype distribution is known.
Finally, a practical outcome of this collaborative study was to show the feasibility of conducting such genetic surveys using existing networks of hospitals, birth defect registries, and research institutions. Other research groups have carefully selected and examined large and representative samples of newborns from single states or countries (for example, California14 and Ireland13) and generated genotype prevalence data. We tried to expand such efforts to an international scale, and suggest that, with appropriate planning, such international networks can use their access and experience in community based studies to provide core data on the population distribution of common gene variants. These data in turn can serve as the foundation for studies of genetic variation and its role in increasing or decreasing disease risk.
Acknowledgments
The study was supported in part by grant U50/CCU207141 from the US Centers for Disease Control and Prevention. The work by Generoso Andria (Napoli) was partially supported by grants No 97.03983, 98.02936, and 99.2368 from CNR Rome. We also gratefully acknowledge the contribution of the following researchers: Australia: Paul Lancaster; Canada (Alberta): Brian Lowry; China: Wanyin Shen, Youjun Gao, Fei Deng, Chunna He, Shuqin Zhang, Yabin Liu, Guangfeng Ju, Hua Dong, Zhongyue Yuan, Dongping Ye, Ping Bai, Yuqin Zhang, Li Jin, Qian Gao; Hungary: László Tímár (National Centre of Health Promotion, Budapest); Israel: Paul Merlob (Beilinson Medical Center, Petah Tikva); Italy (Campania): Roberta Arsieri and Carmela Cafasso (Birth Defects Register of Campania); Italy (Napoli): Roberto Brancaccio and Anna Buoninconti (Department of Paediatrics, Federico II University, Napoli); Italy (Veneto): Luciano Marcazzo’ (Paediatrics Unit, Arzignano Hospital, Vicenza); Mexico: Marcela Vela (Reproductive Health Agency of Ministry of Health); Russia (Moscow): Swetlana Kalinenkova (Neonatal screening programme); Spain: P Aparicio, F Ariza, I Arroyo, A Ayala, F Barranco, M Blanco, J M Bofarull, M J Calvo, R Calvo, A Cárdenes, S Castro, C Contessotto, M T Cortés, F Cucalón, J Egüés, M J Espinosa, V Felix, E Fernández, A Foguet, J M Gairi, E Galán, A García, M J García, M M García, J L Gomar, H Gómez, F Gómez, J Gómez-Ullate, J González de Dios, P Gutiérrez, F Hernández, H Huertas, N Jiménez-Muñoz, M M Lertxundi, A Lara, J A López, E Mancebo, J J Marco, A Martínez, N Martínez, G Martínez, S Martínez, V Marugán, C Meipp, A Moral, M C Morales, A Moussallem, I Mújica, M J Oliván, L Paisán, M Pardo, A Peñas, J L Pérez, I Puig, I Riaño, C Ribes, M J Ripalda, J Rosal, L Rota, J Rubio, A Sanchis, M Silveira, M E Suárez, J M de Tapia, M C Tauler, L Valdivia, M S Vázquez; USA (Atlanta, Georgia): Jennifer Rapier and Vicki Brown (CDC), Muthukrishnan Ramachandran (Georgia Public Health Laboratory, 
Atlanta).
Correction
Please note that there is an error in the author list: the name of Dr Renlund is spelt incorrectly. The correct author list is shown here:
B Wilcken, F Bamforth, Z Li, H Zhu, A Ritvanen, M Renlund, C Stoll, Y Alembik, B Dott, A E Czeizel, Z Gelman-Kohan, G Scarano, S Bianca, G Ettore, R Tenconi, S Bellato, I Scala, O M Mutchinick, M A Lopez, H de Walle, R Hofstra, L Joutchenko, L Kavteladze, E Bermejo, M L Martinez-Frias, M Gallagher, J D Erickson, S E Vollset, P Mastroiacovo, G Andria, and L D Botto