

Evidence for a gene influencing haematocrit on chromosome 6q23–24: genomewide scan in the Framingham Heart Study
  1. J-P Lin1,
  2. C J O’Donnell2,3,
  3. D Levy2,4,
  4. L A Cupples5
  1. 1DECA/NHLBI/NIH, 6701 Rockledge Dr, Suite 8110, Bethesda MD 20892, USA
  2. 2NHLBI/Framingham Heart Study, 73 Mount Wayte Avenue, Suite 2, Framingham, MA 01702, USA
  3. 3Cardiology Division, Massachusetts General Hospital, Boston, MA 02114, USA
  4. 4Cardiology Division, Boston University School of Medicine, Boston, MA 02115, USA
  5. 5Department of Biostatistics, Boston University School of Public Health, 715 Albany Street, TE422, Boston, MA 02118, USA
  Correspondence to:
 Dr J-P Lin
 NHLBI/NIH, 6701 Rockledge Dr, Suite 8110, Bethesda, MD 20892–7938, USA


For more than 40 years, a number of studies have revealed that high haematocrit (HCT) levels are associated with increased risk for cerebrovascular disease,1–3 cardiovascular disease (CVD),4–6 peripheral vascular disease,7–9 and all cause mortality.5,10,11 During 34 years of follow up in more than 5200 individuals, a Framingham investigation demonstrated that increased HCT was significantly associated with increased risks for CVD, coronary heart disease, and myocardial infarction in both men and women.5 A significant increase in all cause mortality for individuals with very low or high HCT was also observed.5 Although HCT levels were related to other vascular risk factors, the risk associated with an elevated HCT persisted after accounting for other risk factors for cardiovascular and cerebrovascular events, and for all cause mortality.5,12

HCT is the percentage of whole blood volume made up of red blood cells, and is a compound measure of red blood cell number and size. Haemoglobin (HGB) is an abundant protein within red blood cells and serves as their main oxygen carrying component; HCT and HGB are therefore strongly correlated. From the rheological viewpoint, blood viscosity depends largely on HCT. There is an inverse relationship between viscosity and vascular blood flow;13 high HCT thus hampers organ perfusion.

Twin studies in healthy humans have suggested that HCT variation is partly determined by genetic factors with heritability estimated at 40−65%.14–16 A number of gene products are known to be involved in erythropoiesis, most notably erythropoietin. However, the genes that determine an individual’s normal HCT level in the general population are unknown. A genome scan to map genes controlling HCT in the spontaneously hypertensive rat indicated a significant association between a marker on chromosome 4 and the observed variability of HCT.17 No association was found between HCT and erythropoietin, which was mapped to chromosome 12 in rat.17 So far, no linkage analysis of HCT in humans has been reported. We thus report one of the first linkage studies of HCT in the Framingham Heart Study, with the goal of identifying chromosomal regions that may contain quantitative trait loci (QTL) involved in controlling HCT. Because HCT and HGB are strongly correlated, we also carried out a genome scan on HGB. Finally, we conducted a bivariate linkage analysis of HCT and HGB.


The Framingham Heart Study, a population based study, began in 1948 with the recruitment of 5209 residents aged 28–62 years (mean age 44.1) from Framingham, Massachusetts.18 The participants have undergone biennial examinations since the study began. In 1971, the Framingham Offspring Study19 was started, in part to evaluate the genetic components of cardiovascular disease aetiology. In total, there were 5124 subjects aged 5–70 years (mean age 36.3) including the offspring of the original cohort and the spouses of the offspring. The offspring cohort has been examined every 4 years (except the first two examinations, with 8 years intervening). Within the study, the 330 largest extended families were selected for a 10 cM density genomewide scan (399 markers). The number of subjects genotyped was 1702. We used measurements from offspring cohort examination 1 and original cohort examination 12 for our genome scan. Both examinations were conducted in the early 1970s. As HGB was measured at offspring cohort examination 1 but not at original cohort examination 12, we could only carry out a genome scan on HGB using the offspring cohort. For a better comparison, we also carried out a genome scan on HCT using the offspring cohort only. Finally, we carried out a bivariate genomewide linkage analysis of HCT and HGB in the offspring cohort only.

Key points

  • Elevated haematocrit (HCT) levels are associated with increased risk for vascular diseases. We carried out genome scans for quantitative trait loci (QTL) on HCT and on haemoglobin (HGB), a correlated trait.

  • The heritabilities were estimated as 41% for HCT and 45% for HGB.

  • The genomewide linkage analysis revealed evidence of significant linkage for HCT to chromosome 6q23-24, with a lod score of 3.4 at location 136 cM. Only one other region in the genome, chromosome 1, produced a multipoint lod score >1.5 (lod for this region was 1.7).

  • The results of the HGB genome scan were different from those of HCT; there was no evidence of linkage for HGB to the same chromosome 6q region.

  • Bivariate linkage analysis also did not support QTL pleiotropy in this chromosome area; however, bivariate analyses provided evidence of QTL pleiotropy to chromosome 9q, with a lod score of 3.1 at location 149 cM.

  • We conclude that chromosome 6q may harbour a gene that is specific to HCT but not HGB, whereas a shared gene for both traits may lie on chromosome 9q.

HCT was measured by the Wintrobe method. Blood was collected and spun at 5000 rpm for 20 minutes in a balanced oxalate tube. The percentage of total blood volume due to red blood cells was determined visually against a calibrated scale. Subjects were weighed in light clothing and with shoes removed. The average number of cigarettes smoked per day over the prior year was based on self reports. Alcohol consumption was reported by subjects as their usual number of drinks per day and converted to fluid ounces/week for analysis. Laboratory measurements were made on 12 hour fasting venous blood samples that were collected in tubes containing 0.1% EDTA. Lipid determinations were performed at the Framingham Heart Study laboratory, which participates in the Standardization Program of the Centers for Disease Control. All subjects provided informed consent prior to each clinic visit and the examination protocol was approved by the Institutional Review Board at Boston Medical Center, Massachusetts. The clinical and laboratory methods have been detailed elsewhere.18

Genomic DNA was isolated from nucleated blood cells. DNA samples were sent to the Marshfield Mammalian Genotyping Service. At an average 10 cM density, 399 microsatellite markers (screening set 9)20 covered the genome, with an average marker heterozygosity of 0.77. The genotyping data were cleaned in two steps. Firstly, the sibling kin program in Aspex was used to verify family relationships based on all available markers. Secondly, the GENTEST program, a precursor of INFER created by the Southwest Foundation for Biomedical Research, was used to identify and eliminate additional genotype inconsistencies. When inconsistencies were found, the genotype values in all members of the nuclear family were set to missing.
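The second cleaning step can be illustrated with a small sketch. This is not the GENTEST or INFER implementation, whose internals are not described here; it is a hypothetical Mendelian consistency check in Python, with invented function names and allele labels, that applies the rule stated above: if any child is inconsistent with the parents at a marker, the genotypes of the whole nuclear family are set to missing.

```python
from itertools import product

def mendel_consistent(child, father, mother):
    """True if the child's genotype can be formed from one paternal and one
    maternal allele. Genotypes are 2-tuples of allele labels; None means
    missing and is treated as uninformative (assumption for this sketch)."""
    if child is None or father is None or mother is None:
        return True
    target = sorted(child)
    return any(sorted((a, b)) == target for a, b in product(father, mother))

def clean_nuclear_family(genotypes, father, mother, children):
    """Set every member's genotype to None when any child is inconsistent
    with the parents. Returns True if the family was flagged."""
    flagged = any(
        not mendel_consistent(genotypes[c], genotypes[father], genotypes[mother])
        for c in children
    )
    if flagged:
        for member in [father, mother, *children]:
            genotypes[member] = None
    return flagged

# Invented microsatellite allele sizes at one marker (illustrative only).
family = {
    "father": (150, 154),
    "mother": (152, 156),
    "child1": (150, 152),  # consistent with the parents
    "child2": (158, 152),  # allele 158 is carried by neither parent
}
flagged = clean_nuclear_family(family, "father", "mother", ["child1", "child2"])
```

In this example the second child carries an allele absent from both parents, so all four genotypes at the marker are blanked rather than only the offending one.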

Variation in HCT from known factors was identified and removed by regression modelling incorporated in SOLAR, to enhance the ability of linkage analysis to detect genetically determined variation using a maximum likelihood based variance decomposition method.21,22 The covariates selected (p<0.05) and incorporated into both the heritability estimation and the linkage analyses were age, sex, weight, total cholesterol, high density lipoprotein (HDL) cholesterol, triglyceride, diabetes, smoking (number/day), and alcohol intake (ounces/week). After adjustment for covariates, the variance component method used the residual variation for heritability estimation and linkage analyses. The same covariates were applied in the HCT and HGB genome scans restricted to offspring cohort examination 1 and in the bivariate genome scan of HCT and HGB.
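The idea of removing covariate effects before linkage analysis can be sketched as an ordinary least squares regression followed by extraction of residuals. SOLAR estimates covariate effects within its maximum likelihood model, so the two-stage version below is only an approximation for illustration; the function name and the data values are invented.

```python
def ols_residuals(y, covariates):
    """Residuals of y after least squares regression on an intercept plus
    the given covariate columns (a list of equal-length lists)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in covariates] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return [y[i] - sum(b * x for b, x in zip(beta, X[i])) for i in range(n)]

# Invented values for six subjects (not study data).
hct = [41.0, 46.5, 39.8, 47.2, 44.1, 42.3]   # haematocrit, %
age = [30.0, 40.0, 50.0, 60.0, 45.0, 35.0]   # years
sex = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]         # 0 = female, 1 = male
residuals = ols_residuals(hct, [age, sex])
```

The residuals sum to zero and are uncorrelated with each covariate, so any familial resemblance remaining in them cannot be attributed to the adjusted factors.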

An estimate of heritability was obtained using the variance component method. Residual heritability is the proportion of total phenotypic variation due to additive genetic effects, after removing the variation attributable to covariates. Variance component linkage analysis was used for the linkage analysis between random DNA markers covering the entire genome and HCT or HGB (or HCT and HGB for bivariate analysis), adjusted for known covariates. This approach made use of all information in pedigrees of any size and structure. Marker allele frequencies were estimated from the study participants and then used to estimate the proportion of shared alleles that were identical by descent (IBD) among all relative pairs. A likelihood ratio test was used to evaluate linkage by comparing a purely polygenic model (without consideration of genetic marker information) to a model that incorporates IBD information at the marker. The lod score was the log10 of the ratio of the likelihoods of two models, one purely polygenic versus one that also included IBD information at the marker. Because bivariate linkage analysis may improve both power and localisation of a shared gene for correlated quantitative traits,23 we carried out a bivariate genome scan on HCT and HGB. For bivariate analysis, lod scores are reported with one degree of freedom; equivalent lod scores are comparable to univariate lod scores. For evaluation of significance, we used the genomewide significant p values suggested by Lander and Kruglyak.24 The total phenotypic correlation between HCT and HGB was estimated taking the family structure into account:

$$r_p = h_1 h_2 r_g + \sqrt{(1 - h_1^2)(1 - h_2^2)}\; r_e$$

where $r_p$, $r_g$, and $r_e$ are the total phenotypic, genetic, and environmental correlations, and $h_1$ and $h_2$ are the square roots of the heritabilities of HCT and HGB.
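The likelihood ratio comparison described in the methods can be made concrete with a short Python sketch. The first function follows the definition given in the text (log10 of the likelihood ratio). The second converts a lod score to a pointwise p value under the assumption, commonly made for variance component linkage tests but not stated in this article, that the likelihood ratio statistic follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom. The function names and log-likelihood values are hypothetical.

```python
import math

def lod_score(loglik_polygenic, loglik_linkage):
    """lod = log10 of the likelihood ratio (linkage model over polygenic
    model), computed from natural-log likelihoods."""
    return (loglik_linkage - loglik_polygenic) / math.log(10)

def pointwise_p(lod):
    """Approximate pointwise p value, ASSUMING a 50:50 mixture of a point
    mass at zero and chi-square(1 df) for the likelihood ratio statistic."""
    if lod <= 0:
        return 1.0
    stat = 2 * math.log(10) * lod          # likelihood ratio statistic
    # Survival function of chi-square(1 df) at stat is erfc(sqrt(stat / 2)).
    return 0.5 * math.erfc(math.sqrt(stat / 2))

# Hypothetical natural-log likelihoods of the two models.
example_lod = lod_score(-3120.40, -3112.57)
example_p = pointwise_p(example_lod)
```

A pointwise p value of this kind is not the same quantity as a genomewide p value, which also accounts for testing along the whole genome.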


The total number of individuals with measured HCT and with all covariates (age, sex, weight, smoking, alcohol consumption, total cholesterol, high density lipoprotein cholesterol, triglycerides, and diabetes) used for the heritability estimates and linkage analysis in the original cohort and offspring was 2278 (the actual number of individuals considered in linkage analysis). The total number of individuals in the offspring cohort only with measured HCT and HGB and all covariates was 1444. The mean values of the clinical covariates of those individuals (about 50% male) are displayed in table 1. Of the 2278 individuals from the original and offspring cohorts, 1524 had marker genotypes; these included 1323 full sibling pairs, 52 half sibling pairs, 645 cousin pairs, and 354 avuncular pairs. Among the 1444 individuals in the offspring cohort, 1213 (1124) individuals had HCT (HGB) and all covariates measured and had genotypic data available, including 1245 (1109) full sibling pairs, 52 (50) half sibling pairs, 631 (522) cousin pairs, and 78 (62) avuncular pairs.

Table 1

  Characteristics of the 2278 individuals with HCT and all covariates measured (offspring and original cohort) and 1444 individuals with HCT, HGB, and all covariates measured (offspring only) used in the linkage analysis

The mean (SD) estimates of skewness and kurtosis of the HCT distribution (including offspring cohort examination 1 and original cohort examination 12) were 0.033 (0.028) and 0.283 (0.055), respectively. As a rough measure, normality could not be rejected.25 Bivariate genetic analysis indicated that HCT was both genetically and environmentally correlated with HGB (rg = 0.85, p = 0.02; re = 0.97, p<0.01). The genetic correlation, rg, was significantly different from 1 (p<0.01). The total phenotypic correlation coefficient between HCT and HGB (offspring examination 1 only) was 0.95 (p<0.01).

The heritability estimate for HCT, after adjusting for the covariates, was 41%, indicating that a substantial portion of the variation in HCT was attributable to additive genetic factors. The proportion of variance due to all covariates included in the model (age, sex, weight, total cholesterol, HDL cholesterol, triglyceride, diabetes, cigarettes per day, and alcohol intake) was approximately 40%. The heritability estimate for HGB, after adjusting for the same covariates, was 45%, and the proportion of variance due to the covariates was 49%.
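As a rough arithmetic check, the reported estimates can be plugged into the standard decomposition of the phenotypic correlation into genetic and environmental parts. The sketch below is illustrative only: it uses the univariate heritabilities (0.41 and 0.45) as stand-ins for the estimates from the bivariate model, which are not reported, so exact agreement with the published correlation should not be expected.

```python
import math

def phenotypic_correlation(h2_1, h2_2, r_g, r_e):
    """r_p = h1*h2*r_g + sqrt((1 - h1^2) * (1 - h2^2)) * r_e, where h2_1 and
    h2_2 are heritabilities and h1, h2 are their square roots."""
    genetic = math.sqrt(h2_1 * h2_2) * r_g
    environmental = math.sqrt((1 - h2_1) * (1 - h2_2)) * r_e
    return genetic + environmental

# Heritabilities 0.41 (HCT) and 0.45 (HGB), r_g = 0.85, r_e = 0.97.
r_p = phenotypic_correlation(0.41, 0.45, 0.85, 0.97)
```

With these inputs the result is about 0.92, of which roughly 0.37 comes from the genetic term and 0.55 from the environmental term. This is close to, though not identical with, the reported phenotypic correlation of 0.95.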

For multipoint linkage analysis of HCT, a maximum lod score of 3.4 (genomic p = 0.02) was observed on chromosome 6q23–24, with the peak location at 136 cM (fig 1). Other than this, there was only one region in the genome, chromosome 1, at 132 cM, with a maximum multipoint lod score of 1.5 or higher (lod score = 1.7) (table 2). From a simulation study, the power to detect a QTL effect size of 20% for lod scores of 2.0, 1.5, and 1.0 or higher was ∼67%, ∼80%, and ∼90%, respectively. Therefore, we decided to report the results with a lod score of 1.5 or higher.

Table 2

 Chromosome regions in genome scans with multipoint lod ⩾1.5

Figure 1

 Multipoint lod scores for HCT on chromosome 6: x axis values are cM; y axis values are multipoint lod scores. Linkage analyses were conducted using multivariable residuals, adjusted for age, sex, weight, total cholesterol, high density lipoprotein cholesterol, smoking and alcohol intake, triglycerides, and diabetes.

Evidence for HCT linkage on chromosome 6 occurred in the region bounded by markers GATA23F08 and GATA32B03. The lod–1 supporting interval (the region corresponding to maximum lod score minus 1) for the HCT QTL spans an 11 cM interval flanked by markers GATA31 and GATA184A08, which are 27 cM apart. When the HCT genome scan was restricted to offspring cohort examination 1 only, the results were similar to those obtained for the combined samples of both cohorts (table 2).
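The lod–1 supporting interval can be read off a multipoint lod curve as the contiguous stretch around the peak where the lod score stays within one unit of the maximum. The Python sketch below shows one simple way to do this on a grid of positions, without interpolation between grid points. The function name is hypothetical and the curve values are invented for illustration; they are not the study's chromosome 6 results.

```python
def lod_minus_one_interval(positions, lods):
    """Return (peak_position, left, right), where left and right are the
    outermost grid positions contiguous with the peak whose lod score is
    at least the maximum lod minus 1."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - 1.0
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[peak], positions[left], positions[right]

# Invented lod curve on a cM grid (illustrative values only).
positions_cm = [120, 125, 130, 133, 136, 139, 142, 147, 152]
lod_curve = [0.8, 1.6, 2.5, 3.0, 3.4, 2.9, 2.3, 1.5, 0.6]
peak_cm, left_cm, right_cm = lod_minus_one_interval(positions_cm, lod_curve)
```

A finer grid, or interpolation between grid points, would give interval bounds that depend less on where the grid happens to fall.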

For the HGB genome scan at offspring cohort examination 1, there was little evidence for linkage of HGB to the chromosome 6q region (136 cM, lod = 0.37). There were several regions with lod scores ⩾1.5 that did not overlap with the findings of the HCT genome scan, except on chromosome 1 (table 2).

In addition to the regions on chromosomes 1 and 6, a bivariate genome scan for HCT and HGB also identified regions on chromosomes 7, 9, 14, and 19. The first three regions were identified in the HGB univariate genome scan also (table 2). While the evidence suggests that the QTL on chromosome 6 may be unique to HCT, we cannot exclude the possibility of QTL pleiotropy in this region using a comparison of the likelihood of the model restricting the QTL correlation to zero to the model in which the QTL correlation was estimated (data not shown). In contrast, the bivariate analysis revealed a substantial increase in the lod score on chromosome 9 from 1.6 to 3.1 (genomic p = 0.045) with a change in position from 136 cM to 149 cM, suggesting pleiotropy in this region for HCT and HGB. The QTL pleiotropy test for the 9q region was significant (p<0.05, data not shown).


Our results revealed significant evidence for linkage of HCT to chromosome 6q23–24 with a lod score of 3.4 using the variance component method, after adjustment for covariates. These results provide evidence of a possible HCT QTL on chromosome 6q23–24.

In the genome scan on HGB, there was minimal evidence for linkage of HGB to the chromosome 6q area. While HCT and HGB are strongly correlated genetically and environmentally, the only common locus in the results of the two genome scans that examined these measures individually was on a chromosome 1 area (132 cM) at which both traits had a lod score suggestive of linkage.

Importantly, the lod score in the bivariate analysis increased substantially in the chromosome 9q region, reaching a significant genomewide p value. Therefore, the region on chromosome 9 may harbour a gene related to both HCT and HGB, while the one on chromosome 6 appears to be more related to HCT. The lod scores in the remaining locations did not reach genomewide significant levels. Our results support the hypotheses that there may be both a major locus on chromosome 9 and polygenic pleiotropy for HCT and HGB.

Within the region of chromosome 6q there are a few candidate genes, including EPB41L2, coding for protein 4.1G, which is a member of the erythrocyte membrane skeletal protein 4.1R (EPB41) gene family,26 and HEBP2, which codes for a putative haem binding protein (138 cM). One form of hereditary persistence of fetal haemoglobin (HPFH) has also been mapped to this area.27,28

Erythropoiesis, the production of red cells by bone marrow, is regulated by the hormone erythropoietin, which binds to the erythropoietin receptor to regulate bone marrow erythroid cell proliferation. The erythropoietin gene maps to human chromosome 7q21,29 while the erythropoietin receptor gene maps to chromosome 19p13.30 We did not find evidence suggestive of linkage to these regions.

The HCT is the proportion of the blood that consists of red blood cells, and the principal components of the red cells are HGB and membrane. The HGB tetramer consists of two pairs of globin polypeptide chains coded by globin genes located in two clusters: α-like genes mapping to chromosome 16pter-p13,31 and β-like genes to chromosome 11p15.5.32 A single molecule of haem is attached to each polypeptide chain. The major haemoglobin of intrauterine life is fetal HGB, or HbF. In adult blood, fetal haemoglobin is present in only a very small number of red cells, the F cells. The predominant HGB of postnatal or adult life is haemoglobin A, which consists of two α-globin and two β-globin chains. Mutations in α-globin and β-globin genes cause thalassaemias, sickle cell anaemia, methaemoglobinaemias, and erythraemias. No evidence of linkage was found in these two globin gene cluster areas.

Red cell membrane is composed of a lipid bilayer anchored to a network of proteins, the membrane skeleton, which is important for maintaining red cell shape and regulating membrane properties of deformability. Mutations in genes coding the membrane skeleton cause red cell morphological disorders, such as hereditary spherocytosis and hereditary elliptocytosis. Those genes include spectrin α-chain located on the human chromosome 1q21,33 spectrin β-chain 14q23–24,34 ankyrin 8p11,35 band 3 17q21–22,36 protein 4.1R 1p36–34,37 and protein 4.2 15q15.38 We further confirmed that there was no evidence suggestive of linkage in these chromosome regions.

Protein 4.1R plays an important role in maintaining erythrocyte shape and membrane mechanical properties. Mutations in the protein 4.1R gene cause hereditary elliptocytosis. Recently, another protein 4.1 gene, protein 4.1G, a close homologue of protein 4.1R, was mapped to the human chromosome 6q23,25 where the peak maximum lod score of our HCT genome scan is located. This gene is widely expressed in human tissues. A study indicated that the two proteins, 4.1R and 4.1G, may play fundamental analogous roles but at different intracellular sites.26 The detailed function of protein 4.1G has not yet been reported. Further studies may be warranted to explore whether variation in these and other candidate genes in the linkage region is associated with HCT and related phenotypes.

As there was minimal evidence for linkage of HGB to the chromosome 6q23–24 area, this may imply that the 6q area harbours a gene specific to HCT but not HGB. In addition to HGB, the key component of HCT is red cell membrane. It may be worthwhile to pay special attention to membrane genes as playing a role in our genome scan findings.

A limitation of our study was that our cohort is largely white. Therefore, caution is advised in extrapolating our results to other ethnic groups.

Overall, we found significant evidence for linkage of HCT to chromosome 6q23–24 that harbours several positional candidate genes and significant evidence for joint linkage of HCT and HGB to chromosome 9q34, although we did not find an important candidate gene in this 9q region. Our results support pursuit of fine mapping and association studies between HCT (HGB) and positional candidate genes in these chromosomal regions.


We thank Dr T Tao (National Library of Medicine) for his help in carrying out the candidate gene search.




  • This work was supported by the National Heart, Lung, and Blood Institute’s Framingham Heart Study (Contract No. N01-HC-25195).

  • Competing interests: none declared
