The HBS1L-MYB intergenic region on chromosome 6q23 is a quantitative trait locus controlling fetal haemoglobin level in carriers of β-thalassaemia
C-C So1, Y-Q Song2, S T Tsang1, L-F Tang2, A Y Chan1, E S Ma3, L-C Chan1

1 Department of Pathology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
2 Department of Biochemistry, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
3 Department of Pathology, Hong Kong Sanatorium and Hospital, Hong Kong SAR, China

Correspondence to: Dr C-C So, Department of Pathology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; scc{at}


Background: Fetal haemoglobin (HbF) level modifies the clinical severity of HBB disorders. Intergenic variants of HBS1L-MYB on chromosome 6q23 have recently been shown to be a major quantitative trait locus (QTL) influencing HbF levels in normal Caucasian adults.

Methods: A unique and well-characterised cohort of 238 Chinese subjects with β-thalassaemia trait was used to conduct a single-nucleotide polymorphism (SNP) association study for HbF level.

Results: Within this locus, 29 trait-associated SNPs in a non-coding 56 kb segment were identified. They were divided into five linkage disequilibrium (LD) blocks in the Chinese participants.

Conclusions: The data independently validate for the first time the significance of the HBS1L-MYB intergenic region in regulating HbF expression in a separate ethnic group that has a high prevalence of β-thalassaemia. Functional studies to unravel the biological significance of this region in regulating HbF production are clearly indicated, and may lead to new strategies to modify the disease course of severe HBB disorders.

The β-globin disorders are among the commonest mendelian disorders in humans. Patients with identical HBB mutations can differ in clinical severity, due to disease modifiers both within and outside the β-globin gene cluster, such as concomitant α-thalassaemia and fetal haemoglobin (HbF) (α2γ2) level. Those having a higher HbF level present with less severe disease.

Within the β-globin gene cluster at 11p15, there are several genetic determinants that are associated with increased HbF in adult life. They are collectively termed hereditary persistence of fetal Hb (HPFH) and are inherited in a mendelian fashion, caused either by large deletions of the β-globin gene cluster or point mutations of the two γ-globin gene promoters. These genetic changes, however, are rare and play only a small role in the modification of the β-thalassaemia phenotype. In contrast, the XmnI-Gγ polymorphism at promoter nucleotide -158 (C→T) is more common. There is ample evidence that under conditions of increased erythroid stress (eg, in homozygous β-thalassaemia and sickle-cell disease), the presence of the XmnI-Gγ polymorphism favours a higher HbF response.1 Some β-thalassaemia alleles may themselves be associated with higher HbF output. These include mutations or deletions that involve the HBB promoter, an observation that may reflect the competition between the HBG1, HBG2 and HBB promoters for transcription factors and for interaction with the upstream locus control region.

There are also genetic determinants responsible for enhanced HbF output, which are unlinked to the β-globin gene cluster. Twin studies have shown that around 90% of variation in adult HbF level is genetically controlled,2 and 50–60% of this variation is not linked to the β-globin gene cluster on chromosome 11.3 In a study on several generations of a large family in which a genetic change segregated independently of the β-globin cluster, the locus involved was assigned to chromosome 6q.4 Subsequently, in the same family, another quantitative trait locus for F-cell (adult red cells with high HbF content) levels was assigned to chromosome 8q, the effects of which are conditional on the XmnI-Gγ polymorphism.5 A high-resolution SNP association study at the 6q23 region identified multiple genetic variants that are strongly associated with F-cell levels in normal Caucasians.6 The importance of the 11p15 and 6q23 regions as quantitative trait loci in influencing F-cell production was further shown in a recent genome-wide association study using >300 000 markers, in which an additional QTL at 2p15 was discovered.7

The β-thalassaemias and certain β-haemoglobinopathies (notably HbE) are prevalent in southeast Asia. The spectrum of HBB mutations differs among different ethnic groups. We previously performed a population study that showed that asymptomatic carriage of β-thalassaemia is up to 3% in Hong Kong.8 It is therefore important for us to study genetic modifiers of β-thalassaemia and the regulatory mechanism of HbF production in our population. To assess the role of the 6q23 QTL in HbF modulation in Chinese people, we performed an SNP association study to look for any genetic variants associated with HbF level both in normal subjects and in carriers of β-thalassaemia. We identified 29 SNPs that are associated with HbF level in subjects with β-thalassaemia.


Methods

The study was approved by the institutional review board of Queen Mary Hospital, Hong Kong (IRB Ref. No. UW 07–151). All participants in the control group gave their written informed consent.

Subjects and phenotyping

In total, 238 samples from Chinese subjects (102 male and 136 female; male:female ratio 0.75:1; mean age 28 years, range 1 to 95) were retrieved from our archive for study. The subjects had been referred for investigation of suspected thalassaemia and were subsequently confirmed to have β-thalassaemia trait. Samples were collected over a 9-year period between 1998 and 2006. The control group comprised 93 medical students (53 men, 40 women; male:female ratio 1.33:1; mean age 20 years) recruited prospectively in 2007.

Peripheral blood samples were collected into vials containing EDTA. Haematological data and haemoglobin pattern, including HbF level, were fully characterised. Blood counts, including white cell count with differential counts, red cell indices and platelet count, were performed on one of two closely correlated automated blood cell analysers in routine service (Advia 120, Bayer Corporation, Newbury, Berkshire, UK or GENS, Beckman Coulter, Miami, Florida, USA). Hb A2 and HbF levels were assayed using cation-exchange high-performance liquid chromatography (from 1998 to June 2003, Variant; from July 2003 to 2006, Variant II; both BioRad, Hercules, California, USA).

Globin genotyping

Adequate DNA was available for globin genotyping in 229 of the 238 β-thalassaemia trait carriers and 92 of the 93 medical students.

The α-globin gene deletions were determined by multiplex gap PCR as described for detection of --SEA, -α3.7 and -α4.2.9 Three common non-deletional α2-thalassaemia determinants, Hb Constant Spring (Hb CS), Hb Quong Sze (Hb QS) and codon 30 (ΔGAG), were screened by a multiplex amplification refractory mutation system (ARMS) as described.10 The triplicate α-globin gene was detected by multiplex PCR as previously described.11

Mutations in HBB were screened by PCR-based diagnostic strategies, including one using heteroduplex analysis to detect deletion of CTTT in codons 41 and 42. Multiplex ARMS was used to detect IVSII-nt 654 (C→T), nt -28 (A→G), codon 17 (A→T), codon 43 (G→T) and codons 71/72 (+A) as previously described.12 Direct nucleotide sequencing was used to detect other mutations (3730xl Analyzer; Applied Biosystems, Foster City, California, USA).

The detection of the XmnI-Gγ polymorphism at nucleotide −158 (C→T) was carried out using XmnI restriction-enzyme digestion after PCR amplification of the Gγ promoter.13

Single-nucleotide polymorphism genotyping

In the first round of SNP genotyping, 22 tag-SNPs covering a 287 kb region between MYB and HBS1L on chromosome 6q were selected from the HapMap database for Chinese populations. In the second round of fine mapping, 48 SNPs selected from the HapMap database and recently published data in the 6q23 region6 were genotyped, spanning 79 kb in the HBS1L-MYB intergenic region. Details of the SNPs selected are shown in table 1. Genotypes were determined using a commercial assay (Homogeneous Mass EXTEND assay; Sequenom, San Diego, California, USA). After PCR amplification, non-incorporated dNTPs were removed by shrimp alkaline phosphatase. A detection primer immediately upstream from the polymorphic site was added together with a specific combination of deoxy dTTP and di-deoxy dATP, dCTP and dGTP, and thermosequenase (Amersham Biosciences, Piscataway, USA). The extension products were then analysed by mass spectrometry (Mass Array System; Sequenom). The genotyping call rates for batches 1 and 2 were 95.0% and 95.7%, respectively. Markers violating Hardy–Weinberg equilibrium with a p value <0.001 or with a minor allele frequency <0.01 were excluded from subsequent analyses.
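The marker exclusion rule stated above can be illustrated with a short sketch. This is not the authors' pipeline: it assumes that genotype counts per marker have already been tallied, it uses the textbook 1-df chi-square test of Hardy–Weinberg proportions (the software used in the study may apply an exact test instead), and all function names are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p value for Hardy-Weinberg proportions.

    Arguments are counts of the three genotypes AA, AB and BB.
    Assumption: a simple chi-square approximation, not an exact test.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: no departure can be measured
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1.0 - p)


def passes_qc(n_aa, n_ab, n_bb, hwe_alpha=0.001, maf_min=0.01):
    """Keep a marker unless HWE p < 0.001 or MAF < 0.01 (thresholds from the text)."""
    return (hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha
            and minor_allele_frequency(n_aa, n_ab, n_bb) >= maf_min)
```

For example, a marker with genotype counts 25/50/25 is retained, whereas one with counts 50/0/50 (no heterozygotes) fails the Hardy–Weinberg check and one with counts 99/1/0 fails the minor allele frequency check.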

Table 1 Marker identities, chromosomal positions, genotyping results and association with fetal haemoglobin levels in carriers of β-thalassaemia trait in the first and second rounds of single-nucleotide polymorphism genotyping

Sequencing of an evolutionarily conserved segment within the HBS1L-MYB intergenic region

A 3 kb conserved segment containing the two SNPs most significantly associated with HbF level (rs9483788 and rs6934903) in the HBS1L-MYB intergenic region was sequenced (3730xl Analyzer; Applied Biosystems) in 10 carriers of the β-thalassaemia trait with a high HbF rs9483788 genotype (C/C) and another 10 carriers of the β-thalassaemia trait with a low HbF rs9483788 genotype (T/T).

Statistical analysis

Haemoglobin F values were log-transformed to adjust for positive skewing. Single-marker association and haplotype association with HbF levels were evaluated with Purcell’s PLINK program.14 Correction for multiple testing used the step-up false discovery rate control of Benjamini and Yekutieli15 as implemented in PLINK. D′ was used as a measure of linkage disequilibrium and was computed with Haploview V.4.016 using the default algorithm from Gabriel et al.17
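As an illustration of a single-marker test on a log-transformed trait, the sketch below regresses log HbF on the number of copies of one allele. It rests on several assumptions that the text does not state: an additive (0, 1 or 2 copies) genotype coding, a base-10 logarithm, ordinary least squares, and strictly positive HbF values. The function name is hypothetical and the code is not the PLINK implementation.

```python
import math


def additive_association(allele_counts, hbf_percent):
    """Least-squares regression of log10(HbF) on allele count (0, 1 or 2).

    Returns (slope, intercept, standard_error_of_slope).
    Assumptions: additive coding, log10 transform, all HbF values > 0.
    """
    if len(allele_counts) != len(hbf_percent):
        raise ValueError("inputs must have the same length")
    n = len(allele_counts)
    if n < 3:
        raise ValueError("at least three subjects are needed")
    y = [math.log10(v) for v in hbf_percent]  # reduces positive skew
    x = list(allele_counts)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("marker is monomorphic in this sample")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the standard error of the slope
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, se_slope
```

The ratio of the slope to its standard error gives a t statistic with n − 2 degrees of freedom, from which a p value per marker can be obtained before any correction for multiple testing.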


Results

Red cell phenotypes

Data on red cell indices and HbF and Hb A2 levels in carriers of β-thalassaemia trait and medical students are summarised in supplementary table 1 online. Two carriers of β-thalassaemia trait, one with coexisting iron deficiency and the other with precursor B acute lymphoblastic leukaemia, had an Hb level <70 g/L. One carrier of β-thalassaemia, who was the father of a patient with β-thalassaemia major, had a mean cell volume of 102 fl. His serum bilirubin level was raised and he might therefore have had concomitant liver disease leading to red cell macrocytosis. Three carriers had an Hb A2 level below the cut-off value of 3.5% of total Hb for β-thalassaemia trait. Their thalassaemic status was confirmed by positive detection of the β-globin gene mutation. The lowest Hb A2 level in the β-thalassaemia trait carrier cohort was 1.0%. We could not exclude a co-existing δ-thalassaemia in this subject. Significant differences were seen across all red cell parameters between carriers of the β-thalassaemia trait and medical students using the independent samples t test.

Globin genotypes

The α-globin and β-globin genotypes and the XmnI-Gγ polymorphism status of carriers of the β-thalassaemia trait and the control group are summarised in supplementary table 2 online. One carrier of β-thalassaemia trait had a concomitant triplicate α-globin gene configuration. There was no modification of her phenotype, with an Hb of 106 g/L and HbF of 1%. Of 229 carriers of β-thalassaemia trait tested, 16 (7%) showed double heterozygosity for the α-thalassaemia and β-thalassaemia trait. Of note, there was no significant difference in HbF level between subjects carrying β+ and β0 thalassaemic mutations or between subjects with or without XmnI-Gγ polymorphism (p>0.05). Globin genotyping data were not available for nine subjects (mean HbF 0.52%, range 0.3–1.0%).

Table 2 Linkage disequilibrium (LD) blocks generated from genotypes of 60 informative markers using Haploview V.4.0.

Association between single-nucleotide polymorphism genotypes and fetal haemoglobin expression

The association between SNPs and HbF was analysed in 238 carriers of β-thalassaemia trait. Of 22 SNPs genotyped in the first round, which spanned a region of 287 kb, 21 could be analysed; one SNP with a poor call rate of 0.05 was excluded. Five SNPs were found to be significantly associated with HbF expression (p<0.05 after multiple testing correction, range 0.003–0.039) (table 1). The SNP showing the most significant association with HbF was rs6934903. These five SNPs were located in a 43 kb segment in the HBS1L-MYB intergenic region, starting 39 kb upstream of HBS1L and including the recently described exon 1a of HBS1L6 (fig 1).
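The corrected p values quoted here come from the Benjamini and Yekutieli step-up procedure named in the statistical analysis. A minimal sketch of how such adjusted values can be computed from a list of raw p values is given below. It is an illustration under the standard definition of the procedure, with a hypothetical function name, and is not taken from the PLINK source.

```python
def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p values in the input order.

    For m tests with sorted p values p(1) <= ... <= p(m), the adjusted value
    for rank i is the minimum over ranks j >= i of p(j) * m * c(m) / j,
    capped at 1, where c(m) = 1 + 1/2 + ... + 1/m.
    """
    m = len(pvalues)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that monotonicity is enforced
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m * c_m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted
```

A marker is then declared significant when its adjusted value falls below the chosen threshold (0.05 in this study). Because of the extra factor c(m), the procedure is more conservative than the Benjamini and Hochberg method, but it remains valid when the tests are correlated, as neighbouring SNPs in linkage disequilibrium are.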

Figure 1 Physical positions of genotyped markers in the 6q23 region. The 64 informative SNPs are grouped into 7 haplotype blocks. The horizontal solid line indicates significance threshold (after multiple testing correction) of association with HbF level. Data for carriers of β-thalassaemia trait are shown. The 29 significant SNPs are located within a 47 kb segment in the HBS1L-MYB intergenic region; 27 of them are in blocks 4, 5 and 6.

In the second round of fine mapping of a 79 kb region in the HBS1L-MYB intergenic region, 43 out of 48 SNPs genotyped were informative. Five SNPs had a low call rate and were therefore excluded. In total, 24 SNPs were significantly associated with HbF expression (p value range 0.013–0.030 after multiple testing correction) (table 1). They were located in a 44 kb segment that largely overlapped the 43 kb segment marked by the five significant SNPs identified in the first round (fig 1). Seven non-overlapping LD blocks were generated from genotypes of the total 64 informative SNPs (table 2). Blocks 4, 5 and 6 contained 27 of the 29 significant SNPs identified in the two SNP genotyping rounds.
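The LD blocks referred to above are built from pairwise D′ values between markers. The sketch below shows the standard D′ calculation for two biallelic markers. It assumes that the four haplotype counts are already known; in practice, software such as Haploview has to estimate haplotype frequencies from unphased genotypes, a step that is omitted here. The function name is hypothetical.

```python
def d_prime(n_ab_upper, n_a_upper_b_lower, n_a_lower_b_upper, n_ab_lower):
    """Normalised linkage disequilibrium D' between two biallelic markers.

    Arguments are counts of the haplotypes AB, Ab, aB and ab.
    Assumption: haplotype phase is known (no estimation step).
    """
    total = n_ab_upper + n_a_upper_b_lower + n_a_lower_b_upper + n_ab_lower
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_ab = n_ab_upper / total                      # frequency of haplotype AB
    p_a = (n_ab_upper + n_a_upper_b_lower) / total  # frequency of allele A
    p_b = (n_ab_upper + n_a_lower_b_upper) / total  # frequency of allele B
    d = p_ab - p_a * p_b                            # raw disequilibrium
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one marker is monomorphic
    return abs(d) / d_max
```

A value of 1 indicates that at least one of the four possible haplotypes is absent (no evidence of historical recombination between the markers), and a value of 0 indicates independence; haplotype counts of 40, 10, 10 and 40 give D′ of about 0.6.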

The association was also investigated in the control group. The same 70 SNPs were genotyped. Eight SNPs showed a poor call rate and were excluded. Of the remaining 62 informative SNPs, none was found to be significantly associated with HbF expression after correction for multiple testing.

Sequence variation in an evolutionarily conserved region in block 5

Block 5 contained the two SNPs most significantly associated with HbF level in the two genotyping rounds (rs9483788 and rs6934903). A 3 kb region conserved in mammals was found in this block. Apart from rs9483788, none of the other SNPs genotyped fell within this region. Potential sequence variations were further analysed by direct nucleotide sequencing. The rs9483788 genotype was confirmed in all 10 carriers of the β-thalassaemia trait with a high HbF genotype (C/C) and in 10 carriers of the β-thalassaemia trait with a low HbF genotype (T/T). No other sequence variation was found.


Discussion

In our cohort of Chinese carriers of β-thalassaemia trait, we found that genetic variations in the HBS1L-MYB intergenic region at chromosome 6q23 are associated with HbF expression. These results independently confirm the original findings in normal subjects of North European descent.6 Remarkably, the trait-associated regions in the two studies largely overlap. The 29 SNPs that are significantly associated with HbF level in our study reside in five LD blocks (blocks 2–6) with a total length of 56 kb. These five blocks largely correspond to the most significant HBS1L MYB intergenic polymorphism (HMIP) blocks 2 and 3 in the Caucasian study, including exon 1a of HBS1L. The similar findings in these two studies in different ethnic groups and of different HBB status underpin the importance of this region in regulating HbF expression. Of note, the degree of correlation between SNP genotypes and HbF values in the present study is relatively low (lowest p value 0.003). This modest association is likely to be due to a combination of a relatively small sample size and the confounding effect of the different β-thalassaemia mutations in our cohort, which can exert significant and yet varying modifying effects on erythropoiesis and HbF production. Data from other ethnic populations with and without β-thalassaemia are needed to address this observed difference. None of the SNPs studied in Caucasians is significantly associated with HbF level in normal Chinese people. This difference in observation may be due to the poor sensitivity of the HbF assay in the lower range seen in normal subjects and to the small sample size in this group. Moreover, we cannot exclude an overall lesser contribution of the 6q23 QTL to HbF variation in Chinese subjects compared with Caucasians. A genome-wide association study to identify and assess the relative contributions of different QTLs in normal Chinese subjects is indicated.

The functional significance of the 56 kb HBS1L-MYB intergenic region in HBG1 and HBG2 expression is still not understood. We identified a 3 kb region in block 5 (HMIP block 2+3 and block 3 in Thein et al 6) that is highly conserved in mammals and contains the second most significant SNP identified in this study (rs9483788). No further genetic variation was seen in this region after direct nucleotide sequencing. The sequence around this SNP is a potential binding site for several transcription factors. These include the erythroid transcription factor NF-E2, the transcriptional repressor Matalpha2, which regulates development in yeast, the transcriptional enhancer EFII and the transcriptional repressor Ets-1. Thein et al6 described a 102 bp conserved region at the 5′ end of exon 1a of HBS1L, which may be an alternative promoter of HBS1L. They showed that human erythroid cells with a high HbF-associated SNP haplotype in the HBS1L-MYB intergenic region had increased HBS1L expression in a two-phase liquid culture system. However, a biological link between increased HBS1L expression and enhanced HbF production has not been directly established. HBS1L has not previously been recognised as an erythroid or haemopoietic transcription factor. In contrast, the role of cMYB in erythropoiesis and haemopoiesis has been extensively studied. A high level is seen in immature haemopoietic progenitors18 and favours their expansion, whereas a lower level allows terminal differentiation into erythroblasts.19 Overexpression of ectopic cMYB was found to inhibit γ-globin chain production in transfected K562 cells.20 However, a correlation of MYB expression with HBS1L-MYB intergenic region genotype status could not be shown.6 Further fine mapping and functional studies in the 56 kb HBS1L-MYB intergenic region are necessary to delineate the biological significance and regulatory mechanism of this region on HBS1L and MYB expression and its link to HbF production in humans.

In conclusion, our data independently validate the significance of the HBS1L-MYB intergenic region in regulating HbF expression in a separate ethnic group that has a high prevalence of β-thalassaemia. This underpins the importance of further research on the functional significance of genetic variations in this region. Better understanding of how HbF production is regulated can lead to new strategies to modify the disease course of severe HBB disorders.


Acknowledgements

We thank S Li for performing the β-globin genotyping work.



  • Competing interests: None.

  • Funding: This study was supported by General Research Fund Grant HKU 775307M of the Research Grant Council, Hong Kong.

  • Accession numbers: The National Center for Biotechnology Information (NCBI) Entrez database accession numbers for the genes discussed in this paper are: HBS1-like (HBS1L): GeneID: 10767. V-myb myeloblastosis viral oncogene homolog (MYB): GeneID: 4602. Haemoglobin, beta (HBB): GeneID: 3043. Haemoglobin, gamma A (HBG1): GeneID: 3047. Haemoglobin, gamma G (HBG2): GeneID: 3048
