Analysis of the entire HLA region in susceptibility for cervical cancer: a comprehensive study
  1. M Zoodsma1,
  2. I M Nolte2,
  3. M Schipper2,
  4. E Oosterom2,
  5. G van der Steege2,
  6. E G E de Vries3,
  7. G J te Meerman4,
  8. A G J van der Zee1
  1. Department of Gynaecology, University Medical Centre Groningen, Groningen, Netherlands
  2. Department of Medical Biology, University Medical Centre Groningen
  3. Department of Medical Oncology, University Medical Centre Groningen
  4. Department of Medical Genetics, University Medical Centre Groningen
  1. Correspondence to:
 Professor A G J van der Zee
 Department of Gynaecology, Groningen University Medical Centre, Hanzeplein 1, PO Box 30.001, 9700 RB Groningen, Netherlands


Background: Infection with human papillomavirus (HPV) is the main cause of cervical cancer and its precursor lesion, cervical intraepithelial neoplasia (CIN). Variability in host immunogenetic background is important in determining the overall cellular immune response to HPV infections.

Objective: To determine whether the HLA-DQ or HLA-DR genes, or others in their vicinity, are associated with cervical cancer.

Methods: Markers covering the entire HLA region were genotyped in a large sample of CIN and cervical cancer patients and in controls (311 CIN, 695 cervical cancer, 115 family controls, and 586 unrelated controls).

Results: Two markers were associated with susceptibility to cervical neoplasia, G511525 and MICA. G511525, close to the region containing the HLA-DQ and HLA-DR genes, was most strongly associated, showing a decrease in frequency of allele 221 from 6.7% to 3.3% in patients with squamous cell cancer (SCC). An association was found for MICA (allele 184) with SCC (odds ratio (OR) = 1.31 (95% confidence interval, 1.13 to 1.53); homozygotes, OR = 1.48 (1.06 to 2.06)). No associations were observed with adenocarcinoma or CIN.

Conclusions: There is an association of the region containing the HLA-DQ and HLA-DR genes with the risk of developing squamous cell carcinoma. An increased risk was observed for carriers of allele 184 at the MICA locus, in particular for homozygotes, suggesting a recessive effect.

  • AC, adenocarcinoma
  • CIN, cervical intraepithelial neoplasia
  • HPV, human papillomavirus
  • HSS, haplotype sharing statistic
  • LD, linkage disequilibrium
  • SCC, squamous cell carcinoma
  • SNP, single nucleotide polymorphism
  • HLA
  • cervical cancer
  • genetic susceptibility

Cervical cancer is the second most common cancer in women worldwide.1 Infection with oncogenic types of human papillomavirus (HPV) is the main causal factor of cervical cancer and its precursor lesion, cervical intraepithelial neoplasia (CIN). Although many women are infected with HPV, only a minority will develop CIN or cervical cancer. Cellular immunity may be critical in the elimination of HPV-harbouring cells. The ability to respond to HPV antigens appears to be related to the capacity of infected cells to present viral epitopes (E6 and E7 oncoproteins) effectively to T cells. The variability in host immunogenetic background, such as the human leucocyte antigen (HLA) class I or II type antigens, is an important variable in determining the overall cellular immune response to HPV infections. The MHC region encodes for the HLA molecules, encompasses 3.6 Mb, and is divided into three regions (centromeric to telomeric): class II (1.1 Mb), class III (0.7 Mb), and class I (1.8 Mb).2 The principal function of the highly polymorphic HLA molecules is to bind “non-self” peptide fragments, so that they can be optimally presented to cytotoxic T lymphocytes and natural killer cells.

Many studies have analysed different HLA II type antigens in CIN and cervical cancer susceptibility. These were recently reviewed and analysed in a pooled analysis3 which showed an association with various DQA1, DQB1, and DRB1 alleles. A few studies on HLA I type antigens also revealed associations with HLA-B alleles,4–6 but other studies did not find any associations with HLA-A, HLA-B, or HLA-C.7,8 Recently, Engelmark et al analysed 576 individuals in a sibling pair study and found both linkage and association for the DPB1, DQB1, and DRB1 loci.9 As there is strong linkage disequilibrium (LD) in the HLA region,10 it may well be that other genes in the HLA region are responsible for the association rather than the ones mentioned above. An investigation of the whole HLA region with an extended set of markers should therefore help to answer this question. In our research centre, comparable studies have been carried out in testicular,11 breast,12 and colorectal cancer.13 The study on breast cancer susceptibility revealed an association with the HLA class III subregion.12 No associations were found between the HLA region and testicular11 or colorectal cancer.13

In this study, we analysed the involvement of the HLA region in CIN and cervical cancer susceptibility. Markers covering the entire HLA region were genotyped in a large sample of CIN and cervical cancer patients and in controls. As well as allele, genotype, and haplotype association analyses, we also used the haplotype sharing statistic (HSS), which analyses differences between haplotype sharing in this region among patients and controls.14,15 HSS extracts extra information from phase as compared with association analysis.14,15

Methods

Patients and controls

All patients and controls were part of the white founder population of the northern Netherlands. Patients participated in a population based study that aimed to detect CIN and cervical cancer susceptibility genes. Between November 1999 and August 2002 all CIN and cervical cancer patients who visited our outpatient clinic were asked by their physician to participate. Through the patients, family members (preferably parents or child and spouse) were also asked to participate.

To obtain more power for the association analyses, patients from our outpatient clinic who were included in earlier studies and agreed to the use of their serum in follow up studies were also included in the present study. In addition, 563 spouses of cases of two other comparable studies in different malignancies12,13 were included to complement the family based controls. All DNA samples and data in this study were handled anonymously, and individuals were aware that they would not be informed about individual test results. The study was approved by the medical ethics committee of the hospital. All included subjects gave written informed consent.


DNA was either extracted from 20 ml EDTA blood following standard procedures or from serum by using the QIAamp DNA Midi Kit (Qiagen, Valencia, California, USA) and was stored at −80°C. Subsequently, 22 polymorphic microsatellite markers and two single nucleotide polymorphisms (SNPs) in the HLA region on 6p21 were genotyped in all patients and controls. For detailed marker, primer, and probe information, see table 1. All markers and SNPs used are located in the 3.6 megabase (Mb) MHC region, in particular in the HLA class II region. The marker order was determined by using sequence data, as published by the MHC sequencing consortium.16 In addition, for the genes located in this region and their positions relative to the markers, both the NCBI map and the Celera maps (Celera Genomics, Rockville, Massachusetts, USA) were used.

Table 1

 Marker data and primer sequences

Microsatellite markers were amplified by polymerase chain reaction (PCR), in final reaction volumes of 10 μl, each containing ∼25 ng DNA, 0.5 units Taq DNA polymerase (Amersham Pharmacia Biotech, Uppsala, Sweden), 0.2 mM dNTP (Roche Diagnostics, Mannheim, Germany), 2.5 mM MgCl2, 10 mM Tris-HCl, 50 mM KCl (Amersham Pharmacia Biotech), and 0.25 μM of each primer (with one primer 5′ labelled with a fluorochrome: 6-FAM, HEX (Sigma, Malden, Netherlands), or NED (Applied Biosystems, Foster City, California, USA)). Cycling was carried out on a PTC-225 thermal cycler (MJ Research, Waltham, Massachusetts, USA) and a PrimusHT (MWG Biotech, Ebersberg, Germany). A standard protocol was used for amplification. Post-PCR multiplexing of up to eight markers having compatible amplicon length ranges was undertaken by combining 2 to 10 μl (based on signal strength) of the PCR products, followed by dialysis in 96 well membrane plates against MilliQ. A 2.3 μl sample of the pooled fragments was mixed with 2.5 μl MilliQ and 0.2 μl ET-400R size standard (Amersham Pharmacia Biotech), and separated on a MegaBACE 1000 capillary sequencer (Amersham Pharmacia Biotech) according to the manufacturer’s protocol. Results were analysed using Genetic Profiler version 2.0 (Amersham Pharmacia Biotech).

For the SNPs, reactions were carried out in 5 μl volumes and contained 25 ng DNA, 1× TaqMan Universal PCR Master Mix (Applied Biosystems), 100 nM of each primer, and 900 nM of each probe. Cycling conditions on the ABI prism 7900 HT (Applied Biosystems) were two minutes at 50°C, 10 minutes at 95°C, followed by 40 cycles of 15 seconds at 92°C and one minute at 60°C. End point fluorescence was measured immediately after cycling. Alleles were assigned using SDS 2.0 software (Applied Biosystems).

Scoring of the alleles was blinded to affection status (that is, patient or control) and to family structure.

Statistical methods

After the 22 microsatellite markers and two SNPs had been genotyped in all participants, each marker was first tested for quality using the Hardy–Weinberg equilibrium test. Markers that failed this test were not included in the analyses.
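As an illustration of this quality check, the following is a minimal sketch of a Hardy–Weinberg goodness-of-fit test for a biallelic marker such as a SNP. The function name and the genotype counts are hypothetical, and the sketch is not the software used in this study; a multi-allelic microsatellite would need a test over all genotype classes or an exact test.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic marker.

    Assumes both alleles are observed (otherwise an expected count is zero).
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(n_aa=280, n_ab=240, n_bb=80)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

A marker whose p value falls below the chosen threshold would be flagged for exclusion, as was done here for markers deviating strongly from equilibrium.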

Single locus allele and genotype association analyses were carried out. For these analyses, the allele and genotype counts, respectively, were compared between patients and controls using a χ2 test if all expected counts were greater than five, and Fisher’s exact test otherwise.
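The choice between the two tests can be sketched as follows for a single allele compared against all other alleles (a 2 × 2 table). This is a minimal sketch under that assumption; the function names are hypothetical and the counts in the example are invented for illustration, not taken from the study data.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p value: sums the probabilities of all tables
    with the same margins that are no more probable than the observed one."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        return math.comb(r1, k) * math.comb(r2, c1 - k) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

def allele_association(case_allele, case_other, ctrl_allele, ctrl_other):
    """Chi-square test if every expected count exceeds five, else Fisher's exact test."""
    rows = (case_allele + case_other, ctrl_allele + ctrl_other)
    cols = (case_allele + ctrl_allele, case_other + ctrl_other)
    n = sum(rows)
    min_expected = min(r * c / n for r in rows for c in cols)
    if min_expected > 5:
        return "chi-square", chi_square_2x2(case_allele, case_other, ctrl_allele, ctrl_other)[1]
    return "fisher", fisher_exact_2x2(case_allele, case_other, ctrl_allele, ctrl_other)

# Hypothetical allele counts (cases: allele/other, controls: allele/other).
print(allele_association(40, 960, 70, 930))
```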

LD between the microsatellite markers was determined and haplotype frequencies were estimated by the expectation-maximisation (EM) algorithm using the software package “Arlequin”.17 For these analyses, only individuals with genotypes known for every locus in the analysis were used.
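To show the idea behind EM haplotype estimation, the following is a toy sketch for two biallelic loci, where only double heterozygotes have ambiguous phase. It is not the Arlequin implementation, which handles multi-allelic, multi-locus data; the function name, the genotype coding (number of copies of allele A and allele B per individual), and the example data are all assumptions made for illustration.

```python
def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate two-locus haplotype frequencies by EM.

    genotypes: list of (g_a, g_b), the number of copies (0, 1, 2) of allele A at
    locus 1 and allele B at locus 2. Haplotypes are keyed (a, b), 1 = A or B present.
    """
    base = {(1, 1): 0.0, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.0}
    n_double_het = 0
    for g_a, g_b in genotypes:
        if g_a == 1 and g_b == 1:
            n_double_het += 1            # phase unknown: AB/ab or Ab/aB
        elif g_a != 1:                   # homozygous at locus 1: phase is known
            a = g_a // 2
            base[(a, 1)] += g_b
            base[(a, 0)] += 2 - g_b
        else:                            # heterozygous at locus 1, homozygous at locus 2
            b = g_b // 2
            base[(1, b)] += 1
            base[(0, b)] += 1
    total = 2 * len(genotypes)
    freqs = {h: 0.25 for h in base}
    for _ in range(n_iter):
        # E-step: expected share of double heterozygotes carrying AB/ab.
        cis = freqs[(1, 1)] * freqs[(0, 0)]
        trans = freqs[(1, 0)] * freqs[(0, 1)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-count haplotypes using the expected phase assignment.
        counts = dict(base)
        counts[(1, 1)] += n_double_het * w
        counts[(0, 0)] += n_double_het * w
        counts[(1, 0)] += n_double_het * (1 - w)
        counts[(0, 1)] += n_double_het * (1 - w)
        freqs = {h: c / total for h, c in counts.items()}
    return freqs

# Hypothetical genotypes, for illustration only.
example = [(2, 2)] * 10 + [(0, 0)] * 10 + [(1, 1)] * 10
print(em_haplotype_freqs(example))
```

Restricting the analysis to individuals typed at every locus, as described above, avoids having to treat missing genotypes in the E-step.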

Subgroup analyses were performed for three histological types: squamous cell carcinoma (SCC), adenocarcinoma (AC), and CIN 2 or 3. Subgroup analyses were also undertaken for several risk factors: age at diagnosis (before v after 45 years of age), FIGO stage of the tumour (FIGO I-IIa v IIb-IV), smoking status (smoking v not smoking), having been pregnant (yes v no), age at first pregnancy (before v after 25 years of age), age at first coitus (before v after 18 years of age), and number of partners (1 v >1).

For the trios in our dataset, a set of patient and a set of control haplotypes were determined. When DNA from parents was available for haplotype phase determination of the alleles (n = 31), the non-transmitted haplotypes from the parents were used as controls. If DNA was obtained from a child and a spouse (n = 26), both haplotypes of the spouse were regarded as controls. When only one family member (a parent, child, or sibling) was available (n = 58), the haplotype not present in the patient was used as a control haplotype. This implies that the number of alleles and haplotypes included in the analyses is less than twice the number of genotypes. These two haplotype sets were compared using HSS, a method that analyses the length of haplotype similarity. The validity of this method has been demonstrated previously both in simulation studies and in empirical data.14,15,18 HSS assumes that haplotype segments of patients from a well defined founder population are conserved in the region spanning a disease locus. In contrast, control individuals are not expected to have conserved haplotypes centred at a particular locus. Thus, patients are considered to display excess of haplotype sharing—which is defined as the number of consecutive overlapping intervals between haplotypes—as compared with controls. This excess is expected to be maximal at the marker loci closest to the susceptibility or disease locus. HSS was only applied to 17 of the 26 markers, as some of the markers were not typed in the complete trios of the two other studies that provided extra controls, but only in the spouses. Thus haplotype phase could not be derived for the non-overlapping markers.
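The notion of haplotype sharing used here can be sketched as follows. This is only an illustration of the sharing length around a marker and of its mean over haplotype pairs; it is not the HSS test statistic itself, whose definition and significance testing are given in the cited references.14,15 The function names and the example haplotypes are hypothetical.

```python
from itertools import combinations

def sharing_length(h1, h2, locus):
    """Number of consecutive marker intervals around `locus` over which two
    phased haplotypes carry identical alleles (0 if they differ at `locus`)."""
    if h1[locus] != h2[locus]:
        return 0
    left = locus
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    right = locus
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left

def mean_sharing(haplotypes, locus):
    """Mean sharing length at `locus` over all pairs of haplotypes in a set."""
    pairs = list(combinations(haplotypes, 2))
    return sum(sharing_length(a, b, locus) for a, b in pairs) / len(pairs)

# Hypothetical phased haplotypes (allele sizes at five ordered markers).
patients = [[1, 2, 3, 4, 5], [9, 2, 3, 4, 8], [1, 2, 3, 7, 5]]
controls = [[1, 2, 3, 4, 5], [9, 6, 3, 2, 8], [4, 2, 7, 7, 5]]
for locus in range(5):
    print(locus, mean_sharing(patients, locus), mean_sharing(controls, locus))
```

Under the assumption stated above, an excess of mean sharing among patient haplotypes relative to control haplotypes would be expected to peak at the markers closest to a susceptibility locus.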

To evaluate the effect of the polymorphisms on CIN or cervical cancer risk, odds ratios (ORs) and (corrected) 95% confidence intervals (CIs) were determined without adjustment for external variables such as age at diagnosis or smoking status.

In the single locus allele and genotype association analyses, all patients were included, but for HSS—which requires phased haplotypes—only those with at least one family member were used.

Significant associations are defined by a probability (p) value below 0.05/24 = 0.0021—that is, at a 5% level corrected for multiple tests at 24 markers. The corrected 95% CI corresponds to an uncorrected 99.79% CI.
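The arithmetic behind the corrected threshold and the corrected confidence level can be checked with a short sketch. The confidence interval shown uses the Woolf (log-scale) approximation, which is an assumption made here for illustration, as the interval method is not specified in the text; the counts in the example are hypothetical.

```python
import math
from statistics import NormalDist

N_TESTS = 24
ALPHA = 0.05 / N_TESTS            # Bonferroni-corrected significance threshold

def odds_ratio_ci(a, b, c, d, alpha=ALPHA):
    """Odds ratio for [[a, b], [c, d]] with a Woolf (log-scale) CI at level 1 - alpha.

    a, b: cases with / without the allele; c, d: controls with / without it.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return odds_ratio, odds_ratio * math.exp(-z * se_log), odds_ratio * math.exp(z * se_log)

print(round(ALPHA, 4))                 # corrected p value threshold
print(round(100 * (1 - ALPHA), 2))     # corrected CI level, in percent

# Hypothetical counts: the corrected interval is wider than the uncorrected 95% one.
print(odds_ratio_ci(40, 960, 70, 930))
print(odds_ratio_ci(40, 960, 70, 930, alpha=0.05))
```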

Results

Hardy–Weinberg equilibrium testing showed that the observed genotype frequencies of two markers (D6S2662 and D6S2447) differed highly significantly from the expected ones. They were therefore excluded from the analyses.

In all, 311 CIN patients, 695 cervical cancer patients, 115 family based controls, and 586 unrelated controls were included. Figure 1 shows the single locus allele association analysis of the microsatellite markers and the SNPs. Two markers had a significant association with susceptibility to cervical neoplasia, namely G511525 (p = 0.0019) and MICA (p = 0.00098). The allele and genotype frequencies for G511525 (allele 221) and the MICA (allele 184) markers, the ORs and 95% CIs among cases and controls, and the different histological subgroups are shown in table 2. The genetic effect was strongest at G511525, showing a decrease in allele/carrier frequency of allele 221 from 6.7%/11.7% among controls to 3.9%/7.1% among all cases (OR for the 221 allele = 0.56 (95% CI, 0.36 to 0.89); OR for 221 carriers = 0.58 (0.34 to 0.99)). Subgroup analysis showed that this difference was most pronounced among patients with SCC (3.3%/5.6%: OR for the 221 allele = 0.47; OR for 221 carriers = 0.44). The corresponding frequencies among either CIN or AC patients did not differ significantly from controls.
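As a check on the figures above, the reported odds ratios for G511525 can be approximately recovered from the reported frequencies alone. The sketch below uses a hypothetical helper function; because the input frequencies are rounded to one decimal place, the results agree with the reported ORs only to within about 0.01.

```python
def odds_ratio_from_freqs(freq_cases, freq_controls):
    """Odds ratio computed from two frequencies (as proportions)."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls

# Frequencies of allele 221 at G511525 as reported in the text.
print(odds_ratio_from_freqs(0.039, 0.067))   # allele, all cases v controls (reported OR 0.56)
print(odds_ratio_from_freqs(0.071, 0.117))   # carriers, all cases v controls (reported OR 0.58)
print(odds_ratio_from_freqs(0.033, 0.067))   # allele, SCC v controls (reported OR 0.47)
print(odds_ratio_from_freqs(0.056, 0.117))   # carriers, SCC v controls (reported OR 0.44)
```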

Table 2

 Odds ratios and (corrected) 95% confidence intervals for the association of the allele/genotype of the markers with cervical neoplasia risk

Figure 1

 Association analysis.

Allele 184 at MICA, and in particular homozygotes, gave a higher risk of cervical carcinoma (OR for the 184 allele = 1.25 (95% CI, 1.00 to 1.56); OR for 184 homozygotes = 1.46 (1.01 to 2.12)). Again, this effect was strongest among SCC cases (OR for the 184 allele = 1.31 (1.03 to 1.67); OR for 184 homozygotes = 1.51 (1.02 to 2.24)).

HSS on a subset of the analysed markers (see Methods) did not show a difference in mean haplotype sharing between patients and controls (fig 2). Of note, because this method requires phased haplotypes, the set of patient haplotypes in this analysis was small (n = 190). Two five marker haplotype association analyses were carried out: one for TAP1-D6S2444-G511525-D6S2665-D6S2670 and one for TNFa-D6S2672-MICA-D6S2673-D6S2694. With this haplotype method, no specific haplotypes were found to be associated with cervical cancer risk; in particular, there were none with the associated alleles 221 at G511525 and 184 at MICA (data not shown).

Figure 2

 Haplotype sharing statistics.

Discussion

To our knowledge, this is the first study to investigate the entire HLA region in a large sample of patients with cervical neoplasia. The aim of our study was to find out whether the HLA-DQ or HLA-DR genes or, because of strong LD in the HLA region, other genes in the vicinity were responsible for the association with the disease. Our study shows that a decreased risk of developing SCC is associated with marker G511525. G511525 is located in the region containing the HLA-DQ and HLA-DR genes, which have previously been implicated in susceptibility to CIN and cervical cancer.5,6,19–51 As G511525 was the most strongly associated with SCC, our study establishes the role of the HLA-DQ and HLA-DR genes in SCC development. Our results are also consistent with the finding of Engelmark et al and indicate that associations can be found with non-HLA markers that are as strong as those found with HLA alleles.9

We did not find an association between HLA and CIN or adenocarcinoma. Lack of association with the latter might reflect the relatively small sample size of this subgroup. However, results from a recent study also suggested that genetic susceptibility appears to play a role particularly in invasive squamous cell carcinoma of the cervix, while such an association is not as clearly evident for adenocarcinoma.52 This supports a previous assertion that these two histological forms of cervical cancer, while sharing HPV infection as a common aetiological agent, have different aetiological co-factors, such as different prevalent subtypes of HPV.53

To further investigate the role of the HLA-DQ and HLA-DR genes in the development of cervical cancer, a denser set of SNPs, ideally all SNPs located in the region, should be analysed, because there is still no statistical evidence that the associations are really caused by HLA variations. Only large scale association studies with large numbers of samples and a high density of markers can pinpoint the most commonly associated genetic variation. Functional arguments for the causal role of an SNP need to be complemented with statistical evidence that the SNPs indicated have the strongest association with cervical cancer.

We also found an association for the MICA marker (allele 184) with squamous cell carcinoma. The association pointed in particular to patients who are homozygous for the 184 allele, which suggests a recessive effect. Although HSS did not reveal a significant association, the lowest p values obtained by this method were also observed around this marker. The absence of any significance probably reflects lack of power, as HSS requires phased haplotypes and our dataset contained only 118 patients with at least one relative who could be used for this analysis. Ghaderi et al studied this locus previously and found no association with either CIN or cervical cancer,54,55 but only 58 cervical cancer patients and 78 CIN patients were included in those studies. We attribute their negative result to lack of power. As our case–control sample size was much larger, a role for the MICA gene in the development of squamous cell carcinoma is likely. Our negative result for adenocarcinoma might be a result of the small sample size in this subgroup, or of the reasons mentioned above. Recently, a study into the role of the HLA region in breast cancer development showed an association of the HLA class III subregion with increased breast cancer risk.12 This suggests that there might be a role for MICA (MHC class I chain related gene A) in the development of cancer in general. MICA is expressed by keratinocytes and epithelial cells and interacts with γ-δ T cells. It is therefore possible that MICA might influence the pathogenesis of cancer through presentation of viral or tumour antigens. Nevertheless, another gene in the vicinity of MICA could be the one that is involved. Further research should be conducted to find out which gene in the HLA class III subregion is responsible for the observed increased cancer risk.

Despite the fact that the numbers in our study were relatively large compared with previous reports, our finding that the MICA marker was associated with an increased risk for squamous cell carcinoma must be confirmed in future studies before any clinical action—such as heightened surveillance for patients with a genotype associated with increased risk—is undertaken.


Our study further establishes an association of the region containing the HLA-DQ and HLA-DR genes with the risk of developing squamous cell carcinoma of the cervix, but not CIN or adenocarcinoma. Furthermore, an increased risk of developing squamous cell carcinoma was found for homozygotes of the 184 allele at the MICA locus, suggesting a recessive effect of the HLA class III subregion. Future goals are to type and analyse a denser set of SNPs in the HLA-DQ and HLA-DR region and in the region surrounding MICA to find the causal genes.


We thank all patients and their family members who donated their DNA to this project. This work was supported by grant RUG-99-1878 of the Dutch Cancer Society.



  • Competing interests: none declared
