J Med Genet 39:225-242 doi:10.1136/jmg.39.4.225
  • Review article

Genes other than BRCA1 and BRCA2 involved in breast cancer susceptibility

  1. M M de Jong1,2,3,
  2. I M Nolte2,
  3. G J te Meerman2,
  4. W T A van der Graaf1,
  5. J C Oosterwijk2,
  6. J H Kleibeuker3,
  7. M Schaapveld4,
  8. E G E de Vries1
  1. Department of Medical Oncology, University Hospital, Groningen, The Netherlands
  2. Department of Medical Genetics, University Hospital, Groningen, The Netherlands
  3. Department of Gastroenterology, University Hospital, Groningen, The Netherlands
  4. Comprehensive Cancer Centre Northern Netherlands, Groningen, The Netherlands
  Correspondence to: Dr E G E de Vries, Department of Medical Oncology, University Hospital Groningen, Hanzeplein 1, PO Box 30.001 RB, Groningen, The Netherlands; E.G.E.de.Vries{at}int.azg.nl

    Abstract

    This review focuses on genes other than the high penetrance genes BRCA1 and BRCA2 that are involved in breast cancer susceptibility. The goal of this review is the discovery of polymorphisms that are either associated with breast cancer or that are in strong linkage disequilibrium with breast cancer causing variants. An association with breast cancer at a 5% significance level was found for 13 polymorphisms in 10 genes described in more than one breast cancer study. Our data will help focus on the further analysis of genetic polymorphisms in populations of appropriate size, and especially on the combinations of such polymorphisms. This will facilitate determination of population attributable risks, understanding of gene-gene interactions, and improving estimates of genetic cancer risks.

    Breast cancer is the most common cancer in women in the western world.1 In breast cancer development, genetic and environmental factors play a role with family history being the most important factor for determining breast cancer risk. This risk is a function of the number of relatives affected with breast and ovarian cancer, the degree of relationship to these relatives, and their age at diagnosis of the disease.2,3

    Hereditary breast cancer accounts for 5-9% of all breast cancers.4 It was estimated that the combination of BRCA1 and BRCA2 gene mutations was responsible for approximately 80% of the families with hereditary breast cancer.5,6 These estimates, however, may be too high owing to the way patients are selected, namely on the basis of a pronounced family history of the disease. More recent estimates put this proportion at about 30%.7 Mutations of the BRCA1 and BRCA2 genes do not explain the occurrence of breast cancer in every breast cancer prone family.8 At least one other major breast cancer susceptibility gene is proposed to exist.9,10 In addition, a number of rare genetic syndromes are associated with high breast cancer risk. Together, these rare syndromes account for less than 1% of all hereditary breast cancers.11

    Apart from these well defined, high penetrance genes, there may be other genes that also increase the susceptibility to breast cancer. Candidates are proto-oncogenes and genes involved in metabolic, oestrogen, and immunomodulatory pathways.12 Each of these genes probably confers only a small (odds ratio (OR) 1-1.5) to moderate (OR 1.5-2) increase in the lifetime breast cancer risk. Because mutations in these so-called low penetrance genes are expected to be present in a large number of people, the population attributable risk (PAR) for breast cancer explained by these genes (in combination with environmental exposures) may be substantial13 and (potentially) considerably higher than the PAR caused by rare mutations of high penetrance genes such as BRCA1 and BRCA2.14,15 The published polymorphisms (a polymorphism being a locus where two or more alleles are each present at a frequency ≥1% in the population) were, in general, studied because of their biological plausibility.16–21 Some polymorphisms may partly account for the difference in the sensitivity of women to environmental factors, such as the use of replacement oestrogens.20 In subjects carrying low penetrance gene mutations, environmental factors might especially affect the risk of developing breast cancer. One subject may be 10-200 times more sensitive than another22 and may therefore develop cancer, while others at the same level of exposure will not. With the identification of important low penetrance gene mutations along with their interaction with environmental factors, specific prevention may become possible. Research on low penetrance genes involved in breast cancer is still in its infancy. Whole genome screens for determining low penetrance genes are not yet financially feasible.
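The contrast between common low penetrance variants and rare high penetrance mutations can be illustrated with Levin's formula for the population attributable risk. This is a minimal sketch, not necessarily the calculation used for this review: the function name and the example frequencies and risks are hypothetical, and the OR is treated as an approximation of the relative risk.

```python
def population_attributable_risk(exposure_freq, relative_risk):
    """Levin's formula: PAR = p(RR - 1) / (1 + p(RR - 1)).

    exposure_freq: fraction of the population carrying the risk genotype (p).
    relative_risk: relative risk in carriers (an OR is used as an approximation).
    """
    excess = exposure_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs chosen only for illustration (not taken from the review).
common_low = population_attributable_risk(0.30, 1.5)   # common variant, small risk
rare_high = population_attributable_risk(0.002, 10.0)  # rare variant, high risk

print(f"common variant, OR 1.5: PAR = {common_low:.1%}")
print(f"rare variant, RR 10:    PAR = {rare_high:.1%}")
```

With these assumed inputs the common variant accounts for a larger share of cases than the rare one, even though its individual risk is much lower.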

    This review focuses on genes other than BRCA1 and BRCA2 that may be involved in breast cancer susceptibility. Although mutations or polymorphisms in many of the genes described can also play a role in other types of common cancer, such as colorectal, ovarian, or prostate cancer, this review addresses only breast cancer. The aim of the pooled analysis was to find polymorphisms that may either have a causative relation to breast cancer or that are in strong linkage disequilibrium (LD) with breast cancer causing variants (that is, situations where certain haplotype combinations of alleles at different loci occur more frequently than would be expected from random association). We studied only the relation of the polymorphisms to breast cancer risk, since we assumed that environmental factors will play an equal role in all studies. The external variables taken into account were, where possible, menopausal state and ethnicity. Pooled analyses were performed on all polymorphisms. In addition, the sample size required to detect an association with breast cancer susceptibility with sufficient power was addressed.
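The idea of alleles occurring together more often than expected from random association can be made concrete with the standard pairwise LD measures D, D′, and r². The sketch below is illustrative only: the function name and the example frequencies are hypothetical and are not data from the studies reviewed.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    # D is the excess of the observed haplotype frequency over the
    # frequency expected under random association (p_a * p_b).
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical example: two variant alleles, each at 20% frequency,
# found together on 18% of chromosomes (4% expected by chance).
d, d_prime, r2 = linkage_disequilibrium(0.18, 0.20, 0.20)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

A marker in strong LD with a causal variant (D′ and r² close to 1) can show an association with disease without being causal itself, which is why the review considers both possibilities.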

    METHODS

    Search of published studies

    Published studies were identified using the PubMed databases from 1980 to 2000, using the search terms “breast”, “cancer”, “risk”, and “polymorphism(s)”. For each specific candidate gene, a separate search was performed. For example, the terms “HRAS1”, “breast”, “cancer”, and “risk” were used for HRAS1. In addition, the references of studies identified by the electronic searches were also evaluated. Studies eligible for our pooled analysis were those that compared genotype or allele frequencies of candidate genes in breast cancer cases with non-breast cancer controls using genomic DNA. The polymorphisms reported in breast cancer patients and controls in more than one study are described separately. When more than one polymorphism in one gene (for example, CYP1A1) or polymorphisms in different genes in the same region (for example, the HLA region) were examined in only one single study, they are also included. The different genes along with their localisation and presumed function are presented in table 1.

    Table 1

    Breast cancer susceptibility genes with their localisation and presumed function

    Layout per gene

    For the known genetic syndromes associated with increased breast cancer susceptibility, the possible germline mutations in familial and sporadic breast cancer patients were addressed, followed by somatic mutations, loss of heterozygosity (LOH, the loss of one of the two alleles at a given locus in a tumour), and hypermethylation. According to Knudson's “two hit theory”,23 the two hits required for tumour development are generally thought of as an intragenic mutation (germline or somatic) and LOH. Recently, it has been shown that hypermethylation of the promoter region of a gene can also be one of the hits required in cancer development24,25 because this can silence the gene.26

    Pooled analysis

    For a better insight into the possible effects of the various genes on breast cancer susceptibility, a pooled analysis for each polymorphism was performed. The results of this analysis are shown in table 2. The raw numbers of cases and controls from comparable studies were analysed together. The genotype specific ORs and the 95% confidence intervals (CIs) were calculated for all studies combined, without adjustment for external variables. This can result in values that differ from those in the original article. Whenever possible, a distinction was made between women heterozygous and homozygous for the variant allele. Where metabolic polymorphisms are assumed to be associated with a specific phenotype, a distinction was made between phenotype and genotype based studies (for example, CYP2D6, NAT2). For the genotype studies, the genotypes were combined according to phenotypic classes. Also, where possible, separate analyses were performed for the three major ethnic subgroups, white, African-American, and Asian.
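The pooling step described above, in which raw counts from comparable studies are added together and a crude OR is computed, can be sketched as follows. The text does not state how the confidence interval was derived, so Woolf's logit method is assumed here; the function name and all counts are hypothetical.

```python
import math


def pooled_odds_ratio(studies, z=1.96):
    """Crude pooled OR with a Woolf (logit) confidence interval.

    studies: iterable of tuples
        (cases_with, cases_without, controls_with, controls_without)
    where "with" means carrying the risk genotype. Counts are simply
    summed across studies, without adjustment for external variables.
    """
    a = sum(s[0] for s in studies)  # cases with risk genotype
    b = sum(s[1] for s in studies)  # cases without risk genotype
    c = sum(s[2] for s in studies)  # controls with risk genotype
    d = sum(s[3] for s in studies)  # controls without risk genotype
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for two studies (not taken from table 2).
example_studies = [(30, 170, 20, 180), (45, 255, 40, 360)]
pooled_or, lower, upper = pooled_odds_ratio(example_studies)
print(f"pooled OR = {pooled_or:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Summing counts ignores differences between studies in design and population, which is one reason the pooled values can differ from those in the original articles.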

    Table 2

    Genetic polymorphisms and their allele frequencies, total number of cases and controls published, risk genotypes with their ORs and 95% confidence intervals, PAR, and sample size required to detect an association

    The ORs and the 95% CIs of the studies used in the pooled analysis of polymorphisms in three genes are shown in figs 1 (the HRAS1 polymorphism), 2 (the PROGINS polymorphism of the PR gene), and 3 (the polymorphisms in the vitamin D receptor (VDR) gene).

    Figure 1

    The HRAS1 polymorphism and breast cancer risk. The results of 12 studies (OR and 95% CI) are depicted as well as the result of the pooled analysis comprising 2029 cases and 3252 controls.

    Figure 2

    The PR gene and breast cancer risk. The results of four studies (OR and 95% CI) are depicted and the results for heterozygosity or homozygosity of the variant allele are given separately and in a pooled analysis, comprising 1106 cases and 965 controls. a W: wild type. b V: variant allele.

    Figure 3

    The vitamin D receptor gene and breast cancer risk. The results of five studies (OR and 95% CI) are depicted in the graph and the data for five polymorphisms are given separately, with the distinction between heterozygosity and homozygosity for the variant allele. For BsmI the total is 231 cases and 467 controls, for FokI 278 cases and 410 controls, and for TaqI 1197 cases and 867 controls. a W: wild type. b V: variant allele.

    The pooled analysis also shows the PAR and sample size required to detect the association with breast cancer for each polymorphism, with a power of 90% and a significance level of 0.0026 corrected for multiple testing (also shown in table 2). A description of the sample size calculation is given in Appendix 1.
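To give a feel for how the required sample size depends on the genotype frequency and the OR at 90% power and a significance level of 0.0026, the sketch below uses a common normal approximation for comparing two proportions. It is not the method of Appendix 1, which is not reproduced here: it assumes a two sided test and equal numbers of cases and controls, and the function names and example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def exposure_in_cases(p0, odds_ratio):
    """Risk genotype frequency in cases implied by the control frequency and the OR."""
    return odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))


def cases_needed(p0, odds_ratio, alpha=0.0026, power=0.90):
    """Approximate number of cases (with an equal number of controls).

    Assumes a two sided test of two proportions; odds_ratio must differ from 1.
    """
    p1 = exposure_in_cases(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


# Hypothetical risk genotype frequency of 20% in controls.
for odds_ratio in (1.25, 1.5, 2.0):
    print(odds_ratio, cases_needed(0.20, odds_ratio))
```

Under this approximation, the number of subjects rises steeply as the OR approaches 1, which is consistent with the emphasis on populations of appropriate size for low penetrance genes.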

    RESULTS

    The unknown “BRCA3” gene

    Based on several studies, a region on chromosome 8p11-21 is considered to be involved in hereditary and sporadic breast cancer. Linkage analysis in eight French breast cancer families showed a multipoint lod of 2.51 (a lod score >3.0 is the accepted statistical significance level for linkage of a genetic locus with a disease) with two markers on chromosome 8p.27 Linkage analysis in two large German breast cancer families, with negative lod scores for the BRCA1 and BRCA2 locus, showed a multipoint lod score of 3.30 at two other markers, localised between the two markers in the French study, on chromosome 8p.28 In studies focusing on chromosome 8p, LOH was observed in 46-74% of unselected human breast tumours,27,29–36 in 86% of the familial tumours,32 in 78% of tumours in women with a specific BRCA2 mutation (that is, the 999del5 mutation),36 and in 83% of male breast tumours.37 In ductal breast carcinoma in situ, LOH on 8p was observed in 0-37% of cases,30,38,39 suggesting LOH on 8p is associated with invasive behaviour of the tumour. Based on these LOH studies, there are at least two different regions of minimal overlap of LOH on chromosome 8p. The region on 8p discussed earlier27,28 is localised in one of these regions. Mapped genes in this region include hEXT1L,34 WRN,40 and LHRH.41,42 No somatic or germline mutations have, as yet, been detected in these genes in breast cancer cases.
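The lod score threshold mentioned above can be illustrated with the simplest case, a two point lod score for phase known meioses. This is only a sketch: the multipoint lod scores reported in the cited studies come from a more elaborate likelihood calculation, and the function name and example counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two point lod score for phase known meioses.

    lod = log10[ L(theta) / L(0.5) ], where
    L(theta) = theta^recombinants * (1 - theta)^(meioses - recombinants)
    and theta = 0.5 corresponds to no linkage.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        return -math.inf  # any recombinant rules out complete linkage
    log_likelihood = 0.0
    if recombinants:
        log_likelihood += recombinants * math.log10(theta)
    if non_recombinants:
        log_likelihood += non_recombinants * math.log10(1 - theta)
    return log_likelihood - meioses * math.log10(0.5)


# Hypothetical family data: 10 informative meioses.
print(lod_score(0, 10, 0.0))  # no recombinants, complete linkage
print(lod_score(1, 10, 0.1))  # one recombinant, theta = 0.1
```

In this simplified setting, ten informative meioses without a recombinant are just enough to pass a lod score of 3.0, whereas a single recombinant among the ten leaves the evidence well below that threshold.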

    Rare genetic syndromes with increased breast cancer risk

    The Tp53 gene and Li-Fraumeni syndrome

    Inactivating mutations in the Tp53 gene have been found in many tumour types43,44 including breast cancer.45 Li-Fraumeni syndrome is an autosomal dominant disorder, caused by germline mutations in the Tp53 gene. This syndrome is characterised by an increased risk of soft tissue and osteosarcomas, leukaemias, brain tumours, adrenocortical carcinomas, and breast cancers.46 The risk of developing breast cancer before the age of 45 is 18-fold higher for affected females as compared to the general population.46 The excess is greatest below the age of 20 and declines with increasing age (relative risk (RR) for breast cancer after the age of 45 = 1.8).46 Germline mutations in the Tp53 gene have been estimated to account for less than 1% of breast cancer cases.47–51 However, somatic mutations in the Tp53 gene are reported in 19-57% of human breast cancers52–57 and LOH is found in 30-42%.52,54 There is no association between somatic Tp53 mutations and LOH at the Tp53 locus,52–55 suggesting that one inactivated allele may be sufficient for breast cancer development.54 Hypermethylation of the promoter region of the Tp53 gene does not play a role in breast cancers.58

    Three different Tp53 polymorphisms (in intron 3, exon 4, and intron 6) have been studied in breast cancer patients. All three polymorphisms exhibit strong linkage disequilibrium with each other.59 None of the five studies examining the intron 3 polymorphism found an association with increased breast cancer risk.60–64 Surprisingly, the breast cancer risk for homozygous carriers of the variant allele was decreased. When all studies were combined, the OR for heterozygous carriers of the variant allele was 0.97 (95% CI 0.79-1.18) and was 0.46 (95% CI 0.25-0.84) for homozygous carriers. The exon 4 (refs 61–65) and the intron 6 (refs 61–64, 66, 67) polymorphisms showed similar results. Thus, a decreased breast cancer risk for homozygous variant allele carriers was found for all three polymorphisms in the Tp53 gene. These homozygous variant allele carriers comprise 3% (intron 3), 13% (exon 4), and 3% (intron 6) of all the women and, thus, up to 13% of women have decreased breast cancer risks.

    Four studies61–64 examined all three polymorphisms in the Tp53 gene in relation to breast cancer risk. In one of the two studies examining haplotypes, an association was observed between the haplotype composed of the three variant alleles and the risk of breast cancer among white populations (OR=2.18, 95% CI 1.17-4.07).62 This association was not found in Hispanic (OR=0.24, 95% CI 0.05-1.11) or African-American patients (OR=1.13, 95% CI 0.46-2.81)62 nor in patients from Pakistan (OR=0.77, 95% CI 0.38-1.56).64 The two other studies61,63 did not construct haplotypes, but compared genotype combinations. The first study found a marginally significant association between breast cancer and the genotype combination that is heterozygous for all three polymorphisms (OR=1.68, 95% CI 0.99-2.86).63 This genotype combination did not exclude the haplotype composed of three variant alleles from being at risk. The second study found associations between breast cancer and two genotype combinations.61 With the first genotype combination (OR=2.94, 95% CI 1.37-6.27), heterozygous for the intron 3 and 6 polymorphisms and homozygous for the exon 4 variant allele, the haplotype of three variant alleles is still supported. With the second genotype combination (OR=1.61, 95% CI 1.13-2.30), homozygous for the intron 3 and 6 wild type allele and heterozygous for the exon 4 polymorphism, the variant allele haplotype is not possible. The fact that the four studies used different methods to examine the polymorphisms hampers the comparison of results. However, the analysis showed that the haplotype composed of the three variant alleles is associated with an increased breast cancer risk, particularly in white breast cancer patients.
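The reasoning about which genotype combinations remain compatible with the haplotype of three variant alleles can be written as a small check. This is an illustrative sketch; the function names and the encoding (number of variant alleles per locus, ordered intron 3, exon 4, intron 6) are assumptions made for the example.

```python
def can_carry_all_variant_haplotype(variant_counts):
    """True if at least one chromosome could carry the variant allele at every locus.

    variant_counts: number of variant alleles (0, 1, or 2) at each locus.
    A single locus that is homozygous wild type rules the haplotype out.
    """
    return all(count >= 1 for count in variant_counts)


def must_carry_all_variant_haplotype(variant_counts):
    """True if the genotype combination proves the haplotype is present.

    Phase is unambiguous only when at most one locus is heterozygous.
    """
    heterozygous_loci = sum(1 for count in variant_counts if count == 1)
    return can_carry_all_variant_haplotype(variant_counts) and heterozygous_loci <= 1


# Genotype combinations discussed in the text (intron 3, exon 4, intron 6).
triple_heterozygote = (1, 1, 1)
het_het_with_exon4_homozygous = (1, 2, 1)
wild_type_introns_exon4_het = (0, 1, 0)

for genotype in (triple_heterozygote, het_het_with_exon4_homozygous,
                 wild_type_introns_exon4_het):
    print(genotype,
          can_carry_all_variant_haplotype(genotype),
          must_carry_all_variant_haplotype(genotype))
```

The first two combinations are compatible with the variant haplotype but do not prove its presence, because phase is unknown; the third excludes it, matching the distinction drawn in the paragraph above.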

    The ATM gene and ataxia telangiectasia

    Most ataxia telangiectasia (A-T) patients do not survive to an age at which breast cancer generally occurs.68 A-T carriers (heterozygous for ATM mutations) are sensitive to late onset apoptosis after x ray irradiation owing to accumulation of cell cycle checkpoint abnormalities.69 In several studies, A-T carriers appear to have an increased breast cancer risk (OR 3.3-8 (refs 70–76) and PAR 3.8-8.5% (refs 68, 73–75)). However, in all studies the OR was determined with the observed/expected method and all groups were small. One study found no increased breast cancer risk among A-T carriers.77 The risk of A-T carriers to develop breast cancer is estimated to be 11% by the age of 50 and 30% by the age of 70.78

    Germline mutations in the ATM gene are rare in breast cancer families without features of A-T.79,80 In sporadic breast cancers, germline and somatic mutations in the ATM gene are also rare,81,82 even in young patients83,84 and patients with bilateral breast cancer.85 In 88 breast cancer patients with a family history of breast cancer and leukaemia or lymphoma, three germline mutations in the ATM gene have been found.79 Chen et al80 examined these three mutations and none appeared to be causal. In 82 Dutch breast cancer patients (diagnosed before the age of 45, >5 years survival) including 33 bilateral cases, seven germline mutations were found, one out of frame splice site mutation (detected three times), three truncating mutations, and one in frame deletion.86 It was hypothesised that the existence of two distinct classes of A-T mutations (truncating and missense) might explain some of the seemingly contradictory data on cancer risk associated with the ATM gene.87,88 The truncating mutations act as null mutations because they produce low cellular levels of an unstable ATM protein. Because truncating mutation carriers have 50% of wild type ATM activity, they will have an almost normal phenotype. Some missense mutations encode stable, but functionally abnormal proteins that are present at normal intracellular levels. These proteins could compete in complex formation with the normal ATM protein, resulting in a dominant negative cellular phenotype. The functional loss in ATM missense mutation carriers might be more severe than in ATM truncating mutation carriers and, thus, only ATM missense mutations might be associated with an increased cancer risk.88 In most studies, A-T carrier detection in breast cancer cases was based on the protein truncation test and could only detect truncating mutations. 
    Support for the existence of two functionally distinct classes of mutations can be derived from a study describing an increased breast cancer risk in two A-T families with a specific ATM missense mutation (T7271G) in both homozygotes and heterozygotes, with an OR, based on age specific incidence rates, of 12.7 (95% CI=3.53-45.9).89 This mutation results in an aberrant full length ATM protein present at a level comparable with that in unaffected subjects.89 Another explanation for the seemingly contradictory data on breast cancer risk is that the carrier frequency of A-T mutations could be much lower than the described 1% of the general population, resulting in a low PAR. If this is the case, the OR for breast cancer in A-T carriers can be high (as found in the A-T families), while mutations in the ATM gene are rarely detected in sporadic breast cancer patients.

    A recent study of 138 Austrian hereditary breast and ovarian cancer (HBOC) patients without BRCA1 and BRCA2 mutations90 showed functionally significant ATM germline mutations in at least 8.7% of the HBOC patients. The penetrance for one of the mutations (L1420F) was estimated to be 85% at age 60.

    In conclusion, although the exact association remains unclear, a role for the ATM gene in breast cancer susceptibility is plausible.

    The PTEN gene and Cowden syndrome

    Cowden syndrome is an autosomal dominant disorder, characterised by the development of hamartomas and benign tumours. Mutations in the PTEN gene are present in 80% of Cowden syndrome families.91,92 Truncating PTEN mutations in Cowden syndrome families are associated with cancer93 and cause a 25-50% lifetime breast cancer risk in women.92,94,95 No mutations in the PTEN gene have been detected in breast cancer families and families with breast and brain cancer without features of Cowden syndrome.96–99 In sporadic breast cancer patients, germline and somatic mutations in the PTEN gene are rare100–104 even in young patients.96,105 LOH at the PTEN locus is found in 11-41% of sporadic breast cancers,102–104,106 but no somatic mutations have been observed in the remaining allele.103,104,106 It is, however, still possible that an epigenetic phenomenon such as hypermethylation of the promoter region inactivates the remaining allele.107 In one study (in 177 breast cancer patients with a positive family history for breast cancer and without BRCA1 and BRCA2 mutations), an association was found between a polymorphism in intron 4 of the PTEN gene and a lower age of diagnosis of breast cancer (42.7 versus 48.1 years).98 No comparison, however, was made with healthy controls. In conclusion, the PTEN gene is not likely to play a role in classical hereditary breast cancer. In sporadic breast cancers, LOH at the PTEN locus is detected, but since no alterations have been found in the remaining allele, it is not currently known whether PTEN plays a role in sporadic breast cancer susceptibility.

    The LKB1 gene and Peutz-Jeghers syndrome

    Peutz-Jeghers syndrome is an autosomal dominant disorder characterised by hamartomatous polyps in the small bowel and pigmented macules of the buccal mucosa, lips, fingers, and toes.108 This syndrome is caused by truncating germline mutations in the LKB1 gene.109,110 Patients with Peutz-Jeghers syndrome have an increased breast cancer risk.108,111 No germline mutations were detected in 22 patients from 14 breast cancer families with LOH on chromosome 19p.112 In 62 primary breast cancers in women without Peutz-Jeghers syndrome, no somatic mutations were found in the LKB1 gene and LOH was observed in only 8%.113 In conclusion, the LKB1 gene seems to play a role in breast cancer susceptibility, but only in patients with Peutz-Jeghers syndrome.

    Low penetrance breast cancer susceptibility genes

    There are several classes of potential low penetrance breast cancer susceptibility genes, such as proto-oncogenes, metabolic pathway genes, oestrogen pathway genes, and immunomodulatory pathway genes.

    Proto-oncogenes

    Proto-oncogenes are involved in the regulation of normal cell growth and differentiation. Mutations in proto-oncogenes lead to disturbances in the cell cycle and can result in abnormal growth or proliferation.114 Well known proto-oncogenes are the RAS genes, the HER2 gene, and the myc genes.

    HRAS1

    The HRAS1 gene encompasses four exons flanked by a variable number tandem repeat (VNTR) region at the 3′ end.115,116 This minisatellite locus is composed of four common alleles (94% of the white population117) and dozens of variants, the so-called intermediate and rare alleles. Each variant allele is derived from the common allele nearest in size to it.118 The HRAS1 polymorphism was examined in 13 studies.119–131 ORs above 1 were detected in all studies (fig 1), five of which reached significance119,124,128,129,131 with ORs of 2-7. Combining the studies showed an association between rare HRAS1 alleles and breast cancer (OR=2.03, 95% CI 1.72-2.40), with a PAR of 14%. There are, however, several methodological problems in performing this pooled analysis, since the choice of the cut off point between rare, intermediate, and common alleles is difficult to make and the distribution of the alleles in subgroups of the population varies between studies.132 The choice of rare alleles (frequency <4%) is not the same for all studies and does not correspond to any previous biological interpretation.132 In most studies there are four common alleles and the rest of the alleles are listed as rare. With this as the criterion, our pooled analysis indicated that the rare HRAS1 alleles are associated with a moderately increased risk of breast cancer.

    L-myc

    Three studies focused on L-myc and breast cancer risk.133–135 In one study, no controls were examined.134 Another study found an association with breast cancer135 for women heterozygous (OR=2.25, 95% CI 1.12-4.51) and homozygous (OR=2.63, 95% CI 1.22-5.68) for the variant allele. When all three studies were combined for our pooled analysis, no association with breast cancer was found.

    Metabolic pathway genes

    Enzymes involved in metabolic pathways are of interest because of their possible role in (de)toxification of chemical compounds.136 A number of metabolic pathway genes, including the cytochrome p450 family, the GST family, and the NAT1 and NAT2 genes, are thought to have evolved as an adaptive response to environmental exposure to toxins, including some carcinogens. The prediction is therefore that any alteration in the activity of these enzymes would result in an altered susceptibility to potentially toxic (mutagenic) compounds. This may determine the rate at which somatic mutations occur in genes in response to environmental exposures, resulting in an altered cancer susceptibility. The cytochrome p450 family proteins are known as phase I enzymes. In general, these enzymes metabolically activate carcinogens.137 A genotype associated with an increased phase I enzyme activity might therefore increase breast cancer risk.138 The NAT and GST family proteins are known as phase II enzymes. These enzymes metabolically inactivate carcinogens. The substrates for phase II enzymes include carcinogenic compounds activated by the phase I enzymes. A genotype associated with decreased phase II activity might therefore increase breast cancer risk.138

    N-acetyl transferase (NAT)

    Both NAT1 and NAT2 are polymorphic with so-called fast and slow phenotypes. Slow acetylators produce proteins that are either poorly expressed, are unstable, or have partially reduced catalytic activities.139 In theory, having the slow acetylator phenotype could mean that aromatic amines are metabolised more slowly and that slow acetylators might, therefore, be at increased breast cancer risk.114

    NAT1

    The NAT1*10 allele is associated with the rapid acetylation phenotype; all other alleles represent slow alleles.139–141 This allele is present in 30% of populations of European ancestry.142 Two studies found no association with breast cancer risk either separately or combined.143,144 The NAT1 polymorphism does not appear to play a major role in breast cancer susceptibility, although a small increase in breast cancer risk cannot be excluded.

    NAT2

    The acetylation capacity in NAT2*4 homozygotes in vivo is higher than in NAT2*4 heterozygotes and all variant alleles have lower acetylation capacities.145 The population frequency of the fast acetylator genotype of the NAT2 gene is 22-78%.145–147 None of seven studies found an increased breast cancer risk for the slow acetylator NAT2 phenotype,148–154 while two studies found a decreased breast cancer risk.148,153 No association with breast cancer risk was found when all studies were combined. The results for the NAT2 genotype were similar. No effect on breast cancer risk was found when the studies were combined in our pooled analysis.143,155–159 In conclusion, the NAT2 polymorphism does not play a role in breast cancer susceptibility.

    Combination of NAT1 and NAT2

    One study examined both polymorphisms.143 No association between these polymorphisms and breast cancer was observed for either the NAT1 or NAT2 genes separately or combined.

    Glutathione S-transferase (GST) family

    Deletion variants that are associated with a lack of enzyme function exist at GSTM1 and GSTT1.160 Homozygotes for null deletions in the GSTM1 and/or GSTT1 genes may have an impaired ability to eliminate carcinogens metabolically and may therefore be at increased cancer risk.

    GSTM1

    GSTM1 is polymorphically expressed as GSTM1-0 or null (homozygous deletion) and GSTM1a and GSTM1b.160 Between 20 and 60% of the general population are homozygous null for the GSTM1 gene.161–164 The GSTM1 null variant has been well examined in breast cancer studies with varying results.165–178 Our pooled analysis showed an association between this polymorphism and breast cancer risk, although this increase is very small and only marginally significant (OR=1.13, 95% CI 1.00-1.26). The combined sample size is large enough to exclude a moderate increase in breast cancer risk for GSTM1 null homozygous carriers.

    GSTP1

    In 31% (24/77) of the breast cancer cases, hypermethylation of the GSTP1 promoter region was detected.179 A polymorphism, the isoleucine to valine substitution at codon 105, has been associated with reduced conjugating activity of the gene.180 This polymorphism has been examined in three studies.170,176,181 Our pooled analysis showed a moderately increased breast cancer risk for women homozygous for the Val allele (OR=1.86, 95% CI 1.05-3.3) with a PAR of 14%. Thus, the GSTP1 polymorphism appears to play a role in breast cancer susceptibility, although the total number of cases (n=301) and controls (n=397) was small.

    GSTT1

    The GSTT1 gene has two functionally different genotypes, GSTT1-0 or null (homozygous deletion) and GSTT1+ (one or two undeleted alleles).160 The GSTT1 null genotype has been linked to increased DNA damage from experimental carcinogens.182 In different populations, 9-64% are homozygous null for the GSTT1 gene.164,182–184 Six studies examining this polymorphism showed no association with breast cancer.169,170,172,175–177 The results were similar for premenopausal and postmenopausal women. Based on the combined sample size, a moderate increase in breast cancer risk can be excluded for women homozygous null for the GSTT1 polymorphism.

    Combination of GSTM1, GSTT1, and GSTP1 polymorphisms

    Six studies169,170,172,175–177 examined both GSTM1 and GSTT1 polymorphisms and breast cancer risk and four of them170,172,176,177 also analysed the GSTP1 polymorphism. In one study, no controls were examined and it was therefore not included in our pooled analysis.

    Of the five studies that examined the combination of two of the GST genes, one found an association between the two risk genotype and breast cancer for all three combinations of GST genes.170 Another detected an association between the two risk genotype for the GSTM1 and GSTT1 gene and breast cancer.177 Pooled analysis showed an increased breast cancer risk for the one and the two risk genotype of all three gene combinations, although only the two risk genotype of the GSTM1 and GSTP1 combination reached significance (OR=1.65, 95% CI 1.00-2.71).

    For the two studies where all three polymorphisms were examined, the results were similar.170,176 The study that observed associations with the two risk genotypes also showed an association between the three high risk alleles and breast cancer.170 When both studies were combined, increased breast cancer risks were observed for carriers of the one, two, or three risk genotype, although only the two risk genotype reached significance (OR=1.90, 95% CI 1.13-3.22).

    Cytochrome p450 family

    Certain substrates, including almost all carcinogens, are metabolically activated by cytochrome p450 metabolism, which results in the formation of mutagenic, chemically reactive electrophiles. Most prescribed drugs are substrates for one or more cytochrome p450 isoenzymes.17 Individual cytochrome p450 isoenzymes have a unique substrate specificity, although a certain overlap between the enzymes is observed.17

    CYP1A1

    This gene codes for aryl hydrocarbon hydroxylase (AHH).114 AHH is strongly inducible; differences in xenobiotic metabolic activity between subjects even within a family can be over 200-fold.185 Changes in AHH activity, resulting in different oestrogen levels, could affect breast cancer risk.16,114 Four polymorphisms have been described in the CYP1A1 gene, namely the m1 polymorphism, the m2 polymorphism (associated with increased enzyme activity in vitro, both for homozygotes and heterozygotes for the variant allele186,187), the m3 polymorphism (only present in African-Americans), and the m4 polymorphism. The m1 and m2 polymorphisms are in linkage disequilibrium,17 whereas the African-American specific polymorphism (m3) does not cosegregate.17 Five studies examined the m1 polymorphism with varying results.169,188–191 Combining all studies of white populations169,188,189,191 showed an association between heterozygous carriers of the variant allele and an increased breast cancer risk (OR=1.33, 95% CI 1.06-1.66). Two of the studies also examined African-Americans.169,191 No association with breast cancer was observed, but the total number of cases (n=84) and controls (n=177) was small and a (moderately) increased risk cannot be excluded. No overall association was detected in a Chinese study.190 An increased breast cancer risk was found for postmenopausal women homozygous for the m1 variant allele (OR=2.97, 95% CI 1.14-7.76), but the number of subjects was small (78 cases and 81 controls). When all studies, regardless of ethnicity, were combined, no association was found between breast cancer and the m1 polymorphism.

    No association with breast cancer risk was found in eight studies examining the m2 polymorphism16,167,169,189–192 regardless of whether they were analysed separately or were combined. However, by combining the two studies which analysed postmenopausal women only,167,190 an association with breast cancer was found for both heterozygous carriers of the m2 variant allele (OR=1.59, 95% CI 1.07-2.37) and women homozygous for the variant allele, although the latter association is not significant (OR=2.53, 95% CI 0.92-6.96). This is probably because of lack of power. The African-American specific m3 variant allele does not seem to play a role in breast cancer susceptibility.169,191 No association with breast cancer was observed in the one study169 that examined the m4 polymorphism.
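
    The significance statements attached to the odds ratios above can be checked from the reported intervals alone. The following is a minimal sketch, assuming that each 95% CI was constructed symmetrically on the log odds ratio scale (a Wald type interval, which the cited studies do not necessarily state). The function name `p_from_ci` is hypothetical and serves only for illustration.

```python
import math

def p_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided p value from an OR and its 95% CI.

    Assumption: the interval equals exp(log(OR) +/- z_crit * SE),
    i.e. it is symmetric on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# m2 polymorphism in postmenopausal women (values quoted in the text)
p_het = p_from_ci(1.59, 1.07, 2.37)  # heterozygous carriers
p_hom = p_from_ci(2.53, 0.92, 6.96)  # homozygous carriers
print(f"heterozygous: p = {p_het:.3f}; homozygous: p = {p_hom:.3f}")
```

    Under this assumption the heterozygote estimate falls below the 5% level and the homozygote estimate does not, in line with the text; the wide homozygote interval reflects the small number of homozygous women.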

    In conclusion, a small increased breast cancer risk was found in the white population for the m1 polymorphism and a moderately increased breast cancer risk in postmenopausal women was detected for the m2 polymorphism. Owing to lack of power, a moderate increase in breast cancer risk for variant allele carriers cannot be excluded for any of the four polymorphisms. Additional data are required to define the precise association between this gene and breast cancer, particularly in the white population and in postmenopausal women.

    CYP1B1

    The CYP1B1 enzyme exceeds other p450 enzymes in both oestrogen hydroxylation activity and expression in breast tissue.193 Four polymorphisms have been described in this gene and all variants have higher hydroxylation activity.193 These variant alleles may be associated with changes in oestrogen metabolism and therefore breast cancer risk. Three of these polymorphisms were examined in breast cancer patients and controls. The codon 432 polymorphism was examined in three studies.194–196 Variant allele frequencies differed widely between populations, ranging from 0.15 to 0.68, with large differences even between two Asian studies.195,196 When all studies were combined, our pooled analysis found no association between the codon 432 polymorphism and breast cancer. The study examining the codon 119 polymorphism detected an association with an increased breast cancer risk in women heterozygous for the variant allele (OR=1.62, 95% CI 1.15-2.29).195 However, in women homozygous for the variant allele, a non-significant decrease in risk was found (OR=0.6, 95% CI 0.11-3.31). No association with breast cancer was observed for the codon 453 polymorphism.194

    One study examined both the codon 119 and codon 432 polymorphisms.195 The two polymorphisms were genetically independent and no association with an increased breast cancer risk was found for any combination of them.

    CYP2D6

    The CYP2D6 variant allele is the result of a deletion of a 17.5 kb region including the entire CYP2D6 gene.197 In white populations, 5% are homozygous for this polymorphism.17,198 These poor metabolisers are unable to metabolise agents such as debrisoquine and codeine.17 Seven studies, with varying results, examined this polymorphism, three phenotypically136,199,200 and four genotypically.197,201–203 When the phenotype studies were combined, a moderately increased breast cancer risk was found for poor metabolisers (OR=2.22, 95% CI 1.39-3.55) with a PAR of 8%. When the genotype studies were combined, an association was detected for carriers (homozygous and heterozygous combined) of the variant allele (OR=1.49, 95% CI 1.26-1.77), with a PAR of 6%. In conclusion, this polymorphism may play a role in increased breast cancer susceptibility.
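
    How a population attributable risk (PAR) follows from an odds ratio and a genotype frequency can be illustrated with Levin's formula. This is a sketch under stated assumptions: that Levin's formula is an acceptable stand-in for the PAR calculation used in this review (the method is not given in this section), that the OR approximates the relative risk, and that the 5% frequency of poor metabolisers quoted above applies. The function name `levin_par` is hypothetical.

```python
def levin_par(prevalence, relative_risk):
    """Levin's population attributable risk.

    prevalence    -- proportion of the population carrying the risk genotype
    relative_risk -- risk in carriers relative to non-carriers
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)

# Illustrative input: 5% poor metabolisers (population figure quoted above),
# with the pooled OR of 2.22 used in place of the relative risk.
par = levin_par(0.05, 2.22)
print(f"PAR = {par:.1%}")
```

    The result need not match the PAR reported above, which depends on the genotype frequencies observed in the pooled studies rather than on a single population figure.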

    Oestrogen pathway genes

    Experimental, clinical, and epidemiological studies show that oestrogen and progesterone play a major role in growth and differentiation of normal breast tissue.114,204 A prolonged or increased exposure to oestrogen is associated with increased breast cancer risk.205,206 Endogenous and exogenous hormones stimulate cell proliferation, and thus enhance the chance of accumulating random genetic errors. The most widely accepted risk factors for breast cancer such as age at menarche, age at first pregnancy, number of pregnancies, breast feeding, age at menopause, and obesity,1,12,207 can be considered measures of the cumulative dose of oestrogen that breast epithelium is exposed to over time.205,208 Several oestrogen metabolites can directly or indirectly cause oxidative DNA damage.209,210 In conclusion, genes involved in the metabolism of sex hormones (that is, oestrogens) are interesting candidates for breast cancer susceptibility genes.207,211

    Cytochrome p450 family
    CYP17

    A polymorphism in the CYP17 gene was detected in the 5′ untranslated region. The variant allele of this polymorphism has an additional Sp1 type promoter site. Since it is thought that the number of 5′ promoter elements correlates with promoter activity,114 women with this allele might have higher oestradiol levels.212 An association between the presence of at least one variant allele and increased serum oestrogen and progesterone levels at day 11 and day 22 of the menstrual cycle was found in young, nulliparous women.213 One male breast cancer study (64 cases and 58 controls) observed an increased risk for variant allele carriers (OR=2.10, 95% CI 1.04-4.27).214 Ten studies on female breast cancer examined this polymorphism.18,214–221 One found an association (OR=1.99, 95% CI 1.15-3.45) between variant allele carriers (homozygous and heterozygous) and breast cancer in young women (<37 years of age),222 but the numbers of cases (n=109) and controls (n=117) were small. Six other studies in premenopausal women214,216,218,219,221,223 showed no association between this polymorphism and breast cancer. When all studies were combined in our pooled analysis, no association with breast cancer was found. In conclusion, based on the combined sample size, even a small increase in breast cancer risk overall can be excluded. However, because the studies did not further discriminate for age, an increased risk for breast cancer in young women carrying the variant allele cannot be excluded.

    CYP19

    Several polymorphisms have been described in the CYP19 gene. A tetranucleotide repeat polymorphism, (TTTA)n, is located in intron 4, about 80 nucleotides downstream from exon 4.224 Our pooled analysis of the five studies examining the (TTTA)10 allele225–229 showed an OR of 1.59 (95% CI 1.01-2.48), with a PAR of 1%. There is, however, a problem in performing a pooled analysis on this polymorphism, because the studies used different methods to detect the alleles. Two studies found eight different alleles while the three others found seven, six, and five respectively. Two other polymorphisms, in introns 4 and 6, described in one study were in strong linkage disequilibrium with the tetranucleotide polymorphism.229 No association with breast cancer was detected for either polymorphism. Another polymorphism (codon 264) also showed no association with breast cancer.230 In conclusion, this gene might play a (minor) role in breast cancer susceptibility.

    Oestrogen receptor (ER) gene

    ER is a critical determinant of cellular responsiveness to oestrogen and is thought to play an important role in breast cancer promotion.114 Germline and somatic mutations in the ER gene in breast cancer cases are rare.231 Five somatic mutations in the ER gene have been found in only four out of 300 human breast tumours.232–234 Methylation of the promoter region of the ER gene was detected in 25% of ER negative breast cancers, while no methylation was found in ER positive tumours and normal breast specimens.235 Several polymorphisms in the ER gene have been described. The PvuII polymorphism in intron 1 was examined in three studies,233,236,237 one of which did not use controls.233 The two other studies found no association with breast cancer, although one of them reported genotypes for cases only.237 When the three studies were combined, with control genotypes from one study,236 no association was found. The total number of controls (n=53), however, is very small. The XbaI polymorphism237 showed a decreased breast cancer risk (OR=0.50, 95% CI 0.25-0.99) for homozygous carriers of the variant allele (10.5 kb allele). In this study, the numbers of both cases (n=191) and controls (n=204) were small. A third polymorphism, in codon 325, is located in the hormone binding domain and might therefore be correlated with ER function.234 This was examined in three studies,233,234,238 including one study without controls.233 No association with breast cancer was found when the three studies were combined.

    Surprisingly, only five relatively small studies examined polymorphisms in the ER gene. Owing to the small sample sizes, an association with breast cancer risk can neither be confirmed nor excluded.

    Progesterone receptor (PR) gene

    Methylation of the CpG islands in the 5′ region of the PR gene was found in 40% of PR negative breast cancer cases (6/15) and not found in 15 PR positive tumours or normal breast specimens.235 A polymorphism in intron 7 of the PR gene has been described. The variant PROGINS allele consists of a 306 bp insertion of the Alu subfamily.239 Four studies examined this polymorphism in relation to breast cancer risk.240–243 Although the studies observed different ORs for heterozygous carriers, the ORs for women homozygous for the PROGINS allele were similar, ranging from 0.27 to 0.63 (fig 2). Pooled analysis of these studies showed that the OR for women homozygous for the variant allele was 0.32 (95% CI 0.16-0.65). Thus, instead of an increased risk, four studies separately and combined found a decreased breast cancer risk for homozygous carriers of the PROGINS allele.

    Androgen receptor (AR) gene

    A mutation in the AR gene was detected in three male breast cancer patients with (partial) androgen resistance, two brothers244 and one sporadic patient.245 An increased risk of breast cancer was found in women with BRCA1 mutations (165 women with breast cancer, 139 without) if they inherited at least one AR allele with >27 CAG repeats.246 Two other studies in sporadic breast cancer patients found no association between the number of CAG repeats and breast cancer risk (in total, 876 cases and 810 controls).247,248 This polymorphism therefore does not appear to play a major role in breast cancer susceptibility.

    Catechol-O-methyltransferase (COMT)

    A polymorphism at codon 158 of the COMT gene has been identified;249 the normal allele was designated COMT-H and the variant allele COMT-L. The variant allele encodes a thermolabile form of the enzyme with reduced activity. Four studies examined this polymorphism.19,219,250,251 No increased breast cancer risk (overall, for premenopausal or for postmenopausal women) was found when all studies were combined.

    Uridine diphospho-glucuronosyltransferase 1A1 (UGT1A1) gene

    A TA repeat polymorphism has been reported in the promoter region of the UGT1A1 gene. Increasing the number of repeats in this polymorphism leads to a decrease in enzyme activity.252,253 The wild type allele (UGT1A1*1) contains six TA repeats and the most common variant allele (UGT1A1*28) seven. Two other variant alleles (UGT1A1*33, five repeats and UGT1A1*34, eight repeats) have been found almost exclusively in the African-American population.252,254 Among premenopausal women, an association was found in an African-American population.253 An increased breast cancer risk (OR=1.8, 95% CI 1.0-3.1) was detected for heterozygous and homozygous carriers of the variant alleles with a decreased enzyme activity (UGT1A1*28 and UGT1A1*34). In a white population in another study, no association was found.254 No association between breast cancer risk and the UGT1A1*28 allele was detected when the studies were combined.

    HLA region

    The principal function of the highly polymorphic HLA antigens is to bind peptide fragments, so that they can be optimally presented to cytotoxic T lymphocytes and natural killer cells.255 The HLA antigens play a major role in immunity, self-recognition, and cell and tissue differentiation. Several studies observed no association with breast cancer.256–259 Other studies have indicated that different HLA antigens may either be risk factors for or protective against breast cancer.260–263 No strong associations with specific alleles were found and (some of) the results were contradictory. In one study, a family was examined in which more than 40% of the members of two generations had cancer (mostly breast, endometrial, and gastrointestinal).264 Positive lod scores to markers within or near the HLA region were found. None of the lod scores, however, reached significance.264 In conclusion, several reports have indicated that different HLA alleles may be risk factors for or protective factors against cancer. No clear associations with specific alleles have been detected.

    Tumour necrosis factor α (TNFα) gene

    The TNFα gene is a central mediator in the inflammatory response and immunological activities towards tumour cells.265–267 One polymorphism in the TNFα gene occurs in a series of repeating conserved motifs and is not randomly distributed. It therefore most likely has some functional and selective effect.268 The rare TNF2 allele of this polymorphism lies on the extended haplotype A1-B8-DR3-DQ2,266 which is associated with autoimmunity and high TNFα production.269,270 A comparison of the data268 suggests that there may be a small effect of the −308 polymorphism, with the TNF2 allele being associated with slightly higher levels of TNFα production. An association between TNF2 allele carriers (heterozygous and homozygous) and breast cancer risk was shown in one study which included 40 breast cancer patients and 106 controls (OR=3.53, 95% CI 1.65-7.54).267

    Heat shock protein 70 (HSP70) gene

    In the HLA region, three intronless genes encoding members of HSP70 are located centromerically to the TNF genes. The genes have been identified as HSP70-1, HSP70-2, and HSP70-hom.271 HSP70 is a determining factor in immunological mechanisms against tumour cells267 and HSPs can serve as a target for anti-tumour immune recognition by antibodies and T cells.272,273 Conversely, HSP70 expression on tumour cells is correlated with the inhibition of monocytotoxic activity, which can protect tumour cells against host immunological reactions.267 Whether HSP70 acts as an anti-tumour immune response enhancer or as a tumour promoter may depend on HSP70 genotypes.267 One study showed that the variant allele carriership of the HSP70-hom gene was associated with breast cancer (OR=3.56, 95% CI 1.26-10.01), whereas carriership of the HSP70-2 gene was not (OR=2.36, 95% CI 0.75-7.33).267

    Iron metabolism

    Experimental, clinical, and epidemiological investigations have shown that iron can influence carcinogenesis.274 Increased body iron stores have been associated with cancer risk. A number of genes are involved in iron metabolism, including the haemochromatosis gene (HFE) and the transferrin receptor (TFR) gene.

    The HFE gene and hereditary haemochromatosis (HH)

    So far, two point mutations (Cys282Tyr and His63Asp) have been detected in the HFE gene of HH patients. Over 80% of haemochromatosis patients are homozygous for the Cys282Tyr mutation.275 Heterozygous carriers, comprising 15% of the American population, have, on average, increased iron stores as compared to non-carriers.276,277 In a study of 1950 HH heterozygotes and 1656 controls, no increased breast cancer risk was detected (OR=0.98, 95% CI 0.81-1.19).277 In another study with 165 cases and 294 controls, no association was found between breast cancer and the HFE and TFR genotypes when the genotypes were tested both separately and together.278 In conclusion, the HFE and TFR genes do not play a major role in breast cancer susceptibility.

    Other genes

    Vitamin D receptor (VDR) gene

    Five polymorphisms of the VDR gene have been studied in breast cancer patients and controls. Four of these, the TaqI, ApaI, BsmI, and the poly-A polymorphism, are located in the 3′ region of the gene and are in linkage disequilibrium with each other. One polymorphism, FokI, is located in the 5′ region of the gene and is not in linkage disequilibrium with the other polymorphisms. The TaqI polymorphism, associated with increased serum vitamin D3 levels,279 was examined in three studies.247,279,280 No association with breast cancer was found in our pooled analysis (fig 3). The BsmI polymorphism was investigated in two studies and, when they were combined, again no association with breast cancer was found (fig 3).281,282 The two other polymorphisms in the 3′ region of the gene, the ApaI and the poly-A polymorphism, were each addressed in one small study. For both polymorphisms, an increased breast cancer risk was found for carriers (heterozygous and homozygous) of the variant allele (ApaI,280 OR=1.56, 95% CI 1.09-2.24; poly-A,282 OR=1.73, 95% CI 1.16-2.59; fig 3). Pooled analysis of the two studies on the FokI polymorphism280,282 detected no association with breast cancer (fig 3). In one study, haplotypes of the ApaI (variant allele a) and the TaqI (variant allele T) polymorphisms were tested. Women with the genotype aaTT (homozygous for the haplotype of both variant alleles) had an increased breast cancer risk (OR=2.5, 95% CI 1.02-6.5) as compared to women with the genotype Aatt.280 The results of the different polymorphisms in this gene are contradictory, and it remains unclear whether the VDR gene plays a role in breast cancer susceptibility.

    The APC gene

    Breast tumours (n=227) were screened for truncating mutations in exon 15 of the APC gene (77% of the coding sequence) and only one somatic mutation was found.283 Somatic mutations in the APC gene were detected in 13 of 70 breast cancer cases in another study.284 Most of these mutations were outside the mutation cluster region that has been noted for colorectal cancer.284 One of the polymorphisms in the APC gene, the I1307K polymorphism, is specific to Ashkenazi Jews. In Ashkenazi Jewish women with breast cancer without a BRCA1 or BRCA2 mutation, no association between breast cancer and the I1307K polymorphism was detected.285,286 In another study, the presence of the I1307K allele among BRCA1/2 carriers was not associated with a further increase of cancer risk.287 This polymorphism probably does not play a role in breast cancer susceptibility.

    Combinations of polymorphisms in different genes

    One study examined the four polymorphisms in the CYP1A1 gene and the GSTM1 and GSTT1 polymorphisms.169 None of these polymorphisms, either separately or combined, was associated with increased breast cancer risks. Others analysed a combination of the m2 polymorphism in the CYP1A1 gene and the GSTM1 polymorphism and again no associations were found.167 One study219 examined the m1 polymorphism in the CYP1A1 gene and the polymorphisms in the CYP17 and COMT genes. The presence of at least two putative high risk genotypes was associated with an increased risk of breast cancer (OR=3.47, 95% CI 1.21-9.99).

    In conclusion, combinations of polymorphisms have been examined in only a few studies, all with small sample sizes.

    DISCUSSION

    This review, which examined 34 polymorphisms in 18 different genes, described in more than one breast cancer study, whenever possible with pooled analysis, showed an association with breast cancer for 13 polymorphisms in 10 genes. Increased breast cancer risks were found for the polymorphisms in HRAS1, GSTM1, GSTP1, CYP1B1 (codon 119), CYP2D6, CYP19, and VDR (ApaI and poly-A), with PARs ranging from 1-41%. Interestingly, decreased breast cancer risks were found for women homozygous for the variant allele for the intron 3, exon 4, and intron 6 polymorphisms in the Tp53 gene, the XbaI polymorphism in the ER gene, and the PROGINS polymorphism in the PR gene. Women with these genotypes may represent a subpopulation where prevention strategies can be less intensive than in the general population. The pooled analysis was performed on large (>2000 cases) sample sizes for the HRAS1, GSTM1, and CYP19 polymorphisms. There is, therefore, strong evidence for increased breast cancer risks associated with these polymorphisms, although the increase in breast cancer risk for the GSTM1 polymorphism is very small. More research on these three genes will probably only narrow the confidence intervals and not change either the ORs or the allele frequencies. The sample sizes studied for other polymorphisms, such as Tp53 (intron 3, exon 4, and intron 6), GSTP1, CYP1B1 (codon 119), ER (XbaI), and VDR (ApaI and poly-A), were quite small (<1000 cases). Because of this, the association with increased breast cancer risk is not confirmed. The sample sizes for the polymorphisms of CYP2D6 and PR were intermediate (between 1000 and 2000 cases). Our pooled analysis indicated that there is an association with breast cancer, although more research (larger sample size) could slightly change both the ORs and the allele frequencies. 

    The pooled analysis for 12 other polymorphisms in nine genes, namely L-myc, NAT1, NAT2, GSTT1, CYP1A1, CYP17, AR, COMT, and UGT1A1, showed no association with breast cancer. For NAT2, GSTT1, and CYP17, the polymorphisms with large sample sizes, an association with breast cancer can be excluded. For polymorphisms with small (L-myc, NAT1, CYP1A1 (m2, m3 and m4), AR, and UGT1A1) or intermediate (CYP1A1 (m1) and COMT) sample sizes, an association with breast cancer cannot be excluded.

    Somewhat different rules are applicable for polymorphisms described in only one study. To conclude from a negative result that the original effect is likely to be an artefact, a sample size of roughly four times the initial study is needed when replicating these studies (see Appendix 1). Eight polymorphisms (TNF-α,267 HSP70-2,267 HSP70-hom,267 APOE,288 EDH17B2,289 HER2,290 TBR-1,291 and TFR278) are each described in only one study, all of which have (very) small sample sizes. A weak association with breast cancer was found for five of these (TNF-α, HSP70-2, HSP70-hom, EDH17B2, and HER2). Replication is needed either to confirm or reject these tentative findings.

    Strikingly little research has been performed on combinations of polymorphisms, which have been addressed in only a few breast cancer studies. An association with increased breast cancer risk was found for combinations of polymorphisms in the different GST genes, although the total number of cases (n=238) and controls (n=240) was small and the association was only marginally significant. No evidence was observed for other associations with breast cancer for certain combinations, but this was mainly because of small sample sizes. For polymorphisms not associated with breast cancer when studied separately, an association is still possible in combination with other polymorphisms. Since the products of several genes interact (almost half of the reviewed genes play a role in oestrogen metabolism), interactions between the genes are likely.

    When the variant itself is non-functional, but in linkage disequilibrium with some other functional variant, the overall risks may not be applicable to all populations, as linkage disequilibrium for certain variants often differs between populations.21

    Finally, it is not unlikely that other genes exist that give rise to variation in breast cancer susceptibility, but have not yet been identified and/or tested. A whole genome screen would be the ideal method to detect new breast cancer susceptibility genes. This method, however, is still too expensive to carry out in large study populations. Until this is (economically) feasible, it would be useful to collect data on an appropriately sized, well described study population. Analysis of several (or all) of the polymorphisms already known to be associated with breast cancer in the same population will increase our understanding of the aetiology of breast cancer. More specific risk assessments will become available for women individually,16 with targeted breast cancer prevention strategies.13

    Appendix 1 Sample size calculation

    More than one study.

    The statistical power for actual investigations into disease genes can be tackled by postulating a simple biallelic genetic model and assuming that the disease gene itself (or a 100% linked and associated marker) is observed. Relative risks of the disease can then be specified for homozygotes for the wild type allele (aa), heterozygotes (aA), and homozygotes for the variant allele (AA). This allows for variations in penetrance and dominant, additive, or recessive models. Other input parameters that the model needs are the disease frequency, the frequency of the variant allele, the required power, and the significance level. The basis of the calculation is the determination of the a posteriori genotype distribution of the frequencies of alleles transmitted from unaffected parents to their affected child. The genotypes constructed by the alleles that are not transmitted serve then as (pseudo) controls as in the transmission/disequilibrium test. Power calculations for the difference in the frequencies of binomial entities are performed using a normal approximation. Sample sizes required are then calculated for detecting differences between cases and controls in allele frequencies, in variant allele carriers and in homozygotes for the variant allele.

    Although the assumption of 100% linked and associated markers is not entirely realistic, it does indicate an order of magnitude for the power of an actual investigation. This will probably be lower as a function of genetic distance and allelic frequency match between marker and disease alleles. The actual calculations are done on a Microsoft Excel® spreadsheet, available from the first author on request.
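
    The first step of the model described above, obtaining the genotype distribution among affected subjects from the allele frequency and the genotypic relative risks, can be sketched as follows. This is an illustration only, not the spreadsheet used for this review: it assumes Hardy-Weinberg proportions in the general population, it does not construct the pseudo-controls from untransmitted alleles, and the function name `case_genotype_distribution` is hypothetical. The input values are those of the first worked example in the "Only one study" section of this appendix.

```python
def case_genotype_distribution(q, rr_het, rr_hom):
    """Genotype frequencies (aa, aA, AA) among affected subjects.

    q       -- frequency of the variant allele A in the population
    rr_het  -- relative risk for aA compared with aa
    rr_hom  -- relative risk for AA compared with aa
    Assumption: Hardy-Weinberg proportions in the general population.
    """
    p = 1.0 - q
    population = {"aa": p * p, "aA": 2 * p * q, "AA": q * q}
    risk = {"aa": 1.0, "aA": rr_het, "AA": rr_hom}
    # Weight each genotype by its relative risk, then renormalise.
    weighted = {g: population[g] * risk[g] for g in population}
    total = sum(weighted.values())
    return {g: w / total for g, w in weighted.items()}

cases = case_genotype_distribution(q=0.20, rr_het=1.5, rr_hom=1.5)
carriers_cases = cases["aA"] + cases["AA"]
carriers_population = 1.0 - (1.0 - 0.20) ** 2
print(f"carriers: {carriers_cases:.3f} in cases, "
      f"{carriers_population:.3f} in the population")
```

    The disease frequency does not appear here because it only scales the absolute risks and cancels when the distribution among cases is normalised; it matters for the control distribution and hence for the power calculation itself.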

    Only one study.

    When only one association study is available on a polymorphism, it is likely to be a positive one. It is therefore unlikely that a reliable disease model can be obtained from it. Because of this, the sample size cannot be computed with the formula presented in the previous section. A second study will need a sample size at least four times as large as the original one in order either to replicate or refute the association. The reason for the factor 4 is as follows. Assume that the power for the first study to find a significant difference between cases and controls is equal to 0.50. For example, for a disease model with a variant allele frequency of 0.20, a disease frequency of 10%, and genotypic relative risks for aA and AA both equal to 1.5, the power to detect a difference between cases and controls at a significance level of 5% is 0.502 in a study with 164 trios. To detect the same difference with a power of 0.95, 656 trios are required, which is exactly a factor of 4 larger. Even for a weak genetic disease model with a variant allele frequency of 0.50, a disease frequency of 10% and genotypic relative risks both equal to 1.1, this factor is 3.9996 (5001 trios in first study versus 20 002 in the replication study). In order to replicate the first study with a power of 90% (instead of 95%), the required increase in sample size is a factor of 3.165.
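
    The factors quoted above follow from the normal approximation, in which the required sample size is proportional to the square of the sum of the critical value and the normal quantile of the power. The sketch below assumes a one-sided test at the 5% level; this is an inference from the quoted numbers (it is the convention that reproduces them), not something stated in the text. The function name `replication_factor` is hypothetical.

```python
from statistics import NormalDist

def replication_factor(target_power, alpha=0.05, first_power=0.50):
    """Ratio of the sample size needed for target_power to that for first_power.

    Uses n proportional to (z_alpha + z_power)**2.
    Assumption: alpha is one-sided.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha)
    return ((z_alpha + z(target_power)) / (z_alpha + z(first_power))) ** 2

factor_95 = replication_factor(0.95)
factor_90 = replication_factor(0.90)
print(round(factor_95, 3), round(factor_90, 3))
```

    With a first study at 50% power the power quantile is zero, so the ratio depends only on the significance level and the target power and not on the disease model, which is why the weak model in the example gives almost the same factor. Under a two-sided 5% test the ratio for 95% power would be lower, about 3.4.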

    Acknowledgments

    This work was supported by grant RUG-98-1665 of the Dutch Cancer Society and by the Comprehensive Cancer Centre Northern Netherlands.

    REFERENCES