Main

Gliomas account for 40% of all primary brain tumours and are responsible for around 13 000 deaths in the United States of America each year (Bondy et al, 2008). To date, the only established environmental risk factor for glioma is exposure to ionising radiation (Bondy et al, 2008).

The two-fold increased risk of glioma seen in relatives of glioma patients provides evidence for inherited genetic susceptibility to the disease, and a number of rare inherited cancer predisposition disorders, such as Li-Fraumeni and Turcot syndromes and neurofibromatosis, are recognised to be associated with increased risk (Bondy et al, 2008). Recently, genome-wide association studies (GWAS) have demonstrated that common genetic variation also contributes to the development of glioma, identifying risk single-nucleotide polymorphisms (SNPs) at six genetic loci – 5p15.33 (TERT), 7p11.2 (EGFR, two independent regions), 8q24.21 (CCDC26), 9p21.3 (CDKN2A/CDKN2B), 11q23.3 (PHLDB1) and 20q13.33 (RTEL1; Shete et al, 2009; Wrensch et al, 2009; Sanson et al, 2011). Although these GWAS SNPs have provided insight into genetic susceptibility to glioma, based on the population frequency of risk alleles they collectively account for only around 10% of the 1.8-fold increased risk of all glioma seen in the relatives of patients (Hemminki et al, 2009); hence there remains considerable ‘missing heritability’. In addition to common genetic variation impacting on glioma risk, it is likely that rare disease-causing variants also contribute to disease heritability. Such variants are generally not directly identifiable through GWAS because of incomplete SNP tagging.

TP53 mutation is among the most frequent and earliest detectable genetic alterations in glioma (Ohgaki and Kleihues, 2011). In addition to the rare inactivating germline mutations in TP53 that cause the Li-Fraumeni syndrome, it is entirely plausible that variants conferring modest risks also contribute to the risk of glioma. A number of studies have reported associations between specific TP53 polymorphisms and glioma risk (Rajaraman et al, 2007; Idbaih et al, 2008; Jha et al, 2011; Stacey et al, 2011; Egan et al, 2012; Shi et al, 2012), but many observations have been based on small case–control series with limited replication of findings. Most recently, rs78378222, which maps 3′ to TP53, has been implicated as a risk factor for glioma from an analysis of two case–control series (P=1.0 × 10−5; Stacey et al, 2011) and as a possible influence on patient prognosis (Egan et al, 2012).

To comprehensively evaluate the role of polymorphic variation at the TP53 locus on glioma risk, we analysed SNP genotype data in four series totalling 4147 glioma patients and 7435 controls, and imputed unobserved SNPs in cases and controls.

Materials and Methods

Subjects

We used data previously generated on four non-overlapping case–control series of Northern European ancestry, which had been the subject of previous GWAS (Shete et al, 2009; Sanson et al, 2011). Briefly, the UK GWAS was based on 636 cases (401 male; mean age 46 years) ascertained through the INTERPHONE Study (Cardis et al, 2007; Shete et al, 2009). Individuals from the 1958 Birth Cohort served as a source of controls. The US GWAS was based on 1281 cases (786 male; mean age 47 years) ascertained through the MD Anderson Cancer Center, Texas, between 1990 and 2008. Individuals from the Cancer Genetic Markers of Susceptibility (CGEMS) studies served as controls (Hunter et al, 2007; Yeager et al, 2007; Shete et al, 2009). The French GWAS comprised 1495 incident patients with glioma ascertained through the Service de Neurologie Mazarin, Groupe Hospitalier Pitié-Salpêtrière, Paris (Sanson et al, 2011). The controls were ascertained from the SU.VI.MAX (SUpplementation en VItamines et Mineraux AntioXydants) study of 12 735 healthy subjects (women aged 35–60 years; men aged 45–60 years; Hercberg et al, 2004). The German GWAS comprised 880 patients who underwent surgery for a glioma at the Department of Neurosurgery, University of Bonn Medical Center, between 1996 and 2008 (Sanson et al, 2011); samples were collected while patients were under the care of neurosurgeons. Control subjects were taken from three population studies: KORA (Co-operative Health Research in the Region of Augsburg; n=488; Holle et al, 2005), POPGEN (Population Genetic Cohort; n=678; Wichmann et al, 2005) and the Heinz Nixdorf Recall study (n=380; Krawczak et al, 2006).

Collection of blood samples and clinical information from subjects was undertaken with informed consent and relevant ethical review board approval in accordance with the tenets of the Declaration of Helsinki.

Genotyping

Full details of the genotyping of cases and quality control using Illumina Infinium HD Human610-Quad BeadChips (Illumina, San Diego, CA, USA) are given in previously published work (Shete et al, 2009; Sanson et al, 2011). Briefly, duplicate samples were used to check genotyping quality. SNPs and samples with call rates <95% were excluded from the analyses. Genotype frequencies at each SNP were tested for deviation from Hardy–Weinberg equilibrium, and SNPs were rejected at P<10−4. We have previously confirmed an absence of systematic genetic differences between cases and controls and shown no significant evidence of population stratification in these sample sets. Genotyping of rs78378222 and rs35850753 was conducted by competitive allele-specific PCR KASPar chemistry (KBiosciences) or Taqman technology implemented on an ABI7900HT platform (Applied Biosystems, Foster City, USA); details of primers and probes used are available on request. Genotyping quality control was further evaluated through inclusion of duplicate DNA samples in SNP assays, together with direct sequencing of a subset of samples to confirm genotyping accuracy (details available on request).
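The Hardy–Weinberg filter described above can be illustrated with a minimal sketch. The text does not state whether a chi-square or an exact test was used, so the 1-d.f. chi-square goodness-of-fit test below is an assumption, and the function name and genotype counts are hypothetical.

```python
import math

def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square (1 d.f.) test for deviation from Hardy-Weinberg equilibrium.

    Assumption: a goodness-of-fit test; the source does not say which
    HWE test was applied. Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variate with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: a SNP would be rejected if p_value < 1e-4.
chi2, p_value = hwe_chi2(25, 50, 25)
reject = p_value < 1e-4
```

With rare variants such as those considered here, expected homozygote counts are small, so an exact test would normally be preferred to this approximation.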

Statistical analysis

Analyses were primarily undertaken using R (v2.6), STATA v.10 (StataCorp, College Station, TX, USA) and PLINK (v1.06) software. The association between each SNP and risk of glioma was assessed by the Cochran-Armitage trend test. Odds ratios (ORs) and associated 95% confidence intervals (CIs) were calculated by unconditional logistic regression. Prediction of the untyped SNPs was carried out using IMPUTEv2 based on data from the 1000 Genomes Project (Phase 1 interim June 2011). Imputed data were analysed using SNPTEST v2 to account for uncertainties in SNP prediction. Association meta-analyses only included markers with proper_info scores >0.4 and minor allele frequencies (MAFs) >0.01. Meta-analyses were carried out with META using the genotype probabilities from IMPUTEv2, where a SNP was not directly typed. To test for the presence of additional independent risk alleles in each region, we carried out conditional logistic regression analysis that included SNPs with evidence of association in the meta-analysis at P<5.0 × 10−4. We calculated Cochran’s Q statistic to test for heterogeneity and the I2 statistic to quantify the proportion of the total variation that was caused by heterogeneity (Higgins and Thompson, 2002). I2 values ≥75% are considered to indicate substantial heterogeneity. To address heterogeneity between case–control series, we derived pooled ORs under a random effects model.
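The pooling and heterogeneity statistics named above can be sketched as follows. This is a generic inverse-variance illustration, not the META implementation; the DerSimonian–Laird estimator for the random effects model is an assumption (the text does not name the estimator), and the function name and inputs are hypothetical.

```python
import math

def pool_log_ors(log_ors, ses):
    """Inverse-variance meta-analysis of per-study log odds ratios.

    Returns the fixed-effects estimate, Cochran's Q, I2 (in %) and a
    DerSimonian-Laird random-effects estimate (assumed estimator).
    """
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, log_ors)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed estimate.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Between-study variance (method of moments), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    random_ = sum(wi * b for wi, b in zip(w_re, log_ors)) / sum(w_re)
    return {
        "fixed_or": math.exp(fixed),
        "random_or": math.exp(random_),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }

# Hypothetical per-study estimates (log OR, standard error).
result = pool_log_ors([math.log(3.0), math.log(4.5), math.log(2.0)],
                      [0.25, 0.30, 0.35])
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effects weights and the two pooled ORs coincide.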

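The familial relative risk of glioma attributable to a variant, λ*, was calculated using the formula (Houlston and Ford, 1996):

$$\lambda^{*} = \frac{p\,(p\,r_{2} + q\,r_{1})^{2} + q\,(p\,r_{1} + q)^{2}}{\left(p^{2} r_{2} + 2pq\,r_{1} + q^{2}\right)^{2}}$$
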
where p is the population frequency of the minor allele, q=1−p, and r1 and r2 are the relative risks (approximated by odds ratios) for heterozygotes and the rarer homozygotes relative to the more common homozygotes, respectively. From λ*, it is possible to quantify the influence of the locus on the overall familial risk of glioma in first-degree relatives of glioma patients. Assuming a multiplicative interaction between risk alleles, the proportion of the overall familial risk attributable to the locus is given by log (λ*)/log (λ0), where λ0, the overall familial risk of glioma shown in epidemiological studies, is 1.8 (Hemminki et al, 2009).
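As a worked illustration of this calculation, the sketch below evaluates the standard locus-specific familial relative risk for a biallelic variant and the corresponding proportion of familial risk. The function names are hypothetical, and the allele frequency and relative risks in the usage line are arbitrary values chosen for illustration, not the estimates used in this study.

```python
import math

def familial_relative_risk(p, r1, r2):
    """Locus-specific familial relative risk (lambda*) for a biallelic variant.

    p  : population frequency of the minor allele
    r1 : relative risk for heterozygotes
    r2 : relative risk for rare homozygotes
    """
    q = 1.0 - p
    numerator = p * (p * r2 + q * r1) ** 2 + q * (p * r1 + q) ** 2
    denominator = (p ** 2 * r2 + 2 * p * q * r1 + q ** 2) ** 2
    return numerator / denominator

def proportion_of_familial_risk(lam_star, lam_0=1.8):
    """Share of the overall familial risk (lam_0) explained by the locus,
    assuming risk alleles combine multiplicatively."""
    return math.log(lam_star) / math.log(lam_0)

# Illustrative inputs only: MAF of 1%, per-allele relative risk of 2
# (so r2 = r1 squared under a multiplicative model).
lam = familial_relative_risk(p=0.01, r1=2.0, r2=4.0)
share = proportion_of_familial_risk(lam)
```

Under a multiplicative model (r2 = r1 squared) the expression simplifies to (p·r1² + q)/(p·r1 + q)², which gives a quick check on the implementation.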

Measurements of LD between SNPs were based on HapMap Data Release 27, phase 3 (Feb 2009 on NCBI B36 assembly, dbSNP126), viewed using Haploview software (v4.2) and plotted using SNP annotation and proxy search (SNAP). LD blocks were defined according to HapMap recombination rate (centimorgans per megabase), as determined using the Oxford recombination hotspots (Myers et al, 2005) and the previously set distribution of CIs (Gabriel et al, 2002).

Complete survival data, together with clinical covariate data (age at diagnosis, sex, preoperative Karnofsky performance index (KPI), degree of resection, chemotherapy and radiotherapy), were available for 1699 patients (75%) in the French and German case series, with a median follow-up interval of 8.9 years for patients without an event. Overall survival (OS) of patients was the end point of the analysis. Survival time was calculated from the date of diagnosis of glioma to the date of death. Patients who were not deceased were censored at the date of last contact. Mean follow-up time was computed among censored observations only. Kaplan–Meier survival curves according to genotype were generated and the homogeneity of the survival curves between genotypes was evaluated using the log-rank, Wilcoxon and Fleming–Harrington tests (Klein, 1997). The log-rank test is usually the preferred test, but the other tests were conducted to show the influence of the polymorphic variation at different times of follow-up, in order to detect any difference in early and late stages of disease. Cox regression analysis was used to estimate hazard ratios and their 95% CIs, while adjusting for age, gender, treatment and grade.
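The handling of censored follow-up in the Kaplan–Meier estimate can be sketched in a few lines. This is a generic product-limit estimator written for illustration; it is not the software used in the study, and the function name and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time per patient (diagnosis to death or last contact)
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve

# Toy data: four patients, the second censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients contribute to the risk set up to their last contact but never trigger a step in the curve, which is why curves are computed separately per genotype before being compared.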

Tumour genotyping

Tumour samples were available from a subset of the French patients ascertained through the Service de Neurologie Mazarin, Groupe Hospitalier Pitié-Salpêtrière, Paris. Tumours were snap-frozen in liquid nitrogen and DNA was extracted using the QIAamp DNA minikit, according to the manufacturer’s recommendations (Qiagen, USA). DNA was analysed for large-scale copy-number variation by CGH-array as previously described (Simon et al, 2010). In the cases not analysed by CGH-array, 9p, 10q, 1p and 19q status were assigned using PCR microsatellites, and EGFR amplification and CDKN2A-p16-INK4a homozygous deletion by quantitative PCR. IDH1 codon R132 status was determined by sequencing. A search for mutations in exons 2–11 of TP53 was conducted using previously reported primers and methods (Idbaih et al, 2007).

Results

We studied four non-overlapping glioma case–control series of Northern European ancestry providing data on 4147 cases and 7435 controls. We considered a 200-kb region from 100 kb 3′ to 100 kb 5′ of TP53 (rs11870250 at 7 471 783 bps to rs191403470 at 7 678 716 bps). Genotypes for 23 SNPs mapping to this region were available from each of the four GWAS data sets. Confining our analysis to the relationship between 17p13.1 variation and glioma risk, 8 of the 23 SNPs provided evidence for an association at P<0.05 (Figure 1). The strongest association was provided by rs1619016, which maps at 7 550 553 bps (OR=1.16, per allele P=4.93 × 10−4; Phet=0.17, I2=41%; Figure 1).

Figure 1

Case–control association plot for glioma and genomic structure of the chromosome 17p13.1 region: (A) all glioma, (B) GBM, (C) non-GBM tumours. Overview of single-point SNP association results of both genotyped (triangles) and imputed (circles) SNPs are shown with recombination rates within the region. The region shown is chr17:7,471,719-7,678,811 (hg18, Build 36). −log10 P values (y axis) of the SNPs are shown according to their chromosomal positions (x axis). The top SNP in each combined analysis, rs78378222, is shown as a large circle. The colour intensity of each symbol reflects the extent of LD with the top SNP, from white (r2=0) to dark gray (r2=1.0). Genetic recombination rates (cM/Mb), estimated using HapMap CEU samples, are shown with a light blue line. Also shown are the relative positions of genes and transcripts mapping to the region of association.

Using these data, we sought to establish whether we could identify SNPs better correlated with the risk of glioma through imputation of untyped SNPs referencing the 1000 Genomes Project. Of a total of 6 403 imputed SNPs, 2 377 passed post-imputation QC filtering. Thirty-one of these SNPs provided stronger evidence for an association with glioma risk than that provided by rs1619016 (Figure 1).

The strongest signals were shown by rs78378222, which maps 3′ to TP53, and rs35850753, which maps 5′ to TP53 (P<10−24). These associations are many orders of magnitude stronger than the association with rs1619016. Imputation information statistics for rs78378222 and rs35850753 were 0.66 and 0.61, respectively. To validate imputation of rs78378222 and rs35850753, we directly typed these SNPs in available DNAs from three of the four case–control series. Specifically, we genotyped all 1423 cases and 1827 controls from the French GWAS, 816 of the German cases, and all 631 UK cases and 700 UK controls. In each of these data sets, the concordance between imputed and directly typed SNP genotype was 99.0% for rs78378222 and 98.5% for rs35850753 (r2=0.90 and 0.89 for correlation between directly typed and imputed SNP genotypes, respectively).

The association between rs78378222 and glioma risk was shown in each of the four case–control studies (Figure 2), with the pooled OR under a fixed-effects model being 3.74 (95% CI: 2.89–4.83; P=6.86 × 10−24). However, there was significant between-study heterogeneity (Phet=0.045, I2=63%). Under a random effects model, the pooled OR was 3.65 (95% CI: 2.38–5.59; P=2.83 × 10−9). Between-study heterogeneity was largely ascribable to the German case–control series, which showed the weakest association. Omitting this study, the pooled OR was 4.45 (95% CI: 3.33–5.94; P=3.72 × 10−24; Phet=0.50, I2=0%).
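The reported P-values can be checked against the ORs and CIs with a short calculation. The sketch below assumes a Wald test on the log OR scale with a normal approximation, which is an assumption about how the pooled P-values were obtained; the function name is hypothetical, and because the published OR and CI are rounded the recovered P-value agrees only approximately.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the standard error, Z score and two-sided P-value of a
    log odds ratio from its point estimate and 95% confidence interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability; erfc stays accurate for large |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value

# Fixed-effects pooled estimate quoted in the text: OR 3.74 (2.89-4.83).
se, z, p_value = wald_from_ci(3.74, 2.89, 4.83)
```

For the fixed-effects estimate this gives a Z score of about 10 and a P-value of the order of 10−24, in line with the value quoted above.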

Figure 2

Association between glioma risk and rs78378222, for all glioma, GBM and non-GBM. Summary of association results for rs78378222. (A–C) Forest plots of rs78378222 per allele odds ratio (OR) for (A) all glioma, (B) GBM and (C) non-GBM. The x axis corresponds to the OR. Horizontal lines represent 95% confidence intervals. Each box represents the OR point estimate and its area is proportional to the statistical weight of the study. The diamonds (and broken lines) represent the summary odds ratios obtained from fixed-effect pooled analysis with 95% confidence intervals given by their widths. The unbroken vertical line is at the null value (OR=1.0). For all glioma, the MAFs for each study are: French cases: 0.024, French controls: 0.008, German cases: 0.023, German controls: 0.015, USA cases: 0.032, USA controls: 0.012, UK cases: 0.030, UK controls: 0.014.

rs78378222 is in strong LD with rs35850753 (r2=0.94). Conditional analysis is consistent with the 17p13.1 association being attributable to rs78378222 variation per se. Furthermore, the risk allele of rs78378222 has been shown to be directly functional, leading to impaired 3′-end processing of TP53 mRNA and thereby presumably impacting on cellular availability of TP53 (Stacey et al, 2011). Logistic regression analysis of SNPs with evidence of association at P<5.0 × 10−4 did not identify any additional independent risk alleles in the region; variation at rs78378222 was sufficient to capture the variation impacting on glioma risk.

Gliomas are heterogeneous, and different tumour subtypes, defined in part by malignancy grade (e.g., pilocytic astrocytoma WHO grade I, diffuse ‘low grade’ glioma WHO grade II, anaplastic glioma WHO grade III and glioblastoma multiforme (GBM; WHO grade IV)), can be distinguished. Accumulating data have established that the subtypes of glioma have different molecular profiles, possibly resulting from different etiologic pathways. In view of this, using data from all studies, we first examined the risk for GBM and non-GBM tumours associated with rs78378222 (Figure 2). Respective ORs for GBM and non-GBM tumours under a fixed-effects model were 5.59 (P=4.90 × 10−17) and 5.00 (P=1.47 × 10−20); under a random effects model, respective ORs were 5.36 (P=2.32 × 10−6) and 5.09 (P=8.64 × 10−6; Figure 2).

We then examined the association by subtype in more detail using histology data of WHO grade. Although there was evidence of between-study heterogeneity, this analysis showed that the association was seen in all glioma grades (Figure 3). Subsequently, using data from the French series of directly genotyped cases and controls, we examined the association between glioma risk and rs78378222 stratified by tumour histology and molecular characteristics of tumours (Tables 1 and 2). This analysis indicated that the association was not driven by a relationship with a specific cell lineage, with strong associations being shown for tumours of both oligo- and astrocyte origin (Table 1). Within non-GBM tumours, the strongest associations were shown for IDH mutation, EGFR non-amplification, 10q non-deleted and 1p/19q co-deleted. Within GBM tumours, the strongest associations were shown for tumours that were 1p/19q non-codeleted and 10q non-deleted (Table 2). For 231 of the cases, TP53 mutational status in tumours was available. Although the frequency of the risk allele for rs78378222 was higher in TP53-mutated tumours than in tumours without a TP53 mutation (0.045 vs 0.028), the difference was not significant (P=0.49).

Figure 3

Association between glioma risk and rs78378222, for grade II, III and IV glioma. Summary of association results for rs78378222. (A–C) Forest plots of rs78378222 per allele odds ratio (OR) for (A) grade II, (B) grade III and (C) grade IV glioma. The x axis corresponds to the OR. Horizontal lines represent 95% confidence intervals. Each box represents the OR point estimate and its area is proportional to the statistical weight of the study. The diamonds (and broken lines) represent the summary odds ratios obtained from fixed-effect pooled analysis with 95% confidence intervals given by their widths. The unbroken vertical line is at the null value (OR=1.0).

Table 1 Glioma risk and rs78378222 by morphology in the French case–control series
Table 2 Glioma risk and rs78378222 stratified by morphology type: IDH1 mutation, EGFR amplification, CDKN2A-p16-INK4a homozygous deletion, 9p and 10q loss, 1p-19q co-deletion status in the French case–control series

To explore the possibility that rs78378222 genotype may influence tumour progression, we examined the relationship between directly typed genotypes and patient outcome in the French (n=1106) and German (n=593) case series. Survival analysis by grade, adjusting for age at diagnosis, sex, chemotherapy and radiotherapy, provided no evidence for an independent relationship between rs78378222 genotype and OS in either patient cohort (Figure 4), with log-rank tests all yielding P-values >0.1. Similarly, Wilcoxon and Fleming–Harrington tests were all non-significant (P>0.1). After inclusion of KPI data, all results remained non-significant (data not shown).

Figure 4

Kaplan–Meier curves for glioma patients according to rs78378222 genotype. Survival curves for the common homozygotes (AA) are shown as a solid line. The dashed line depicts the survival curve for the heterozygotes (AC) and rare homozygotes (CC). (A) French grade II cases (HR=0.95, 95% CI: 0.40–2.28); (B) French grade III cases (HR=1.51, 95% CI: 0.74–3.09); (C) French grade IV cases (HR=1.24, 95% CI: 0.70–2.18); (D) German grade II cases (HR=3.11, 95% CI: 0.27–35.88); (E) German grade III cases (HR=NA); (F) German grade IV cases (HR=1.23, 95% CI: 0.68–2.21).

Discussion

Our findings provide strong evidence that the rs78378222 variant has a role in determining the risk of developing glioma, thereby confirming the recent observation made by Stacey et al (2011). Moreover, the strength of the association we have demonstrated provides unambiguous evidence for rs78378222 being a risk factor for glioma.

Glioma is increasingly being viewed as a highly heterogeneous cancer. Primary and secondary forms of GBM are recognised, with secondary GBM developing through progression from low-grade diffuse astrocytomas or anaplastic astrocytomas. Although usually indistinguishable histologically, distinct molecular pathways characterise the primary and secondary forms. Notably, IDH1 mutations are commonly detectable in low-grade glioma and secondary GBM, but are rare in primary GBMs (Ohgaki and Kleihues, 2007; Nobusawa et al, 2009). Furthermore, there is increasing evidence that each of the non-GBM forms of glioma is typified by certain molecular characteristics, which may well reflect different genetic predispositions. In this study, we investigated the impact of genetic variation on risk of glioma by tumour subtype. Although the effect of rs78378222 on risk is in part generic, there was evidence for stronger associations being displayed by tumours in which EGFR was not amplified.

Major strengths of our study are its overall and individual study sizes, that the data have been systematically ascertained and that, by making use of GWAS data, confounding from population stratification has been avoided. This type of bias is especially relevant in studies of low-frequency variants. Furthermore, we were able to demonstrate high concordance between imputed and directly typed SNPs.

We found no evidence that rs78378222 genotype is an independent predictor of prognosis for glioma in either of the two case series studied. Stratification by histological subtype did not impact on the significance of findings. Our observations are therefore in contrast with those recently reported by Egan et al (2012), who reported a strong correlation between rs78378222 genotype and OS in an analysis of 566 glioma patients. Although bias from non-uniform treatment is a potential confounder in studies of some cancers, the management of glioma is limited, as there are only a restricted number of chemotherapeutic agents and prognosis is uniformly poor. Even accepting that there may be differences in clinical management of glioma between centres, no association between genotype and outcome was seen in either the German or the French case series. It is therefore unlikely that any spurious influence arising from study design contributed significantly to the failure to demonstrate a relationship between genotype and OS. It is unlikely a priori that a single SNP, albeit one with functional effects, will impart substantial differences in glioma prognosis independently, and it is not uncommon for the first published studies to report over-inflated estimates of effects, which subsequent larger studies cannot replicate. A caveat to such an assertion is that carriers of rs78378222 are rare and hence the power of any one study to demonstrate an association with outcome is inherently limited (i.e., <40% to detect a 1.2-fold difference in survivorship in the French and German case series, stipulating a P-value of 0.05).

rs78378222 localises within the 3′ untranslated region of TP53 with the risk variant changing the AATAAA polyadenylation signal to AATACA (Stacey et al, 2011). This nucleotide substitution has been shown to lead to impaired 3′-end processing of TP53 mRNA, thereby presumably impacting on cellular availability of TP53. Given the role of inactivating TP53 mutations in the development of glial tumours, an association between rs78378222 and glioma risk mediated through loss of TP53 function is therefore eminently plausible.

Although the allele frequency of rs78378222 is only around 1% in the European population, the associated genotype risks translate to the variant accounting for 6% of the familial risk of glioma. This single uncommon SNP therefore accounts for a greater percentage of the familial risk than the collective impact of all previously identified common risk SNPs for glioma.

In summary, our data provide robust evidence for rs78378222 being a determinant of glioma risk. Furthermore, our study demonstrates the use of imputation of untyped genotypes as a powerful means of exploiting existing GWAS data sets to identify low-frequency disease-causing variants.

URLs

The R suite can be found at: http://www.r-project.org/. Detailed information on the tagSNP panel can be found at: http://www.illumina.com/

dbSNP: http://www.ncbi.nlm.nih.gov/projects/SNP/

HAPMAP: http://www.hapmap.org/

1958 Birth Cohort: http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=724&sitesectiontitle=Welcome+to+the+1958+National+Child+Development+Study

CGEMS: http://cgems.cancer.gov/

SNPTEST: https://mathgen.stats.ox.ac.uk/genetics_software/snptest/snptest.html

KORA: http://epi.helmholtz-muenchen.de/kora-gen/index_e.php

POPGEN: http://www.popgen.de/