Background Five single nucleotide polymorphisms (SNPs) were previously reported to be associated with thyroid cancer in European populations in two genome-wide association studies (GWAS): rs965513 (9q22.33), rs944289 (14q13.3), rs116909374 (14q13.3), rs966423 (2q35) and rs2439302 (8p12). Only the first two SNPs have been validated in independent populations and none were replicated in Chinese populations.
Methods The above five SNPs were genotyped in 845 papillary thyroid cancer (PTC) and 503 benign thyroid tumour (BN) patients and 1005 controls in a Chinese population using the SNaPshot multiplex single nucleotide extension system.
Results Significant associations were detected among PTC and rs944289 (p=8.007e-11), rs965513 (p=1.013e-4), rs966423 (p=1.688e-3) and rs2439302 (p=1.096e-4) in a dominant model, while the rs116909374 SNP was not detected in the Chinese population. The PTC risk increased with rise in accumulative numbers of risk alleles carried by individuals (p=5.929e-13). The PTC OR of carriers of six risk alleles (1.4% of the control population) was 23.587 compared with non-risk homozygotes (1.0% of the control population, with zero risk alleles). No individuals were homozygous for all the four SNPs (carriers of eight risk alleles) and only three PTC cases were carriers of seven risk alleles. A significant association between 14q13.3 SNP rs944289T and BN was also found (p=0.0014).
Conclusions Four candidate loci, rs965513 (9q22.33), rs944289 (14q13.3), rs966423 (2q35) and rs2439302 (8p12), identified by GWAS for PTC risk were confirmed in a Chinese population. The PTC risk of accumulative risk allele carriers increased with the number of risk alleles.
- Cancer: endocrine
- Genetic epidemiology
- Thyroid disease
Statistics from Altmetric.com
Thyroid cancer is the fifth most common type of female cancer and its incidence is increasing.1 Papillary thyroid cancer (PTC) accounts for 85% of non-medullary thyroid cancer and ionising radiation is a well-established risk factor.2 However, it has long been postulated that individuals differ in their susceptibility to PTC, and there is increasing evidence from epidemiological studies for a familial risk.3 ,4 Progress in the identification of specific PTC susceptibility loci and genes has been slow, mainly because of inadequate study design, underpowered sample sizes and preferential reporting of false-positive findings. In addition, some loci and genes have been associated with PTC risk in one study but not replicated in other independent studies.4––8
The first PTC genome-wide association study (GWAS) reported susceptibility loci on chromosomes 14q13.3 (rs944289) and 9q22.33 (rs965513) in a European population, providing further evidence for a genetic contribution to PTC.9 A more recent GWAS carried out on radiation-related PTC cases in Chernobyl identified significant associations with 9q22.33 but found none with 14p13.3.10 However, replication studies in UK and Japanese populations confirmed that both 14q13.3 and 9q22.33 were associated with a PTC risk and no replication study was done for Chinese population.11 ,12 Three new variants were also shown to associate with thyroid cancer risk in a recent European population GWAS: rs966423 on 2q35, rs2439302 on 8p12 and rs116909374 on 14q13.3. These single nucleotide polymorphisms (SNPs) were primarily investigated for their association with the levels of thyroid stimulating hormone and no independent study has been carried out to replicate these results.13
To confirm the importance of these candidate loci, we genotyped rs965513 (9q22.33), rs944289 (14q13.3), rs116909374 (14q13.3), rs966423 (2q35) and rs2439302 (8p12) in 845 PTC, 503 benign thyroid tumour (BN), including follicular adenomas and hyperplastic nodules, and 1005 controls from a Chinese population. As the chromosomal region 8q24 (rs6983267) has been associated with PTC risk in a European population,6 ,11 ,14 we also genotyped the rs6983267 polymorphism in the same Chinese population to determine its status.
The case population consisted of 846 PTC and 506 BN patients treated at the Department of Head and Neck Surgery, Cancer Hospital, Fudan University, Shanghai, China from January to December 2010. All subjects were ethnically Chinese Han and came from Eastern China, including Shanghai, Jiangsu, and surrounding regions. Enrolling criteria included a histological diagnosis, no previous surgical or medical treatment of thyroid disease, and no history of radiation exposure. The control population initially included 1006 cancer-free individuals recruited from the Taizhou Longitudinal Study during the same period for which the selection criteria included no individual history of cancer and thyroid disease based on the records.15 The Taizhou Longitudinal Study is a large prospective cohort study and the baseline investigation includes interviewer-administered questionnaire, anthropometric measurements, medical exposure, and collection of buccal mucosal cells and blood specimens. A follow-up survey will be conducted to obtain the information of environmental exposures (including diet, lifestyle, occupational exposures, etc), disease occurrence and disease mortality. This study was approved by the Ethical Committee of the Cancer Hospital, Fudan University and informed consent was obtained from all subjects.
The management schema of thyroid nodules at the Cancer Hospital, Fudan University has been described previously.16 ,17 Briefly, all patients underwent an ultrasound examination prior to surgery. Fine needle aspiration and CT scans were performed individually. Lobectomy with pathological frozen section examination was routinely performed during the operation. If a malignant diagnosis was reported intraoperatively, lymph node dissection and contra-lateral lobectomy were performed according to the risk stratification of the disease.
DNA extraction and genotyping
Genomic DNA was extracted from peripheral blood samples using the LifeFeng kit (Shanghai Lifefeng Biotechnology Co, Shanghai, China) according to the manufacturer's instructions. SNPs were genotyped with the SNaPshot multiplex single nucleotide extension system as reported previously.18 Primers for PCR amplification were designed using primer designing software Primer Premier V.5.0 according to the reference sequence from the SNP database. Details of SNPs and primers are listed in table 1. All SNP loci were genotyped with an ABI PRISM V.3730 Genetic Analyser and Peak Scanner software V.1.0 (Applied Biosystems). Internal positive and negative control samples were included to control the genotyping quality.
Differences between selected variables and the Hardy–Weinberg equilibrium were evaluated using the χ2 test and Student t test as appropriate. The association between the case-control status and each SNP, measured by the OR and its corresponding 95% CI, was adjusted for age and gender. In all analyses, the common homozygote genotype in the control population was defined as the reference category. Age was considered to be a continuous variable. The SNP effects were fitted to four models of inheritance: codominant, dominant, recessive and overdominant. The best model was selected using Akaike information criteria (AIC).
SNP–SNP interactions were analysed using the R 2.15.2 with SNPassoc package.19 A p value<0.05 was considered statistically significant. Statistical analyses were performed using SPSS Software V.12.0 (SPSS, Chicago, Illinois, USA) and R 2.15.2 with SNPassoc.19
Bioinformatic analysis of SNPs
To predict potential functional changes caused by SNPs, we carried out bioinformatic analyses. Han Chinese from Beijing (CHB) and Singapore Chinese with South China ancestries (CHS) genotype data were downloaded from the 1000 Genome FTP server20 and micro-RNA target sites were downloaded from the TargetScan Micro-RNA Regulatory Sites System. Transcription factor binding site (TFBS) binding peaks files were obtained from the Encode project21 of UCSC.22 The intersection analysis were then performed to determine whether SNPs altered TFBS or micro-RNA target sites by genome arithmetic toolset bedtools.23 Perl scripts were used to construct prescriptive formats for Haploview, which were applied for linkage disequilibrium (LD) analysis.24
Genotyping success rates were 99.96% (2357/2358), 99.92% (2356/2358), 100%, 99.96% (2357/2358), 99.96% (2357/2358) and 100% for rs965513, rs944289, rs116909374, rs2439302, rs966423 and rs6983267, respectively. No T allele was observed for rs116909374 at 14q13.3 and thus the SNP was eliminated from further study. For the purpose of multi-SNP analysis, cases and controls with failed genotypes were rejected from further analysis, and so a total of 845 PTC, 503 BN patients and 1005 controls were included in the final analysis set.
Genotype distributions of all SNPs conformed to Hardy–Weinberg equilibrium in both cases and controls. The male to female ratios of PTCs, BNs and controls were 0.37 (230 : 615), 0.34 (129 : 374) and 0.45 (310 : 695), respectively (p=0.069). There was no significant difference in age between PTCs and controls (mean±SD 46.30±11.03 vs 47.20±12.29, p=0.098), while BN cases (mean±SD 48.54±11.74) were significantly older than controls (p=0.044). These differences were adjusted in the association studies by logistic regression with age and gender.
SNP association with thyroid tumour risk
Four of the five SNPs examined, 14q13.3 SNP rs944289, 9q22.33 SNP rs965513, 2q35 SNP rs966423 and 8p12 SNP rs2439302, showed a significant association with PTC risk in the Chinese population (table 2). The AIC values of four models of inheritance are listed in the online supplementary table S1. All risk alleles had an effect on the increased PTC risk in a dominant model with lowest AIC values. In addition, the p values after Bonferroni correction for rs944289, rs965513, rs966423 and rs2439302 were 4.004e-10, 5.063e-4, 8.437e-3 and 5.478e-4, individually. The p values after false discovery rate correction for rs944289, rs965513, rs966423 and rs2439302 were 4.004e-10, 1.827e-4, 2.110e-3 and 1.827e-4, respectively. However, the association between rs6983267 at 8q24 and PTC could not be replicated.
Because the published GWAS studies reported ORs for the multiplicative model, we used the same method to calculate the ORs for the four GWAS indentified loci to compare our replicated results with the GWAS data. As presented in online supplementary table S2, the ORs of rs944289, rs965513, rs966423 and rs2439302 of current results were 1.51, 1.53, 1.32 and 1.40, respectively, while they were 1.37, 1.75, 1.34 and 1.36, respectively, in the GWAS results.9 ,13 Although the exact OR values were different, the ordinal trend is the same and the strongest association with PTC was found for rs965513.
The association with BN was also found for rs944289 (p=0.0070). Using genetic power calculator,25 the statistical power between BN and controls was estimated by defining the prevalence of BN as 0.65.26 The statistical power for rs944289, rs965513, rs966423, rs2439302 and rs6983267 between BN and controls was 1.000, 0.570, 0.637, 0.073 and 0.057, respectively. When the genotype distribution of the five SNPs between PTC and BN was analysed, significant difference was found for rs2439302 after being adjusted by age and gender. The p value in the multiplicative model and codominant model after Bonferroni correction were 0.0003 and 0.005, respectively.
SNP–SNP pairwise analysis
No significant SNP–SNP interactions were found among the four SNPs (rs944289T, rs965513, rs966423 and rs2439302), which showed significant associations with PTC risk, either using SNPassoc (table 3) or the logistic test, ‘fast-epistasis’ in PLINK and TIH in GenomeMatrix (data not shown). However, the additive effects of multiple risk alleles on PTC risk were significant (table 3). When the four SNPs were analysed as two-SNP pairs, the estimated ORs of risk allele carriers were increased when another risk allele was added to form different genotype combinations (table 4). For example, with the CC (rs944289)-GG (rs965513) genotype as the reference, the CT+TT (rs944289)-GA+AA (rs965513) genotype has the highest OR scores compared with either the CT+TT (rs944289)-GG (rs965513) or CC (rs944289)-GA+AA (rs965513) genotype. Increase in OR scores was caused by additive risk alleles.
Combined effects of rs965513, rs944289, rs2439302 and rs966423 on PTC risk
Since the additive effects of two-SNP pairs were identified and yet no SNP–SNP interactions were observed, the combined effects of the four SNPs on PTC risk were estimated by summing the individual accumulation of risk alleles as a variable for logistic regression analysis. As shown in table 5, the PTC risk increased with an increasing accumulative number of risk alleles after adjustment for age and gender (p=5.929e-13). The OR score of PTC for carriers of six risk alleles (1.4% of the Chinese population) was 23.587 compared with individuals who are non-risk homozygous (1.0% of the Chinese population, with zero risk alleles). No individuals were homozygous at all the four SNPs (carriers of eight risk alleles) and only three PTC cases were carriers of seven risk alleles (0.4% of PTC cases).
Bioinformatic functional analysis of SNP rs966423 on 2q35 and rs2439302 on 8p12
Functional analyses have not previously been carried out for the two most recently identified SNPs, rs966423 and rs2439302, located in the introns of DIRC3 and NRG1 genes, respectively. No micro-RNA binding sites were identified around the SNPs using the TargetScan miRNA Regulatory Sites System prediction method. To predict their function, we searched the locus information of the Encode Project.21 ,27 All 1296 TFBS Chip-seq experiments, involving 200 transcription factors, were downloaded from the UCSC database. rs966423 Was found to be located within the CTCF binding region in cell lines GM12873, GM12875 and human adrenocortical carcinoma (HAC), while rs2439302 was located within the CTCF binding region in cell lines GM12878, AG09319 and HCT116 (figure 1).
Since both rs966423 and rs2439302 located at CTCF binding region, further analysis of CTCF expression differences between PTC and normal thyroid tissues was done. We collected 544 thyroid samples from the the cancer genome atlas (TCGA) Project (http://cancergenome.nih.gov/), which include 486 primary thyroid cancer and 58 adjacent normal tissues. The expression level of 20 532 genes was detected with RNA-sequencing technology for these 544 samples, and normalised counts of the CTCF gene of each individual were extracted from the TCGA data (see online supplementary table S3). We calculated the p value with Bonferroni adjustment from two-tail t test. CTCF expression in primary thyroid cancer was significantly lower than that of adjacent tissues (p value=0.0048, with Bonferroni correction, online supplementary table 4).
This replication study was based on variants that were previously found to be associated with PTC risk in European populations using a GWAS approach. We found that rs965513 (9q22.33), rs944289 (14q13.3), rs966423 (2q35) and rs2439302 (8p12) were associated with increased PTC risk using a relatively large sample set from a single institution-based, homogeneous Chinese population. We also found that rs944289 on 14q13.3 was also associated with an increased BN risk. SNP–SNP pairwise analysis showed that additive effects of multiple risk alleles on the PTC risk were significant, although no interaction was found by the multi-method technique. The individual PTC risk was shown to increase with the rise in accumulative numbers of risk alleles, and the PTC OR for carriers of six alleles (1.4% of the Chinese population) was 23.587 compared with non-risk homozygous individuals (1.0% of the Chinese population). Further functional prediction by bioinformatic analysis showed that the recently identified SNPs rs966423 (2q35) and rs2439302 (8p12) were located within CTCF binding regions, which might change the transcriptional regulation of DIRC3 and NRG1, respectively. However, the association between rs6983267 at 8q24 and thyroid tumour risk was not replicated.
The 9q22.33 SNP rs965513 was first reported in a GWAS of PTC in a European population, and has since been replicated in several later studies including the present one.9 ,13 It has been suggested to tag a functional variation near the FOXE1 gene which contributes to an increased risk of developing thyroid cancer.11 In another candidate gene association study, SNPs in a LD block spanning the entire FOXE1 gene showed the strongest evidence of association with PTC susceptibility in a Spanish population. Association study and functional assessment proposed that rs1867277, an SNP within the FOXE1 promoter, is a causal variant of PTC.28 A later candidate gene approach found an association between FOXE1 variants and thyroid cancer risk in a Portuguese population.29 While the LD between rs965513 and rs1867277 is significantly weaker in populations of Asian ancestry (D′=0.014 and r2=0.00 based on results from the 1000 Genomes Project), fine mapping studies of 9q22.33, especially involving other FOXE1 variants, in non-European populations warrant further scrutiny.
The 14q13.3 SNPs rs944289 and rs116909374 are associated with increased PTC risk in European populations,9 ,13 although no T allele was found for rs116909374 in the Chinese population of the current study. The two SNPs are, however, only correlated to a very small degree (D′=0.35 and r2=0.005 based on results from 3693 Icelanders), suggesting that their effects on the PTC risk use different mechanisms.13 rs944289 Was previously suggested to tag the TTF-1/NKX2.1 gene,9 ,12 and to predispose individuals to PTC through a long intergenic non-coding RNA gene (PTCSC3) located 3.2 kb downstream with characteristics of a tumour suppressor.30 In another candidate gene association study, the germline missense mutation A339V in thyroid transcription factor-1 (TTF-1) contributed to a predisposition for multinodular goitre and/or PTC in a Chinese population.8 In the current study, LD analysis of the neighbouring region to rs944289 (255 SNPs from chr14:36605081–36664828 including rs944289 and PTCSC3) was conducted using 197 CHB and CHS singleton data from the 1000 Genome Project. Ten blocks were identified in this region, and the fourth block (from chr14:36,633,415–36,664,828) was shown to contain rs944289 and part of PTCSC3 (chr14:36,604,916–36,645,857). No LD was found for the A339 mutation and rs944289, suggesting that PTCSC3 is more likely to be linked to rs944289 than TTF-1.
This study is the first to replicate the association of rs966423 (2q35) and rs2439302 (8p12) with thyroid cancer risk in an independent sample set. rs966423 is located within an intron of DIRC3, a non-coding RNA that forms DIRC3-HSPBAP1 fusion transcripts and appeared to affect normal HSPBAP1 function within t(2;3) (q35;q21)-positive kidney cells in a case of familial renal cell cancer.31 rs2439302 is located within an intron of NRG1, a ligand for the ERBB protooncogene that mediates cell–cell interactions and growth of organ systems. NRG1 polymorphisms have been shown to be associated with schizophrenia, Alzheimer's disease and Hirschsprung's disease.32 ,33 Bioinformatic analyses of the present study showed that both rs966423 and rs2439302 are located in regions that bind CTCF, a highly conserved zinc-finger factor and DNA binding protein. Further bioinformative analysis showed that CTCF expression is lower in PTC than in normal thyroid tissues. CTCF is involved in a range of regulatory functions, including transcriptional activation and repression, enhancer blocking, gene insulation, gene silencing, genomic imprinting, and the global organisation of chromatin architecture.34 ,35 The changed binding of CTCF caused by the germline change in the gene of DIRC3 and NRG1, together with the decreased expression of CTCF itself, may alter the function of DIRC3 and NRG1 gene and play an important role in the thyroid carcinogenesis.
One of the main purposes of this study was to understand the contribution to PTC risk of SNP–SNP interactions between widely validated SNPs. These interactions may correspond to or reflect biological interactions such as synergy or antagonism. Recently, there has been increasing evidence regarding the joint effect of commonly occurring SNPs on cancer risk.36 Our current work shows that the additive effects of multiple risk alleles on PTC risk are significant. Estimations of ORs for particular genotype combinations and risk alleles accumulation reveal considerable increases in the risks associated with PTC, although no significant interactions of the four SNPs were identified by the multi-method based on four independent statistical models. These results suggest that the genes tagged by the four SNPs may independently increase PTC risk, although multi-risk alleles also appear to contribute an additive effect.
The research of Jones and colleagues in the UK population reported the comprehensive results of thyroid cancer susceptibility polymorphism and the better model of the multi-SNP analysis of thyroid cancer.11 They found a 10-fold difference in risk among the 1% of the UK population that had no alleles and the 1% of the population that had multiple alleles. We find that the OR score of PTC for carriers of six risk alleles (1.4% of the Chinese population) was 23.587 compared with individuals who are non-risk homozygous (1.0% of the Chinese population, with zero risk alleles). The difference between our works and the work of Jones et al was the Tag SNPs used and population. Jones et al used the SNP at 8q24, 9q22 and 14q13 for the combined effects analysis11 while we used the SNP at 9q22, 14q13, 2q35 and 8p12 for the multi-SNP analysis because the 8q24 was not confirmed in current study and the 2q35 and 8p12 were the latest identified chromosome regions by Gudmundsson et al.13 The additive effects of 2q35 and 8p12 increased the OR values in current study. On the other hand, the frequency and effect size of risk alleles between different population may also change the OR values.
Polymorphisms rs965513 (9q22.33), rs944289 (14q13.3), rs966423 (2q35) and rs2439302 (8p12) were associated with increased risks of PTC in a Chinese population, and rs944289 was also associated with an increased BN risk. The PTC risk of carriers of risk alleles was shown to increase with accumulating number of risk alleles. Functional prediction by bioinformatic analysis showed that rs966423 (2q35) and rs2439302 (8p12) are located within CTCF binding regions, which may change the transcriptional regulation of DIRC3 and NRG1, respectively. Functional studies of the SNPs warrant further research.
We are grateful to all of the individuals who participated in this study. We are also grateful to the TCGA for providing the CTCF expression data of thyroid cancer.
Y-LW, S-HF and S-CG contributed equally to this study.
Contributors Writing Suggestion: Y-LW, Q-HJ, J-CW. Conceived and designed the experiments: Y-LW, S-HF, S-CG, W-JW, D-SL, YW, XW, Z-YW, Y-YM, LJ, Q-HJ, J-CW Performed the experiments: Y-LW, S-HF, S-CG, W-JW, Y-YM. Contributed reagents/materials/analysis tools: Y-LW, D-SL, YW, XW, Z-YW, LJ, Q-HJ, J-CW. Wrote the paper: Y-LW, S-HF, S-CG, W-JW.
Funding This work was supported by the National Science Foundation of China (81001204 to Y-LW, 30872958 to Q-HJ and 81270120 to J-CW) and grants from the Ministry of Health (201002007 to LJ) and the Ministry of Science and Technology (2011BAI09B00 to LJ).
Competing interests None.
Patient consent Obtained.
Ethics approval Ethical Committee of the Cancer Hospital, Fudan University.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The data presented in the manuscript are available on request.
If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.