Abstract
Background Leucocyte telomere length (LTL), which is shaped by multiple genes, has been linked to a host of human diseases, including sporadic melanoma. A number of genes associated with LTL have already been identified through genome-wide association studies. The main aim of this study was to establish whether DCAF4 (DDB1 and CUL4-associated factor 4) is associated with LTL. In addition, using ingenuity pathway analysis (IPA), we examined whether LTL-associated genes in the general population might partially explain the inherently longer LTL in patients with sporadic melanoma, the risk of which increases with exposure to ultraviolet radiation (UVR).
Results Genome-wide association (GWA) meta-analysis and de novo genotyping of 20 022 individuals revealed a novel association (p=6.4×10−10) between LTL and rs2535913, which lies within DCAF4. Notably, eQTL analysis showed that rs2535913 is associated with lower DCAF4 expression in both lymphoblastoid cells and sun-exposed skin (p=4.1×10−3 and 2×10−3, respectively). Moreover, IPA revealed that LTL-associated genes, derived from GWA meta-analysis (N=9190), are over-represented among genes engaged in melanoma pathways. Across gene sets meeting increasingly stringent p value thresholds (p<0.05, <0.01, <0.005, <0.001) in the LTL-GWA meta-analysis, these genes were jointly over-represented for melanoma at p values ranging from 1.97×10−169 to 3.42×10−24.
Conclusions We uncovered a new locus associated with LTL in the general population. We also provide preliminary findings that suggest a genetic link between LTL, UVR and melanoma in the general population.
- Complex traits
- Telomere
- cancer: skin
- melanoma
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/
Introduction
As expressed in leucocytes, telomere length (TL) is a polygenic trait with heritability estimated at 65%.1 Genome-wide association (GWA) meta-analyses identified nine loci associated with leucocyte TL (LTL).2–4 Of these, at least six harbour genes (TERC, TERT, NAF1, OBFC1, CTC1 and RTEL1) directly related to telomere homeostasis. LTL, which reflects TL in other somatic cells,5 is associated with a host of diseases. Typically, LTL is short in patients with cardiovascular disease, principally atherosclerosis,6 and long in patients with lung adenoma,7 ,8 breast cancer,9 ,10 pancreatic cancer11 and sporadic melanoma.12 ,13 Moreover, highly penetrant germline mutations in the telomere maintenance genes POT1 and TERT have recently been reported in patients with familial melanoma.14–16
In our previous LTL meta-GWA, we found that a single-nucleotide polymorphism (SNP), rs2535913, which lies within the gene encoding the DDB1 and CUL4-associated factor 4 (DCAF4), had a barely suggestive association with LTL.4 In the present study, we present further analysis of rs2535913 through de novo genotyping and in silico look-up. As DDB1 and CUL4 are engaged in the response to ultraviolet radiation (UVR)17 ,18 and given that the risk for sporadic melanoma is increased with an inherently longer LTL12 ,13 and UVR,19–21 we also examined whether rs2535913 is associated with altered expression of DCAF4 in lymphoblastoid cells and sun-exposed skin and whether LTL-associated genes, in general, might also be associated with genes engaged in melanoma pathways.
Materials and methods
Cohorts
A detailed description of demographic characteristics of all cohorts included in this study can be found in online supplementary table S1. Additional details related to the discovery and the replication cohorts can be found elsewhere.2 ,4
In brief, for the discovery data, rs2535913 was extracted from the summary results of a large GWA consortium meta-analysis including six cohorts (the Framingham Heart Study, the Family Heart Study, the Cardiovascular Health Study, the Bogalusa Heart Study, the Hypertension Genetic Epidemiology Network Study and TwinsUK). All the cohorts adjusted for the same covariates (age, age², sex, smoking history).4 All the samples included in the meta-analysis were of European descent (evidence of non-European ancestry was assessed by principal component analysis comparison with HapMap in each cohort).
LTL measurement was performed by Southern blot analysis of the mean length (expressed in kilobases) of the terminal restriction fragments, generated by the restriction enzymes HinfI and RsaI after verification of DNA integrity.22 The de novo genotyping of rs2535913 was conducted on 3037 samples from different cohorts (Israeli Jews from the Jerusalem LRC Longitudinal Study23 and Palestinians from the Palestinian-Israeli Jerusalem Risk Factor Study,24 Frenchmen from the ADELAHYDE—Nancy study and ERA—France study,25 and Danes from a population sample of Danish twins).26–31 LTL for these individuals was measured by Southern blots, as in the discovery data set. In the second phase of replication, we performed an in silico look-up of results of LTL-GWAS based on 7795 individuals of European descent from four independent cohorts (British Heart Foundation Family Heart Study, Queensland Institute of Medical Research Brisbane Adolescent Twin Study, United Kingdom Blood Service and an independent sample set from TwinsUK).2 Mean LTL of these samples was measured using a qPCR-based technique2 and expressed as the ratio of telomere repeat length (T) to the copy number of a single-copy gene (S). A calibrator sample or a standard curve was used to standardise T/S results across plates. LTLs in each cohort were standardised using a Z-score transformation. All the cohorts were also adjusted for age and sex in the main analysis.
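The Z-score step is what places kilobase measurements and T/S ratios on a comparable scale. The following is a minimal sketch of such a per-cohort standardisation; the input values are hypothetical, and the use of the sample standard deviation (n−1 denominator) is an assumption, as the text does not specify it.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardise a cohort's measurements to mean 0 and unit SD."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator); an assumption
    if s == 0:
        raise ValueError("cannot standardise a constant series")
    return [(v - m) / s for v in values]


# Hypothetical measurements, for illustration only.
southern_kb = [6.2, 6.8, 7.1, 7.4, 6.5]    # Southern blot, kilobases
qpcr_ts = [0.82, 1.05, 1.10, 1.31, 0.92]   # qPCR, T/S ratio

z_southern = zscore(southern_kb)
z_qpcr = zscore(qpcr_ts)
```

After this transformation, an effect size is read in standard deviation units within each cohort, whichever assay produced the raw values.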
LTL meta-analysis
The meta-analysis for the discovery stage was carried out previously using METAL.32 We used GWAMA (V.2.1)33 to perform the meta-analyses of all the cohorts in which LTL was measured (either by Southern blots or qPCR). To this end, we used the inverse variance weighted method to combine the cohort-specific β-estimates. Because the meta-analysis of the entire data set included samples in which LTL was assessed using two different methods, we used the random-effect inverse variance method implemented in GWAMA. In addition, to test for the presence and measure the amount of between-study heterogeneity, we used two different metrics: Cochran's Q statistic34 and I2.35
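As an illustration of the quantities named above, the sketch below pools cohort-specific β-estimates by inverse variance weighting and derives Cochran's Q and I2. It is not GWAMA's code: the random-effects step assumes the DerSimonian-Laird estimator of between-study variance, and the function names and input values are hypothetical.

```python
import math


def meta_analyse(betas, ses):
    """Inverse-variance pooling with fixed and random-effects estimates."""
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need matching beta/SE lists from at least two cohorts")

    # Fixed effect: each cohort is weighted by 1 / SE^2.
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    beta_fixed = sum(w * b for w, b in zip(weights, betas)) / sum_w
    se_fixed = math.sqrt(1.0 / sum_w)

    # Cochran's Q and I^2 (percentage of variation due to heterogeneity).
    q = sum(w * (b - beta_fixed) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance (DerSimonian-Laird; an assumption here).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects: tau^2 is added to each cohort's variance.
    weights_re = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w_re = sum(weights_re)
    beta_random = sum(w * b for w, b in zip(weights_re, betas)) / sum_w_re
    se_random = math.sqrt(1.0 / sum_w_re)

    return {
        "beta_fixed": beta_fixed, "se_fixed": se_fixed,
        "beta_random": beta_random, "se_random": se_random,
        "q": q, "df": df, "i2": i2, "tau2": tau2,
    }


def two_sided_p(beta, se):
    """Two-sided p value from a normal approximation to beta / SE."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


# Hypothetical cohort-level estimates, for illustration only.
result = meta_analyse([-0.052, -0.034, -0.045], [0.012, 0.018, 0.017])
p_random = two_sided_p(result["beta_random"], result["se_random"])
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is zero and the random-effects estimate coincides with the fixed-effect one.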
Expression analysis
We used the genome-wide expression data from the lymphoblastoid cell lines (LCLs) from the Multiple Tissue Human Expression Resource (MuTHER).36 The expression values were derived from a subset of twins from TwinsUK, which were also included in the association analysis. The analysis was performed on rs2535913 and the expression levels of DCAF4 using MERLIN,37 taking into account the family structure. For this analysis, the significance was defined as p<0.05, as only one independent test was performed.
To validate our results in a different data set and to perform tissue-specific (sun-exposed skin from lower leg) expression analysis, we used data from the Genotype-Tissue Expression (GTEx) project online portal (http://www.gtexportal.org).38
Ingenuity pathway analysis
The meta-GWAS data set included in the ingenuity pathway analysis (IPA) Core analysis consisted of 99 773 SNPs that met quality control and had a p value <0.05.
All SNPs mapping to the coding region of a gene or within 2 kb upstream/0.5 kb downstream of it were annotated. Using these criteria, IPA successfully mapped 42 395 of the 99 773 initial SNPs to genes. We compiled and analysed four subsets of genes meeting increasingly stringent p value thresholds in the meta-GWA of LTL (p<0.05: 7362 genes; p<0.01: 2846 genes; p<0.005: 1771 genes; and p<0.001: 526 genes).
We then compared the LTL-associated genes included in the generated subsets with the genes reported by the IPA Global Canonical Pathway (database accessed in September 2014; http://www.ingenuity.com) for melanoma. We finally generated a p value using a 2×2 Fisher's exact test comparing the disease vs non-disease status in the LTL-associated genes and in the reference data set (see online supplementary figure S1). All p values were corrected for multiple testing using the Benjamini–Hochberg method.39
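The annotation rule can be pictured as a strand-aware window around each gene. The sketch below is an illustration and not IPA's implementation: it treats the whole annotated gene span (rather than the coding region alone) as the core interval, and the gene names and coordinates are hypothetical.

```python
from dataclasses import dataclass

UPSTREAM_BP = 2000    # 2 kb upstream, as stated in the text
DOWNSTREAM_BP = 500   # 0.5 kb downstream, as stated in the text


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # lower genomic coordinate, inclusive
    end: int     # higher genomic coordinate, inclusive
    strand: str  # "+" or "-"


def annotation_window(gene):
    """Return the (low, high) interval in which a SNP is assigned to the gene."""
    if gene.strand == "+":
        return gene.start - UPSTREAM_BP, gene.end + DOWNSTREAM_BP
    if gene.strand == "-":
        # On the reverse strand, upstream lies at higher coordinates.
        return gene.start - DOWNSTREAM_BP, gene.end + UPSTREAM_BP
    raise ValueError(f"unknown strand: {gene.strand!r}")


def map_snp(chrom, pos, genes):
    """List the names of all genes whose window contains the SNP."""
    hits = []
    for gene in genes:
        low, high = annotation_window(gene)
        if gene.chrom == chrom and low <= pos <= high:
            hits.append(gene.name)
    return hits


# Hypothetical genes and coordinates, for illustration only.
genes = [
    Gene("GENE_A", "14", 10_000, 20_000, "+"),
    Gene("GENE_B", "14", 30_000, 40_000, "-"),
]
```

A SNP falling outside every window stays unmapped, which is consistent with only a fraction of the initial SNPs being assigned to genes.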
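The two statistical steps can be sketched as follows. This is an illustration under stated assumptions, not IPA's procedure: the test is taken to be one-sided (over-representation), which the text does not specify, and the function names and counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact p value for over-representation.

    a: selected genes in the pathway      b: selected genes not in the pathway
    c: other reference genes in pathway   d: other reference genes not in pathway
    """
    n_total = a + b + c + d
    n_pathway = a + c
    n_selected = a + b
    upper = min(n_pathway, n_selected)
    # Hypergeometric tail: probability of observing a or more pathway genes.
    numerator = sum(
        comb(n_pathway, x) * comb(n_total - n_pathway, n_selected - x)
        for x in range(a, upper + 1)
    )
    return numerator / comb(n_total, n_selected)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts for one pathway, for illustration only.
p_pathway = fisher_enrichment_p(a=12, b=88, c=38, d=1862)
adjusted = benjamini_hochberg([p_pathway, 0.04, 0.2])
```

Exact integer arithmetic is used for the binomial coefficients so that very small tail probabilities are not lost to rounding before the final division.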
Results and discussion
Meta-GWAS findings in European descendants have already indicated significant associations of LTL with SNPs mapped to loci of TERC, OBFC1 and CTC1—key genes engaged in TL regulation.2–4 Of the nine SNPs that displayed suggestive associations with LTL (5×10−8<p<5×10−7) in our previous meta-GWAS in 9190 European descendants,4 seven were mapped to these three gene loci, while the remaining two (rs2535913, p=2×10−7; rs2806040, p=2.61×10−7) lie within DCAF4. Because both DCAF4 variants were in perfect linkage disequilibrium (r2=1) (see online supplementary table S2), we focused our attention on rs2535913 and further examined its association with LTL in eight additional cohorts.
We first de novo genotyped 3037 samples from four independent cohorts in which LTL was measured by Southern blots. We found a borderline significant association (possibly due to the small sample size) of the rs2535913 minor allele (A) with short LTL (β=−0.0343; p=6.13×10−2; table 1).
The overall meta-analysis of the discovery and de novo genotyped cohorts showed a genome-wide significant p value of 2.31×10−8. We then performed an in silico look-up of results of LTL-GWAS based on 7795 individuals from four independent cohorts in which LTL was measured by qPCR.2 Results showed an association of the rs2535913 minor allele with shorter LTL (β=−0.0451; p=7.8×10−3) (table 1). The combined meta-analysis, based on 20 022 individuals from the 14 independent populations (6 in the discovery data set, 4 de novo genotyped and 4 in silico look-up populations), showed a significant association (β=−0.0493; p=6.4×10−10) of rs2535913 with LTL.
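The sample and cohort totals can be cross-checked against the per-stage figures reported in the text. The short sketch below tallies them; the dictionary layout and labels are illustrative only.

```python
# Per-stage figures as reported in the text.
stages = {
    "discovery (Southern blot)": {"cohorts": 6, "n": 9190},
    "de novo genotyping (Southern blot)": {"cohorts": 4, "n": 3037},
    "in silico look-up (qPCR)": {"cohorts": 4, "n": 7795},
}

total_n = sum(stage["n"] for stage in stages.values())
total_cohorts = sum(stage["cohorts"] for stage in stages.values())

print(total_n, total_cohorts)  # 20022 14
```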
Due to the different characteristics of the studies included in the meta-analysis, we tested the between-study heterogeneity using two different metrics (Cochran's Q statistic and I2). Neither metric detected between-study heterogeneity (I2=0%; Cochran’s Q p=0.46; table 1 and online supplementary figure S2).
Previous studies,2–4 as well as the present research, showcase the large samples required to decipher the genetics of LTL in the general population. Jointly, the few genes, including DCAF4, that have been found thus far to be associated at a genome-wide significance level with LTL explain <5% of the inter-individual variation in LTL.2
In order to identify potential causal alleles in the coding sequence, we looked for variants in tight linkage disequilibrium with rs2535913 (LD; r2>0.9 in 1000 Genomes Project European samples). We identified 15 SNPs (see online supplementary table S2) of which only one (rs2806034) was in the coding region causing a synonymous change (serine to serine) in all the different DCAF4 transcripts. We also performed a conditional analysis, including rs2535913 as a covariate, to identify potential independent secondary signals at this locus. This analysis did not find any significant evidence for an independent signal (see online supplementary figure S3).
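For readers less familiar with the r2 measure used as the threshold here, the sketch below computes it for two biallelic SNPs from allele and haplotype frequencies. The function name and the frequencies are hypothetical; in practice these values would be estimated from phased reference data such as the 1000 Genomes Project.

```python
def ld_r2(p_a, p_b, p_ab):
    """r^2 between two biallelic SNPs.

    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2
    p_ab:     frequency of the haplotype carrying both A and B
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies, for illustration only.
r2 = ld_r2(p_a=0.30, p_b=0.31, p_ab=0.295)
in_tight_ld = r2 > 0.9  # threshold used in the text
```

An r2 of 1 means that the two alleles always occur together, so either SNP is a perfect proxy for the other in an association test.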
Notably, rs2535913 is located within binding motifs for the chromatin organising factor CTCF and Rad21 (see online supplementary table S2).40 Rad21 is a component of the cohesin complex involved in DNA repair, apoptosis and chromosome cohesion during the cell cycle. CTCF and cohesin are both integral components of most human subtelomeric regions and have been implicated in telomere maintenance.41
To explore the potential functional impact of this intronic SNP on DCAF4, we used genome-wide expression data from MuTHER36 (http://www.muther.ac.uk/) based on 778 unselected twins of European descent. We first focused our analysis on LCLs. We found that the minor allele (A) of rs2535913 was associated with lower expression of DCAF4 (β=−0.039, p=4.1×10−3) (figure 1A). To validate our results in an independent data set, we used data from the GTEx project online portal (http://www.gtexportal.org).38 We found a significant association (p=3×10−2) between rs2535913 and DCAF4 expression levels measured in whole blood of 167 individuals (figure 1B).
DCAF4 interacts with DDB1 and CUL4.18 This interaction suggests that DCAF4 may be involved in UVR response since DDB1 and DDB2 serve as key detectors of UVR-induced DNA damage and transcription-coupled repair pathways,17 ,18 while cullins are engaged in ubiquitin-dependent protein degradation.42 We, therefore, examined the rs2535913-DCAF4 expression association specifically in sun-exposed skin (lower leg) tissue included in the GTEx database. Notably, despite the small sample size (n=113), we observed a significant association (p=2×10−3) between the rs2535913 minor allele and lower DCAF4 expression levels in sun-exposed skin (figure 1C). Thus, the finding of the LTL-DCAF4 link might be important not only because it expands the repertoire of common SNPs associated with LTL at a genome-wide significance level, but also because it may provide a possible link between TL, as expressed in leucocytes, and UVR.
DDB1 modulates the transcription factor E2F1, which, in turn, regulates cell proliferation and telomerase.43–45 Thus, cell replication and telomere dynamics might be linked to pathways engaged in UVR-induced DNA damage repair.46 In this context, the association of LTL with DCAF4 might be mediated by telomerase, perhaps via E2F1.43 ,44 ,45 That is because mutations indicative of UVR damage in the promoter region of TERT are common in melanoma tumours (but very rare in the germline) and apparently generate consensus DNA binding sites, which are targets not only of ETS transcription factors47 that direct cytoplasmic signals to control gene expression but also E2F1.48
We have thus established that DCAF4 is an LTL-associated gene and that the rs2535913 minor allele is associated with decreased DCAF4 expression in blood and sun-exposed skin, which may suggest DCAF4 involvement in UVR response. Given that LTL is inherently long in patients with melanoma12 ,13 and the increased risk for this cancer with UVR exposure,19 ,20 ,21 we sought further links between LTL-associated genes and melanoma in the general population by testing a polygenic model using the results of our large LTL meta-GWA.4 This model does not require studying patients with melanoma and is based on the following premise: <10% of melanoma cases are familial.49 However, based on research in twins, the heritability of sporadic melanoma is approximately 55%.50 It follows that while familial melanoma is caused by highly penetrant and rare germline mutations, sporadic melanoma might result from the additive effect of common genetic variants in the general population, each of which causally contributes a low risk for the disease.51 A proof of concept for this premise comes from a recent study in 11 108 melanoma patients and 13 933 controls.52 The study developed a genetic risk score for melanoma based on seven LTL-associated genes, which had been derived from LTL-GWA studies in the general population.2–4 Similarly, our polygenic approach has been to take advantage of IPA Core analysis to decipher connections of LTL-associated genes in the general population with genes that had been reported to be engaged in melanoma pathways. To this end, we analysed sets of LTL-associated SNPs that met increasingly stringent p value thresholds in our LTL meta-GWA, as described under 'Materials and methods' (see online supplementary table S3). We then compared the number of LTL-associated genes identified by IPA with the total number of genes related to melanoma in the IPA reference set (see online supplementary figure S1).
We found that genes included in the melanoma pathway were over-represented with p values ranging from 1.97×10−169 (LTL-associated SNPs p value <0.05) to 3.42×10−24 (LTL-associated SNPs p value <0.001) in each LTL-associated gene subset (table 2 and see online supplementary figure S1).
LTL-associated genes were also enriched for genes included in pathways of different types of cancers, although at a much lower level of significance than that found for melanoma pathways. We observed, for example, that genes included in the colorectal cancer pathway were over-represented among LTL-associated genes with p values ranging from 8.57×10−62 (LTL-associated SNPs p value <0.05) to 1.28×10−6 (LTL-associated SNPs p value <0.001).
While findings of a longer LTL in sporadic melanoma are fairly consistent across studies, until recently, no consistency had emerged from studies of the relationship between LTL and other cancers.53 ,54 This might be because studies that examined the association of LTL with cancer largely used leucocyte DNA from patients who had already been subjected to chemotherapy, irradiation or both, which probably impacted haematopoiesis and consequently LTL.53 ,54 Moreover, sample sizes of the majority of these studies were often too small, which limited the ability to detect significant effects, particularly when qPCR was used to measure telomere DNA content (due to the large measurement error of this method).55 That being said, recent large-scale prospective studies, which include pooled data, now show that inherently long LTL is associated with lung adenoma, as well as cancers of the breast and pancreas.7–11 Although hardly applying to all cancers, these findings suggest that an inherently longer LTL might not be unique to patients with melanoma. Cancer is not a single disease, and its causes are multifactorial. Thus, the potential role of telomere biology in carcinogenesis must be contextualised with specific circumstances that depend on the type of cancer, its anatomical location, the age and sex of the individual and his/her overall genetic makeup with respect to a host of heritable and environmental risks.
In conclusion, the core findings of this work are (a) DCAF4 is a novel gene that contributes to LTL variation in humans and (b) its expression levels are altered in blood and sun-exposed skin; the latter may suggest a potential role in UVR response. We also provide preliminary evidence that genes associated with LTL are enriched among genes engaged in melanoma pathways in the general population. Our model might be useful in testing the role of LTL genetics in other human cancers.
References
Footnotes
Correction notice The license of this article has changed since publication to CC BY 4.0.
Contributors AA, MM, RS, SCH and DTAE conceived and designed the experiments. MM, LC, RS, SCH, DTAE, MK, IP, UH, VC, DRN, SJH and KH analysed and interpreted the data. JDK, APR, AB, RS, KC, HN, DL, ALF, WC, GSB, NJS, NGM, ST, NJS, KOK and CD contributed samples/materials/analysis tools. MM, DTAE, RS and AA wrote the paper. MM, DTAE, RS, VB, TDS and AA reviewed/edited the manuscript. All authors approved the final manuscript.
Funding TwinsUK. The study was funded by the Wellcome Trust; European Community's Seventh Framework Programme (FP7/2007-2013). The study also receives support from the National Institute for Health Research (NIHR) BioResource Clinical Research Facility and Biomedical Research Centre based at Guy's and St Thomas’ NHS Foundation Trust and King’s College London. TDS is holder of an ERC Advanced Principal Investigator award. SNP Genotyping was performed by The Wellcome Trust Sanger Institute and National Eye Institute via NIH/CIDR. The Bogalusa Heart Study. This study was supported by grants 5R01ES021724 from National Institute of Environmental Health Science, and 2R01AG016592 from the National Institute on Aging. The Framingham Heart Study. Supported by NIH contract N01-HC-25195. This project was supported in part by intramural funding from the National Heart, Lung, and Blood Institute and the Center for Population Studies of the NHLBI. CHS. Cardiovascular Health Study: This CHS research was supported by NHLBI contracts HHSN268201200036C, HHSN268200800007C, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086; and NHLBI grants HL080295, HL087652, HL105756 with additional contribution from the National Institute of Neurological Disorders and Stroke (NINDS). Additional support was provided through AG023629 from the National Institute on Aging (NIA). A full list of CHS investigators and institutions can be found at http://chs-nhlbi.org/. The Center of Human Development and Aging. AA grant support: NIH: Human Telomere Genetics R01AG20132; Telomeres & Vascular Aging, R01AG21593; Leukocyte Telomere Dynamics, Gender, Menopause, Insulin Resistance, R01AG030678. BHF-FHS. The British Heart Foundation Family Heart Study (BHF-FHS) was funded by the British Heart Foundation (BHF). Genotyping of the BHF-FHS study was undertaken as part of the WTCCC and funded by the Wellcome Trust. NJS holds a chair funded by the BHF and VC is funded by the BHF. 
VC and NJS are supported by the NIHR Leicester Cardiovascular Biomedical Research Unit and NJS holds an NIHR Senior Investigator award. The Jerusalem LRC Longitudinal Study. The study was funded by the US-Israel Binational Science Foundation and the Israel Science Foundation. Palestinian-Israeli Jerusalem Risk Factor Study. The study was funded by the US-AID Program for Middle Eastern Regional Cooperation. ADELAHYDE—Nancy study and ERA—France study. The study received support from the French Fondation pour la Recherche Médicale (FRM DCV2007-0409250) and the Plan Pluriformation (PPF815 PSVT-2005). Special thanks to Ms Cynthia Thiriot (INSERM U961, Nancy, France) for her contribution to the genotyping of the French cohorts. Hypertension Genetic Epidemiology Network Study (HyperGEN). HyperGEN was supported by cooperative agreements HL54471, HL54472, HL54473, HL54495, HL54496, HL54509, HL54515 and grant HL055673. HyperGEN investigators and institutions can be found at http://www.biostat.wustl.edu/hypergen/hypergen.shtml. Queensland Institute of Medical Research (QIMR). We thank Marlene Grace, Ann Eldridge, and Kerrie McAloney for sample collection; Megan Campbell, Lisa Bowdler, Sara Smith, Steven Crooks, and staff of the Molecular Epidemiology Laboratory for sample processing and preparation; Harry Beeby, David Smyth, and Daniel Park for IT/database support; Scott Gordon for his substantial efforts involving the QC and preparation of the GWA data; and the twins and their families for their participation. We acknowledge support from the Australian Research Council (A7960034, A79906588, A79801419, DP0212016, and DP0343921). Telomere length assessment was co-funded by the European Community's Seventh Framework Programme (FP7/2007-2013), ENGAGE project, grant agreement HEALTH-F4-2007-201413 and National Health and Medical Research Council (NHMRC)-European Union Collaborative Research Grant 496739.
GWM was supported by an NHMRC Fellowship (619667), DRN (FT0991022) and SEM (FT110100548) were supported by an ARC Future Fellowship. Genotyping was funded by the NHMRC (Medical Bioinformatics Genomics Proteomics Program, 389891). Genotype imputation was carried out on the Genetic Cluster Computer (http://www.geneticcluster.org), which is financially supported by the Netherlands Scientific Organisation (NWO 480-05-003). UKBS. The UK Blood Services collection of Common Controls (UKBS-CC collection) is supported by the Wellcome Trust grant 076113/C/04/Z and the collection was established as part of the Wellcome Trust Case Control Consortium. WHO is supported by a NIHR programme grant to NHSBT (RP-PG-0310-1002) and is a NIHR Senior Investigator.
Competing interests None.
Patient consent All participants provided written informed consent, including permission for genetic analysis and the use of DNA for the measurement of LTL.
Ethics approval All studies involved in this research received institutional review board approvals.
Provenance and peer review Not commissioned; externally peer reviewed.