- COPD, chronic obstructive pulmonary disease
- FVC, forced vital capacity
- FEV1, forced expiratory volume in 1 second
- RFLP, restriction fragment length polymorphism
- SNPs, single nucleotide polymorphisms
- SSCP, single strand conformational polymorphism
Chronic obstructive pulmonary disease (COPD), including chronic bronchitis and emphysema, is a major cause of morbidity and mortality in many countries. The most important risk factor for COPD is cigarette smoking, and nearly 90% of COPD patients are smokers.1 However, only 15% of cigarette smokers develop clinically significant COPD.2 COPD is known to aggregate in families,3 and twin studies have shown that obstructive airway disease correlates directly with genetic similarities.4 In contrast, there are differences in the prevalence of COPD between different ethnic groups.5 All these results support the notion that genetic factors are involved in the pathogenesis of COPD. Recent segregation analysis has demonstrated that the genetic components determining forced expiratory volume in 1 second (FEV1) consist of multiple genes, each of which has a small influence, rather than a single Mendelian gene.6 An intensive search has been ongoing to find the genetic factors responsible for COPD development. Until now, more than 20 polymorphisms of candidate genes have been reported to have an association with COPD. They include genes for proteinases,7 anti-proteinases,8–10 anti-oxidants,11 xenobiotic metabolising enzymes,12,13 and inflammatory mediators.14,15 For most of these loci, however, there have been contradictory results.16 Although genetic heterogeneity among different ethnic groups under different environmental conditions could explain this inconsistency, it is still necessary to confirm the associations of the polymorphisms in different populations.
Recently the human CLCA1 gene and its murine counterpart, mCLCA3, have been isolated.17,18 To date, the human CLCA family consists of four homologous genes (hCLCA1, hCLCA2, hCLCA3, and hCLCA4), all clustered on the short arm of chromosome 1 (1p 22–31).19 The region around 120 cM on chromosome 1 has been identified to show moderate linkage to FEV1/forced vital capacity (FVC) in two independent genome scans for COPD.20,21 In addition, the expression of hCLCA1 and mCLCA3 is strongly induced in the airway epithelium, especially in goblet cells, under asthmatic conditions, while they are not expressed in the normal lung.22–25 Furthermore, in vitro transfection studies have demonstrated that both genes play a direct role in mucous production.23,25 Mucous hypersecretion is a pathophysiological feature of both COPD and bronchial asthma. Therefore, we hypothesised that polymorphisms of the CLCA1 gene would be related to the susceptibility to COPD.
We conducted a case–control study to assess the association of CLCA1 gene polymorphisms with COPD in Japanese and Egyptian populations.
The investigation is a case–control association study in two different ethnic groups, Japanese and Egyptian. The basic inclusion criterion was chronic heavy smoking. Clinically based ascertainment of subjects was used, based on past history, physical examination, and spirometric data (FEV1/FVC ratio of <70%, according to the Global Initiative for Chronic Obstructive Lung Disease criteria26). The predicted values of pulmonary function were determined according to each subject’s data (gender, age, weight, height, and ethnicity), using a ChestGraph Jr HI-101 spirometer (Chest, Tokyo, Japan) for Japanese subjects and a MultiSpiro-SX Spirometer (MultiSpiro Inc, CA, USA) for Egyptian subjects.27 Controls were age matched healthy smokers. For both groups, those with borderline pulmonary function, with low Brinkman’s index (the number of cigarettes/day × the number of years), or with other significant respiratory diseases such as bronchial asthma, bronchiectasis, and pulmonary tuberculosis were excluded. Furthermore, ethnic and geographic matching was considered to eliminate possible effect of population stratification. The Japanese subjects (88 COPD patients and 40 controls) were recruited from Tsukuba University Hospital. Most of the Japanese subjects were the same as those in a previous study.10 The Egyptian subjects (106 COPD patients and 72 controls) were recruited from the Department of Chest Diseases and Tuberculosis at Cairo University Hospital and affiliated hospitals. Written informed consent was obtained from all the subjects, and the study was approved by the ethics committees of the hospitals involved.
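The two numerical criteria named above (Brinkman's index and an FEV1/FVC ratio below 70%) can be written out as a minimal sketch. This is not the authors' procedure: the function names are hypothetical, the example subject is invented, and no cut-off for a "low" Brinkman's index is assumed because the text does not state one.

```python
def brinkman_index(cigarettes_per_day: float, years_smoked: float) -> float:
    """Brinkman's index as defined in the text: cigarettes/day x years."""
    return cigarettes_per_day * years_smoked


def meets_gold_obstruction(fev1_litres: float, fvc_litres: float) -> bool:
    """True when FEV1/FVC is below 70%, the spirometric criterion cited."""
    if fvc_litres <= 0:
        raise ValueError("FVC must be positive")
    return fev1_litres / fvc_litres < 0.70


# Hypothetical subject: 20 cigarettes/day for 40 years, FEV1 1.5 L, FVC 3.0 L
print(brinkman_index(20, 40))
print(meets_gold_obstruction(1.5, 3.0))
```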
CLCA1 has been shown to regulate airway mucous production in inflammatory conditions. We hypothesised that polymorphisms of the CLCA1 gene could have an important role in the pathogenesis of chronic obstructive pulmonary disease (COPD). We sought to identify the CLCA1 gene polymorphisms and performed a case–control association study to evaluate the involvement of these variants in COPD in Japanese and Egyptian populations.
We identified 22 novel single nucleotide polymorphisms. There was a significant difference in +5080 T/C genotypes between the patient and the control groups in the Egyptian population (p = 0.024). In the Japanese population, the distribution of +13924 T/A allele frequencies was significantly different between the patient and the control groups (p = 0.042). However, because multiple comparisons were made, these associations may represent type 1 error.
In the Japanese population, the frequency of the haplotype +126 T: +13924 T: +25133 C: +31384 C was significantly higher in the COPD patients than in the controls (pcorr = 0.0002). In contrast, the frequencies of the haplotype +126 T: +13924 A: +25133 C: +31384 C and the haplotype +126 G: +13924 T: +25133 C: +31384 T were significantly higher in the controls than in the COPD patients (pcorr = 0.0017 and 0.0001, respectively) in the same population.
These polymorphisms and haplotypes may be involved in the pathogenesis of COPD and may be useful for predicting the susceptibility to COPD.
Screening the CLCA1 gene for polymorphisms
The structure of CLCA1 has been reported previously (Genbank accession no. AF039401).17 We studied the fifteen exons with intronic junctions and the promoter region, 1617 bp upstream from the transcription start site. Twenty three pairs of primers were synthesised to make PCR DNA fragments covering these regions (table 1). The nucleotide positions in this study are given relative to the transcription start site. Genomic DNA was extracted from whole blood using a Qiagen DNA blood kit (Qiagen, Hilden, Germany). For 20 patients and 20 controls in each ethnic group, we screened all the PCR fragments for polymorphisms by single strand conformational polymorphism (SSCP) as previously described.10 Following this, for samples showing variant band patterns, DNA sequencing was performed to locate the polymorphisms and design a method of genotyping them in the whole populations. Even when none of the screened samples had shown variant band pattern in SSCP analysis, we still performed DNA sequencing for eight samples randomly chosen from each ethnic group, to reduce the possibility of missing a common polymorphism that could not be detected by SSCP and to confirm the sequence of that portion by comparing it with the reference sequence.
It has been reported that SSCP has an average sensitivity of only 85–95% in detecting polymorphisms.28 Therefore, when a polymorphism was detected in the screened subjects, we also confirmed its incidence in those subjects by using restriction fragment length polymorphism (RFLP), TaqMan allelic discrimination,29 or direct sequencing. As rare polymorphisms are less likely to play a role in disease pathogenesis and need a very large number of subjects to show enough power, we genotyped all the subjects only for polymorphisms showing allele frequencies of >10% in the initial screening step.
Of the 22 SNPs found in this study, 5 were determined by RFLP, 8 by TaqMan allelic discrimination and 9 by direct sequencing (fig 1). The restriction enzymes used for RFLP were EarI for −1321 T/A, AciI for +31384 T/C (both New England Biolabs, Beverly, MA, USA), MboI for −489 C/T, MvaI for +126 G/T (both Takara, Shiga, Japan), and HaeIII for −258 C/T (Toyobo, Osaka, Japan). The same primers used for PCR-SSCP analysis were used to amplify the corresponding fragments for RFLP. The digested DNA fragments were resolved using 3% agarose gel containing ethidium bromide. The SNPs found at −1332 C/T, −499 A/T, −77 G/A, +5080 T/C, +13924 T/A, +20730 A/G, +20758 C/T, and +25133 C/T were analysed using TaqMan allelic discrimination. For each SNP, a pair of primers flanking the SNP and a pair of oligonucleotide probes, one homologous to the wild type labelled with the TaqMan FAM probe, and another homologous to the mutant type labelled with the VIC probe, were designed and synthesised by Applied Biosystems (Foster City, CA, USA) (table 2). The PCR was carried out on 20 ng of genomic DNA in a 25 μl reaction containing 50 to 900 nmol/l of each forward and reverse primer, 50 to 200 nmol/l of each FAM and VIC probe, and 1× TaqMan Universal PCR Master Mix (Applied Biosystems). PCR cycling conditions in the ABI PRISM 7000 (Applied Biosystems) were as follows: 50°C for 2 minutes and 95°C for 10 minutes, followed by 40 cycles of 95°C for 15 seconds and 60°C for 1 minute. The allelic discrimination was determined by the fluorescence intensity of FAM and VIC.
Two sided Student’s t test was used for checking significant differences in clinical data between the COPD patients and the control subjects in each ethnic group, with significance set at p<0.01. Hardy-Weinberg equilibrium was assessed using a goodness of fit χ2 test for biallelic markers. The distribution of genotype and allele frequency was analysed with Fisher’s exact test (2×3 and 2×2 tables), with significance set at p<0.05. Correction for multiple comparisons was performed by the Bonferroni method. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated to quantitatively assess the degree of association observed. The haplotype frequencies were estimated for the patients and the controls separately by the expectation–maximisation method using the SNPAlyze program (Dynacom, Mobara, Japan). Linkage disequilibrium between each pair of SNP loci was analysed using all the subjects including the patients and the controls with a likelihood ratio test. Lewontin’s disequilibrium coefficient D′30 was estimated from the haplotype frequencies. The distribution of haplotype frequency between the patient and control groups was analysed with Fisher’s exact test.
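As an illustration of the Hardy-Weinberg goodness of fit test mentioned above, the following is a minimal sketch for a single biallelic marker. It is not the software used in the study; the function name and the example genotype counts are hypothetical, and the p value uses the closed form for a χ2 distribution with 1 degree of freedom.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Goodness-of-fit chi-square (1 df) for a biallelic marker.

    Arguments are observed genotype counts (homozygote A, heterozygote,
    homozygote B). Returns (chi-square statistic, p value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic marker).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts that sit exactly at equilibrium
print(hwe_chi_square(25, 50, 25))
```

A marker would be flagged as deviating from equilibrium when the returned p value falls below the chosen threshold.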
Table 3 shows the characteristics of the subjects recruited in this study. All the subjects were heavy smokers. Age and Brinkman’s index were not significantly different between patients and controls. All the patients in both ethnic groups had moderate to severe COPD.
Twenty two polymorphisms were identified as SNPs by PCR-SSCP analysis and DNA sequencing. All were novel; we registered them in the dbSNP database of the NCBI under accession numbers ss5607363 to ss5607365, ss6313704 to ss6313719, and ss7986610 to ss7986635. Eight of these 22 SNPs were found to be mutant homozygous in all the subjects examined in both Japanese and Egyptian populations compared with the reference sequence. They are: −1492 T/C (promoter), +582 G/A (exon 2), +5513 A/T (intron 4), +8082 A/G (exon 5), +27216 G/A (exon 13), +30389 T/G (exon 14), +30472 −/C (intron 14), and +31579 C/T (exon 15). Two, +8082 A/G and +30389 T/G, are non-synonymous nucleotide substitutions. Accordingly, it seems likely that at least in Japanese and Egyptian populations the reference amino acid translation of the gene protein should be changed from lysine to arginine at amino acid number 152 and from asparagine to lysine at number 760. The SNP +582 G/A in exon 2 is located in the 5′ untranslated region. The SNPs +27216 G/A in exon 13 and +31579 C/T in exon 15 were coding but synonymous variants. Therefore, these SNPs have no influence on the reference amino acid sequence.
The remaining 14 SNPs are shown in fig 1. Six were identified in the promoter: −1332 C/T, −1321 T/A, −499 A/T, −489 C/T, −258 C/T, and −77 G/A. Six were in the exons: +5080 T/C (exon 3; non-synonymous, Phe/Leu), +13924 T/A (exon 6; synonymous), +20730 A/G (exon 9; non-synonymous, Lys/Arg), +20758 C/T (exon 9; synonymous), +25133 C/T (exon 11; non-synonymous, Thr/Met) and +31384 T/C (exon 15; synonymous). Two were in the introns: +126 G/T (intron 1) and +5340 A/G (intron 3). Only four SNPs were found to have >10% incidence in each population during the initial screening step, therefore they were genotyped for the whole population (table 4). The genotypic frequencies for the control populations were consistent with Hardy-Weinberg equilibrium. Of these four SNPs, +5080 T/C in exon 3 tended to show a significant difference between the patient group and the control group in Egyptians only (p = 0.024, pcorr = 0.096). Furthermore, the allele frequency for T in Egyptians was significantly higher in the COPD group than in the control group (94% v 86%; p = 0.013, pcorr = 0.052; OR = 2.69; 95% CI 1.27 to 5.69). The +13924 T/A polymorphism was found in 48% of Japanese patients, but only 34% of the control group (p = 0.042, pcorr = 0.168; OR = 1.79; 95% CI 1.03 to 3.11). Therefore, it is possible that Japanese COPD patients tend to have T at the +13924 locus but because of multiple comparisons, the possibility of type 1 error cannot be excluded. In Egyptians, the distributions of +13924 T/A genotypes and alleles were not significantly different between the patients and the controls. No other significant differences were detected in the genotypic and allele frequencies of the remaining SNPs independently in either ethnic group.
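The odds ratios, confidence intervals, and corrected p values quoted above can be illustrated with a short sketch. Several points are assumptions rather than statements from the text: the confidence interval uses the Woolf (log odds) method, which the article does not name; the allele counts are hypothetical values chosen only to be consistent with 106 patients, 72 controls, and the reported 94% v 86% (the real counts are in table 4, not reproduced here); and the Bonferroni factor of 4 is taken from the four SNPs genotyped in the whole population.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a, b = allele counts in patients (allele of interest, other allele)
    c, d = allele counts in controls (allele of interest, other allele)
    z    = normal quantile for the interval (1.96 for roughly 95%)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical allele counts (NOT taken from table 4): 200 T / 12 C in
# patients (212 alleles) and 124 T / 20 C in controls (144 alleles).
or_, lo, hi = odds_ratio_ci(200, 12, 124, 20)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")

# n_tests = 4 is an assumption based on the four SNPs genotyped in full.
for p in (0.024, 0.013, 0.042):
    print(p, "->", round(bonferroni(p, 4), 3))
```

Under these assumptions the corrected values come out as 0.096, 0.052, and 0.168, matching the pcorr figures quoted in the paragraph.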
The results of the pairwise linkage disequilibrium analysis for Japanese and Egyptians are summarised in tables 5 and 6. In Japanese, +13924 T/A was in strong linkage disequilibrium with +126 G/T (D′ = 0.300, χ2 = 18.92, p<0.001). The polymorphism +126 G/T was also in linkage disequilibrium with +25133 C/T and +31384 T/C. In Egyptians, +5080 T/C was in significant linkage disequilibrium with +126 G/T (D′ = 0.52, χ2 = 12.33, p<0.001) and +13924 T/A (D′ = 1.00, χ2 = 26.81, p<0.001).
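For readers unfamiliar with Lewontin's D′, the sketch below shows how it can be computed from four two-locus haplotype frequencies. It is an illustration only: the function name and the example frequencies are hypothetical, the absolute value is returned because the text reports non-negative values, and the likelihood ratio test used in the study is not reproduced.

```python
def lewontin_d_prime(h11: float, h12: float, h21: float, h22: float) -> float:
    """Absolute value of Lewontin's D' for two biallelic loci.

    h11 = frequency of haplotype A1-B1, h12 = A1-B2,
    h21 = A2-B1, h22 = A2-B2; the four values must sum to 1.
    """
    total = h11 + h12 + h21 + h22
    if abs(total - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    p_a = h11 + h12  # frequency of allele 1 at the first locus
    p_b = h11 + h21  # frequency of allele 1 at the second locus
    d = h11 * h22 - h12 * h21  # equals h11 - p_a * p_b
    # D is scaled by the largest magnitude it could take given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic
    return abs(d) / d_max


# Hypothetical frequencies: one haplotype absent (complete disequilibrium)
print(lewontin_d_prime(0.3, 0.0, 0.2, 0.5))
# Hypothetical frequencies at linkage equilibrium
print(lewontin_d_prime(0.25, 0.25, 0.25, 0.25))
```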
Next, haplotype frequencies among these SNPs were estimated separately for the patient group and for the control group, then the distributions of these haplotype frequencies were tested for association with COPD using Fisher’s exact test (table 7). In the Japanese population, the frequency of the haplotype +126 T: +13924 T: +25133 C: +31384 C (haplotype 1) was significantly higher in the COPD patient group than in the control group. Twenty three percent of Japanese COPD patients displayed that haplotype, but only 2% of the controls (pcorr = 0.0002). In contrast, the haplotype +126 T: +13924 A: +25133 C: +31384 C (haplotype 5) and the haplotype +126 G: +13924 T: +25133 C: +31384 T (haplotype 6) showed a significant increase in frequency in Japanese controls compared with the COPD patients. The frequencies of haplotype 5 were 25% in controls v 7% in patients (pcorr = 0.0017) and the frequencies of haplotype 6 were 26% in controls v 6% in patients (pcorr = 0.0001). In Egyptians, there was no significant difference in the distribution of the haplotype frequencies between the COPD patients and the controls.
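The expectation–maximisation idea behind the haplotype frequency estimates can be sketched for the simplest case of two biallelic loci. This is a deliberately reduced illustration, not the SNPAlyze implementation and not the four-locus analysis reported here; the function name, the genotype table layout, and the example counts are all hypothetical. Only double heterozygotes have ambiguous phase, so each iteration splits them between the two possible phases in proportion to the current frequency estimates.

```python
from typing import List, Tuple


def em_two_locus(counts: List[List[int]], tol: float = 1e-10,
                 max_iter: int = 1000) -> Tuple[float, ...]:
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    counts[i][j] = number of subjects carrying i copies of allele A at
    locus 1 and j copies of allele B at locus 2 (i, j in 0..2).
    Returns frequencies of haplotypes (AB, Ab, aB, ab).
    """
    n = sum(sum(row) for row in counts)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Haplotype counts that are unambiguous (every genotype except Aa/Bb).
    fixed = [
        2 * counts[2][2] + counts[2][1] + counts[1][2],  # AB
        2 * counts[2][0] + counts[2][1] + counts[1][0],  # Ab
        2 * counts[0][2] + counts[0][1] + counts[1][2],  # aB
        2 * counts[0][0] + counts[0][1] + counts[1][0],  # ab
    ]
    double_het = counts[1][1]
    freqs = [0.25, 0.25, 0.25, 0.25]  # uninformative starting point
    for _ in range(max_iter):
        # E-step: expected share of double heterozygotes in AB/ab phase.
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from the completed counts.
        new_counts = [
            fixed[0] + double_het * w,
            fixed[1] + double_het * (1 - w),
            fixed[2] + double_het * (1 - w),
            fixed[3] + double_het * w,
        ]
        new_freqs = [c / (2 * n) for c in new_counts]
        converged = max(abs(x - y) for x, y in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return tuple(freqs)


# Hypothetical genotype table: rows = copies of A, columns = copies of B
example = [[5, 0, 0], [0, 10, 0], [0, 0, 5]]
print(em_two_locus(example))
```

In a case–control setting such a routine would be run separately on patients and controls, as the text describes, before comparing the resulting frequencies.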
Our study was designed to search for genetic polymorphisms in CLCA1 and to perform a case–control association study in order to reveal loci involved in COPD susceptibility. It has been demonstrated that a case–control association study with a candidate gene can be a very powerful approach for identifying genetic causes of complex diseases such as COPD.31 Silverman et al proposed the major criteria to evaluate a case–control association study, including selection of candidate genes, population stratification, Hardy-Weinberg equilibrium and correction for multiple comparisons.32
From this study, it is reasonable to assume that CLCA1 is a candidate gene for COPD susceptibility. Although CLCA1 and its murine counterpart, mCLCA3, are virtually undetectable in the normal lung, they are strongly expressed in the bronchial epithelium in response to inflammatory stimulation.21–25 Furthermore, it has been demonstrated that CLCA1 regulates airway mucous production under inflammatory conditions.25 These findings suggest that a high level of continuous CLCA1 expression results in the overproduction of mucus and may be related to the pathogenesis of COPD, especially of chronic bronchitis. Conversely, as the mucous layer is a defensive barrier against environmental stimuli, the failure of mucus production under inflammatory conditions caused by mutations of the CLCA1 gene may result in destruction of the parenchyma, leading to emphysema.
As for population stratification, the genetic background of the Japanese population is considered to be homogeneous because the population is largely of a single ethnic origin. For the Egyptian population, ethnic and geographic matching were stressed in order to increase homogeneity. All the Egyptian subjects and their parents had to be born in the region of recruitment and to have grandparents who were born inside the country. The present study was designed to improve the power of the case–control approach and decrease the chance of ascertainment bias. By selecting only those individuals who had been sufficiently heavy smokers to develop COPD, we attempted to ensure that both the cases and the controls had a high exposure to the most important risk factor for developing COPD. All the patients and the controls in both ethnic groups were matched except for symptoms and pulmonary function.
All the genotypic frequencies for the control populations in both ethnic groups were consistent with Hardy-Weinberg equilibrium; it is noteworthy that only the genotype data at +126 G/T in the Egyptian COPD group were out of equilibrium. Given the hypothesis that CLCA1 is involved in the pathogenesis of COPD, the genotypic frequency could be altered in the patient group. We ruled out the possibility of genotyping errors at this locus by using both DNA sequencing and RFLP. Although there are many pairwise linkage disequilibriums between the SNPs studied, these linkages are not complete. Therefore, we adopted the Bonferroni correction for multiple comparisons.
In this study, 22 polymorphisms in the CLCA1 gene were identified, eight of which were mutant homozygous in all the samples in both populations. Of the remaining 14 SNPs, only four were found to have more than 10% incidence in each population. When tested independently, the distributions of two SNPs, +5080 T/C and +13924 T/A, tended to show significant difference in the Egyptian and the Japanese groups, respectively. As +5080 T/C in exon 3 is a non-synonymous nucleotide substitution, this amino acid change may have an influence on the function of CLCA1 protein. Because the minor allele frequency of this SNP, +5080 C, is higher in the controls than in the patients, the nucleotide substitution T→C can be thought to play a protective role in susceptibility to development of COPD. The other SNP, +13924 T/A in exon 6, is synonymous. It is possible, however, that this nucleotide substitution exerts an influence on the function of CLCA1 protein by changing the rate of translation by modification of the mRNA stability and/or the ribosome binding. It has been demonstrated that codon usage can affect the general expression level of a heterologous gene, and that prevalent codons can result in a substantial increase in expression efficiency.33 It is also possible that one or more of these SNPs are in linkage disequilibrium with an original causal polymorphism in nearby CLCA genes that share structural and functional similarities with CLCA1.19
The result obtained by analysing only individual SNPs is insufficient. Haplotype analysis, testing associations using several polymorphisms, sometimes demonstrates genetic influences that are not detected by the analysis of single polymorphisms.34
In this study, the associations of the individual SNPs with COPD were different between the two populations. Additionally, there was an association between COPD and some haplotypes in the Japanese population; no such association was found for any haplotype in Egyptians. As the susceptibility to COPD is considered to be influenced by multiple genetic causes and genotype–environment interactions,35 it is also possible that different polymorphisms in different ethnic groups cause the same COPD phenotype. Another possibility is that the component of COPD was not the same between the Japanese and the Egyptians, although there was no population stratification in each ethnic group independently.
COPD has been recognised as a heterogeneous disease that includes chronic bronchitis and emphysema.36 In the present study, most of the Egyptian COPD patients presented with chronic productive cough and a diagnosis of chronic bronchitis was made for them. In contrast, more than half of the Japanese patients did not complain of a productive cough but of progressive dyspnoea on exertion. Plain chest radiographs and computed tomography images of most Japanese patients showed findings suggestive of emphysema such as hyperinflation, hyperlucency, and markedly low attenuation areas.37,38 It has also been demonstrated that the prevalence of COPD in smokers is lower in Japanese than in white populations5 and that mortality rates for COPD are higher in whites than in non-whites.39 Therefore, it seems likely that different ethnic populations have different components of COPD. This may be one of the reasons why prevalences of risk alleles of candidate genes for COPD differ greatly between different studies analysing different ethnic groups. The main drawback of our study is the relatively small sample size, which reduces the power of detection of true association, but even with a larger sample, the functional and biological impact of the described polymorphisms and haplotypes would require further study.
In conclusion, we have performed for the first time a comprehensive investigation of CLCA1 gene polymorphisms. We have identified 22 SNPs and characterised the associations between these SNPs and susceptibility to COPD. The SNPs and the haplotypes showing significant differences may be useful for predicting COPD susceptibility, thus preventing the progression of the disease by early intervention. Further functional characterisation of these SNPs and haplotypes is required to clarify the significance of CLCA1 in the pathogenesis of COPD. These results would also be useful for further evaluation of the CLCA1 gene in other complex respiratory diseases such as bronchial asthma.
The authors thank all participants in the study, especially staff of the participating hospitals who helped in patient recruitment and sample collection.