

Identification of candidate lung cancer susceptibility genes in mouse using oligonucleotide arrays
  1. W J Lemon1,*,
  2. H Bernert1,*,
  3. H Sun1,
  4. Y Wang2,
  5. M You1
  1. 1Division of Human Cancer Genetics, The Ohio State University Comprehensive Cancer Center, 420 West 12th Avenue, Columbus, Ohio 43210, USA
  2. 2School of Public Health, The Ohio State University Comprehensive Cancer Center, 420 West 12th Avenue, Columbus, Ohio 43210, USA
  1. Correspondence to:
 Dr M You, Manuel Tzagournis Medical Research Facility, Room 530, 420W West 12th Avenue, Columbus, Ohio 43210, USA;


We applied microarray gene expression profiling to lungs from mouse strains having variable susceptibility to lung tumour development as a means to identify, within known quantitative trait loci (QTLs), candidate genes responsible for susceptibility or resistance to lung cancer. At least eight chromosomal regions of mice have been mapped and verified to be linked with lung tumour susceptibility or resistance. In this study, high density oligonucleotide arrays were used to measure the relative expression levels of >36 000 genes and ESTs in lung tissues of A/J, BALB/cJ, SM/J, C3H/HeJ, and C57BL/6J mice. A number of differentially expressed genes were found in each of the lung cancer susceptibility QTLs. Bioinformatic analysis of the differentially expressed genes located within QTLs produced 28 susceptibility candidates and 22 resistance candidates. These candidates may be extremely helpful in the ultimate identification of the precise genes responsible for lung tumour susceptibility or resistance in mice and, through follow up, humans. Complete data sets are available at

  • microarray
  • linkage analysis
  • gene mapping


Numerous chromosomal regions genetically linked with susceptibility or resistance to pulmonary adenomas have been described in mice using inbred strains showing widely different susceptibilities to formation of both spontaneous and chemical induced lung tumours.1–3 Susceptibility is intrinsic to the lung itself as shown by the classical experiments involving lung explants from sensitive and resistant mice.4,5 After carcinogen administration to F1 mice previously made host to these explants, only the lungs from the sensitive mouse strain developed tumours.4,5 Matings of sensitive A/J and resistant C57BL/6J mice produce F1 and F2 offspring, which are of intermediate sensitivity to tumour induction, thus implicating more than one gene and illustrating that tumour size and number are multigenic quantitative traits.6 Production of recombinant inbred (RI) lines of A/J (A) and C57BL/6J (B6) mice and subsequent analysis of their tumour sensitivities suggested that three genes, one major and two minor genes, were involved in determining the sensitivity to mouse lung tumour development.6 Subsequent linkage studies have identified pulmonary adenoma susceptibility (Pas) and pulmonary adenoma resistance (Par) loci. We thus adopt the definition of quantitative trait locus (QTL) as a known chromosomal region in which one or more genes are likely to underlie the linkage.

Listed in table 1 are the selected QTLs that have been mapped in various mouse crosses. A major susceptibility locus was mapped in (A/J × C3H/HeJ) F2 mice to distal chromosome 6 and was termed the Pas1 locus. This locus produced a maximum logarithm of the likelihood ratio (lod) score of 9 and accounted for approximately 45% of the observed phenotypic variance.7 A lod score of 3 or greater is considered significant for linkage. Consistent results were obtained in comprehensive linkage studies using (A/J × C57BL/6J) F2 (60% of variance), (A/J × C57BL/6J) × C57BL/6J (16% of variance), (A/J × M spretus) × C57BL/6J (34% of variance), and A × B & B × A RI mice (51% of variance).8–11 Three other loci were mapped to chromosomes 17, 19, and 9.8,9 Linkage to a locus on chromosome 17, the site of the putative Pas2 locus, was observed in (A/J × C57BL/6J) F2, accounting for 8% of the total variance in phenotype. The location of the Pas2 locus is homologous to human chromosome 6p21; potential candidates at this location are the genes for tumour necrosis factor α and β. Similarly, linkages to lung tumour susceptibility were also seen at markers on chromosome 19 (Pas3), accounting for 3% of the phenotypic variation in a study on (A/J × C57BL/6J) × C57BL/6J mice, and 2% of the explained phenotypic variation when (A/J × C57BL/6J) F2 mice were used. In this latter study, suggestive linkage to a locus on chromosome 9 (Pas4) was determined to explain 4% of the total phenotypic variance.8,9 Mouse-human synteny for all loci can be examined in detail using the Homology browser at the NCBI (

Table 1

Summary of Pas and Par QTLs. Pas QTLs have been largely studied using various genetic crosses of A/J and C57BL/6J; Par 1 and 3 using A/J and SM/J; Par 2 and 4 using A/J and BALB/cJ. The intervals between flanking markers were deliberately taken to be larger than those produced by some recent fine mapping studies in order to allow identification of multiple genes.

At present, four Par QTLs have been mapped using F2 or backcross populations of mice, including Par1 (chromosome 11), Par2 (chromosome 18), Par3 (chromosome 12), and Par4 (chromosome 4).12 Par1 is a lung tumour resistance locus that was mapped in (A/J × M spretus) × C57BL/6J mice to the retinoic acid receptor-α (Rara) gene locus on chromosome 11.13,14 Contributed by the M spretus allele, Par1 gave a maximum lod score of 5.3 accounting for 23% of phenotypic variance when coexpressed with the highly penetrant Pas1 allele of the A/J strain. In mice carrying the M spretus instead of the A/J allele of the Pas1 gene, the resistance effect of Par1 on tumour incidence, multiplicity, and volume was lessened by about a half. Par1 behaves like a modulator of Pas1, to some degree subduing the dominant effect of Pas1 on lung tumorigenesis. Par2 was mapped by linkage studies on (A/J × BALB/cByJ) × A/J and (A/JOlaHsd × BALB/cOlaHsd) F2 mice to chromosome 18 at microsatellite marker D18MIT103. A lod score of 12.2 was reported at this locus, with a phenotypic variance of 38% for resistance to tumour induction.15 This locus was termed Par2. In our own analysis of (A/JOlaHsd × BALB/cOlaHsd) F2 mice, Par2 had a significant linkage to lung tumour resistance and produced a maximum lod score of 11.16 The greatest linkage occurred at the site of the Dcc tumour suppressor gene.16 Par3 was mapped to chromosome 12 with a lod score of 6.47, using a backcross population between SMXA24 RI mice and A/J mice.14 Par3 seems to confer stronger resistance to lung tumour induction when coexpressed with the A/J allele of Par2.17 Finally, Par4 or Papg1 was mapped to chromosome 4 (D4MIT77) (lod score = 3.0) using (A/JOlaHsd × BALB/cOlaHsd) F2 mice.16,18 Linkage on chromosome 4 was strongest at a marker recombinationally inseparable from the p16INK4a tumour suppressor gene locus, with the BALB/cJ allele at this locus associated with sensitivity to lung tumour formation.17

Identification of candidate Pas and Par genes responsible for the lung cancer susceptibility QTLs proves to be rather difficult. One obstacle is the fact that several hundred genes can be localised to a 20-30 cM QTL region. Fine mapping studies typically assume that one or very few genes are responsible for most of the effect attributed to the QTL. If several genes or a few genes separated by more than a few centimorgans are responsible, fine mapping may prove to be more challenging. Evaluation of differential gene expression and nucleotide polymorphism of such a large number of genes can be a significant challenge. In the present study, we combined the Celera mouse genome sequence and lung cancer genetics with microarray profiling to identify candidate genes. Specifically, we used high density oligonucleotide arrays to detect differential gene expression in lung cancer susceptibility QTL regions for the identification of candidate genes responsible for lung cancer susceptibility or resistance in several relevant mouse strains.



Markers flanking each QTL were located in the Celera (Rockville, MD) mouse genome database (6 March and 13 March 2002 data releases) and transcript information was downloaded. Transcripts were matched with Affymetrix probe sets as described below. RNA from selected mouse strains was profiled and the transcripts within each QTL were evaluated for differential expression. Candidacy for differentially expressed transcripts was determined by comparing profiles with published reports as described below.


Four to six week old mice, one from each of five mouse strains including A/J, SM/J (S), BALB/cJ (Bc), C3H/HeJ, and C57BL/6J (B6) were obtained from The Jackson Laboratory (Bar Harbor, ME). Animals were euthanised one week after arrival. Lungs from these mice were harvested and frozen in liquid nitrogen until RNA analysis.

RNA isolation

Total RNA from lungs of one mouse from each strain was isolated using TRI reagent (Molecular Research Center, Cincinnati, OH). The tissue was frozen in liquid nitrogen, pulverised, then homogenised in 1 ml of TRI reagent, incubated for five minutes at room temperature, followed by addition of 200 μl chloroform, vigorous mixing, and incubation on ice for 15 minutes. The sample was centrifuged at 14 000 rpm for 20 minutes; the aqueous phase was transferred to a fresh tube with an equal volume of isopropanol, and incubated on ice for 30 minutes. After centrifugation at 14 000 rpm for 15 minutes, the RNA pellet was washed in 75% ethanol and dissolved in RNase free water. The quality of RNA was confirmed on a formaldehyde agarose gel, and the concentration was determined by reading absorbance at 260/280 nm.

Microarray analysis

RNA samples were further purified, labelled, and processed by our microarray core facility according to standard manufacturers’ protocols. Singleton cRNA preparations were produced from 30 μg of total RNA from each mouse and 10 μg equivalent aliquots were hybridised to each Affymetrix mouse oligonucleotide array (Santa Clara, CA): Mu74Av2 (A array), Mu74Bv2 (B array), and Mu74Cv2 (C array). Arrays were then scanned and digitised.

Mapping Affymetrix probe sets to Celera transcript sequences

Transcript IDs, annotations, and transcript sequences for all genes between flanking markers were downloaded from the Celera database. To map Celera transcripts to Affymetrix probe sets, BlastN was used to compare Celera transcript sequences with the Affymetrix “consensus” sequences for all probe sets of the A, B, and C arrays. An Affymetrix probe set was said to measure a particular Celera transcript under the following conditions: (1) the sum total coverage of blast hits (E<10⁻⁴) between an Affymetrix consensus and a Celera transcript included (A) at least 50% of both sequences or (B) 70% of one sequence and, in either case, the longer sequence was no more than 1.5 times as long as the shorter; (2) the probe set met condition 1 for no more than one Celera transcript. This blast method was selected over an annotation matching method since mappings produced with the blast method can be managed by a direct objective scoring scheme, while annotation matching has issues often requiring ad hoc resolution such as annotation differences, redundant accession numbers, etc. For each mapped probe set, the annotation presented here comes from the associated Celera transcript.
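The two mapping conditions can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the code used in the study: the `Hit` record, the function names, and the input layout are hypothetical, and coverage is assumed to mean the union of hit intervals on each sequence.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    """One BlastN hit (hypothetical record; 1-based inclusive coordinates)."""
    q_start: int   # position on the Affymetrix consensus
    q_end: int
    s_start: int   # position on the Celera transcript
    s_end: int
    evalue: float

def covered_length(intervals):
    """Number of positions covered by the union of inclusive intervals."""
    total, current_end = 0, 0
    for start, end in sorted((min(a, b), max(a, b)) for a, b in intervals):
        start = max(start, current_end + 1)   # skip positions already counted
        if end >= start:
            total += end - start + 1
            current_end = end
    return total

def satisfies_condition_1(hits, consensus_len, transcript_len, max_evalue=1e-4):
    """Condition 1: coverage of both (>= 50%) or one (>= 70%), plus the length ratio limit."""
    good = [h for h in hits if h.evalue < max_evalue]
    q_frac = covered_length([(h.q_start, h.q_end) for h in good]) / consensus_len
    s_frac = covered_length([(h.s_start, h.s_end) for h in good]) / transcript_len
    length_ok = max(consensus_len, transcript_len) <= 1.5 * min(consensus_len, transcript_len)
    coverage_ok = (q_frac >= 0.5 and s_frac >= 0.5) or q_frac >= 0.7 or s_frac >= 0.7
    return length_ok and coverage_ok

def map_probe_sets(candidates):
    """Condition 2: keep a probe set only if exactly one transcript passes condition 1.

    candidates: {probe_set: {transcript: (hits, consensus_len, transcript_len)}}
    """
    mapping = {}
    for probe_set, by_transcript in candidates.items():
        matches = [t for t, (hits, c_len, t_len) in by_transcript.items()
                   if satisfies_condition_1(hits, c_len, t_len)]
        if len(matches) == 1:
            mapping[probe_set] = matches[0]
    return mapping
```

A probe set that passes condition 1 for two transcripts is dropped entirely, which is what makes the resulting mapping unambiguous.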

Estimates of gene expression

Li-Wong full model estimates (LWF) of gene expression19,20 were produced for A, B, and C arrays for mRNA samples. To do this, fluorescence intensity data within CEL files were scaled (normalised) by quadratically regressing log intensities for each array against the log of the median spot intensities (log(i) = β0 + β1 log(median) + β2 [log(median)]²). Median spot intensities were produced for each spot on an array type producing what can be thought of as a pseudo-median array. A arrays were scaled as one group, B arrays another, and C arrays a third group. Estimates of gene expression were produced from the scaled intensity data using a C program available on our web site.
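As a rough illustration of the scaling step, the sketch below builds the pseudo-median array and fits the quadratic regression of log intensity on log median for each array. It is not the authors' C program. In particular, the text does not say how the fitted curve is applied, so the final adjustment (observed log intensity minus fitted value plus log median) is an assumption, as are all function names.

```python
import math
from statistics import median

def fit_quadratic(x, y):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]                  # power sums of x
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]     # augmented 3x4 matrix
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def scale_arrays(arrays):
    """arrays: one list of positive spot intensities per array of the same type.

    Returns scaled log intensities (assumed adjustment, see lead-in).
    """
    # Pseudo-median array: the median of each spot across all arrays of this type.
    log_median = [math.log(median(spot)) for spot in zip(*arrays)]
    scaled = []
    for intensities in arrays:
        log_i = [math.log(v) for v in intensities]
        b0, b1, b2 = fit_quadratic(log_median, log_i)
        fitted = [b0 + b1 * m + b2 * m * m for m in log_median]
        # Remove the array-specific curve, keeping the residual around the median.
        scaled.append([li - f + m for li, f, m in zip(log_i, fitted, log_median)])
    return scaled
```

Each array type (A, B, C) would be passed to `scale_arrays` separately, mirroring the grouping described above.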

Differential expression and visualisation of gene expression

Expression ratios and a Z scoring method were used to assess differential expression.21 Ratios of expression between relevant strains for genes within the QTLs were computed. For Pas1-4, A/J and B6 ratios were determined, for Par1 and 3 A/J and S ratios, and for Par2 and 4 A/J and Bc ratios. When more than one probe set met the conditions for association with a particular Celera transcript, the one displaying the greatest fold change between the strains was selected to represent the transcript. This selection provided a means to reduce the false negative rate, which is crucial when identifying candidates.22 To produce colour images, gene expression values were unit normalised ((x–x̅)/sx) across samples of all strains. For Pas 1-4, the A/J and B6 values were retained and displayed via linear colour gradient with cyan indicating values below the mean, red indicating above the mean, and neutral grey near the mean. For Par1 and 3, A/J and S unit normalised values are displayed and for Par2 and 4, A/J and Bc unit normalised values are shown.
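The two per-transcript operations described here, unit normalisation across strains and selection of the probe set with the greatest fold change, are simple enough to sketch. The function names and the dictionary layout are hypothetical; the sketch assumes positive expression values, that sx is the sample standard deviation, and that fold change is taken symmetrically (larger value over smaller).

```python
from statistics import mean, stdev

def unit_normalise(values):
    """(x - mean) / s across the strains, with s the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def fold_change(a, b):
    """Symmetric fold change: larger value divided by smaller (always >= 1)."""
    return max(a, b) / min(a, b)

def representative_probe_set(probe_sets, strain_a, strain_b):
    """Pick the probe set with the greatest fold change between two strains.

    probe_sets: {probe_set_id: {strain_name: expression_value}}
    """
    return max(
        probe_sets,
        key=lambda p: fold_change(probe_sets[p][strain_a], probe_sets[p][strain_b]),
    )
```

For a Pas locus the call would compare "A/J" with "B6"; for Par1 and 3, "A/J" with "S"; for Par2 and 4, "A/J" with "Bc".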

Follow up of observed candidates, including validation by RT-PCR or northern blot, is under way, but because of the large number of candidates, it is beyond the scope of this work. Other results from our laboratory and from others indicate good correlation between microarray results and RT-PCR and northern analysis.23,24

Estimation of significance

In an approach similar to the “modified t” approach used by Eaves et al22 to analyse the Idd loci for diabetes susceptibility genes, a Z scoring method was used to produce a statistic for comparing two strains within the set of five, and a liberal cut off value was selected to reduce the false negative rate. The Z transformation follows,

$$Z = \frac{g_1 - g_2}{\sqrt{\sigma^2_{g_1} + \sigma^2_{g_2}}}$$

where g1 and g2 are the gene expression values for the two strains being compared and σ2g1 and σ2g2 are variances associated with those gene expression values22 (F Wright, personal communication). Variances were estimated by producing a model of variance from all the gene expression data from all arrays and all samples. A linear model fitting log(variance across strains) to log(mean across strains) from each gene resulted in the model.20,22 For each expression value, the corresponding variance can be computed and with each pair of expression values, a Z score. The cut off for display was selected as |Z| > 1.
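A minimal sketch of this procedure follows, assuming the Z statistic takes the usual form of the difference between the two expression values divided by the square root of their summed variances, and that the variance model is an ordinary least-squares line on the log-log scale. The function names are hypothetical, and expression values are assumed to be positive with non-zero variance across strains.

```python
import math
from statistics import mean, variance

def fit_variance_model(expression_by_gene):
    """Fit log(variance) = a + b * log(mean) across genes.

    expression_by_gene: one list per gene holding its values across strains.
    Returns (a, b).
    """
    xs = [math.log(mean(row)) for row in expression_by_gene]
    ys = [math.log(variance(row)) for row in expression_by_gene]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

def predicted_variance(value, model):
    """Variance predicted by the fitted model for one expression value."""
    a, b = model
    return math.exp(a + b * math.log(value))

def z_score(g1, g2, model):
    """Z = (g1 - g2) / sqrt(var(g1) + var(g2)), variances taken from the model."""
    return (g1 - g2) / math.sqrt(predicted_variance(g1, model)
                                 + predicted_variance(g2, model))

def passes_cutoff(g1, g2, model, cutoff=1.0):
    """Liberal display cut off |Z| > 1 as used in the text."""
    return abs(z_score(g1, g2, model)) > cutoff
```

Because the variance is read from a model fitted to all genes, a Z score can be computed from a single array per strain, which is the situation in this study.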

Association and concordance with published reports

To identify candidate genes from the several within each QTL passing the Z cut off, annotation was used to search published reports for association of aberrant expression of the gene with cancer. If the observed pattern of expression in the strains was consistent with that of the reported association with cancer, the gene was identified as a candidate. For example, if a gene were a tumour suppressor and array expression showed an increase in the resistant strain, the gene was identified as a candidate for conferring inherent resistance. Note that the nomenclature of Pas or Par does not play a role in this, only the gene’s activity does. A tumour suppressor within a Pas locus is a candidate if its expression was higher in the resistant strain. Similarly, an oncogene within a Par locus is a candidate if its expression was higher in the susceptible strain.
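The concordance rule can be written as a small decision function. This is an illustrative sketch only: the role labels, the returned strings, and the function name are hypothetical, and the published role of each gene still has to come from a manual literature search as described above.

```python
def classify_candidate(role, expr_susceptible, expr_resistant, z, z_cutoff=1.0):
    """Apply the concordance rule to one differentially expressed gene.

    role: "tumour suppressor" or "oncogene", taken from published reports.
    Returns a label for concordant genes, or None if the gene is not a candidate.
    The Pas/Par name of the locus plays no part in the decision.
    """
    if abs(z) <= z_cutoff:
        return None                      # did not pass the Z cut off
    if role == "tumour suppressor" and expr_resistant > expr_susceptible:
        return "resistance candidate"    # higher in the resistant strain
    if role == "oncogene" and expr_susceptible > expr_resistant:
        return "susceptibility candidate"  # higher in the susceptible strain
    return None                          # expression discordant with published role
```

Note that the susceptible and resistant strains swap for Par4, where BALB/cJ is the susceptible strain, so the caller has to supply the two expression values in the right order for each locus.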


Our approach to identifying candidate lung tumour susceptibility and resistance genes builds on previous genetic work, availability of the mouse genome sequence, and genome wide expression profiling. The genome sequence, combined with previously identified QTLs, enabled us to focus our gene expression analysis on transcripts within the regions known to modulate susceptibility and resistance. This reduced the burden of identifying and filtering spurious associations which typically encumber microarray analyses.

The eight previously mapped QTLs are summarised in table 1 and shown pictorially in fig 1 using physical position of flanking markers in the Celera mouse genome. Susceptibility loci (Pas1-4) are located respectively on chromosomes 6, 17, 19, and 9. Resistance loci (Par1-4) are located respectively on chromosomes 11, 18, 12, and 4. The columns contain first the loci, followed by the strains involved in original mapping, breeding method used during mapping, approximate phenotypic variance explained by alleles of the crossed strains, proximal flanking markers, their genetic positions in MGD, and physical positions in the Celera database. Next appear the distal flanking markers and their positions. The final two columns contain the number of transcripts found in the Celera genome between the markers and the number of those transcripts found to be uniquely represented on the microarrays.

Figure 1

Depiction of the QTLs. Chromosomes are drawn according to physical distance shown on the left with genetic positions adjacent to each figure.

In total, 4819 transcripts were found within the eight QTLs in the Celera Discovery database and 1270 of these were found to be represented on one or more arrays. LWF estimates of gene expression were determined and fold change between relevant strains computed to assess differential expression.

Table 2 illustrates gene expression differences for transcripts within the Pas 1-4 QTLs in order of physical position. Pas QTLs depict comparison of expression between A/J and B6 (fig 2). Table 3 depicts expression of transcripts within the Par 1-4 QTLs also ordered by position. Par1 and 3 depict comparison between A/J and S and Par2 and 4 depict comparison between A/J and Bc (fig 3). Both tables 2 and 3 provide details of Li-Wong full model estimates of gene expression, statistics, and annotation derived from Celera.

Table 2

Differentially expressed genes within the Pas 1–4 loci. Since the relevant contrast for these loci is between A/J and C57BL/6J, data for these alone are presented.

Table 3

Similar to table 2, except for the relevant contrasts. Par 1 and 3 involve A/J and SM/J while Par 2 and 4 involve A/J and BALB/cJ. In all cases except Par 4, A/J is the susceptible strain

Figure 2

Coloured image of gene expression for transcripts in the Pas 1-4 QTLs from A/J and C57BL/6J. The leftmost column in each panel denotes the fold change from A/J to C57BL/6J, while the adjacent column denotes the fold change from C57BL/6J to A/J. Colour denotes the number of standard deviations that expression differs from the mean expression for all five strains. Green denotes below average expression, red denotes above average expression, and black denotes near average expression. An edited version of the annotation provided by Celera for each transcript appears to the right of each image.

Figure 3

Coloured image of gene expression for transcripts in the Par1 and 3 QTLs from A/J and SM/J and in the Par2 and 4 QTLs from A/J and BALB/cJ. Values and colours are the same as those described for figure 2.




In Pas1-4, A/J is the susceptible strain and B6 the resistant one. Tumour suppressor candidates will have shown higher expression in B6, while oncogene candidates will have shown higher expression in A/J. Pas1 is a major susceptibility locus based on the genetic linkage studies in several crosses. Because of its location, the K-ras gene became a candidate for Pas1. However, K-ras had a Z score of 0.6 and thus did not make the list of candidates on the basis of expression. Many models implicating K-ras postulate a central role of mutation in tumorigenesis which may not be manifest in normal gene expression under the conditions here. A/J and B6 carry different alleles of K-ras, but differential expression was not detected in this experiment. Genes that did show differential expression by the stated criteria are: hes related protein, cyclophilin H, protein tyrosine phosphatase BK, matrix Gla protein, recQ, and ECA39. Hes related protein is a basic helix-loop-helix protein with a Notch binding site that is responsive to Notch activation.25,26 Notch4, a candidate in Pas2 below, has been shown to be a site of insertion for viral intracisternal A particles. Insertion can result in a constitutively active Notch4 which can ultimately drive transformation to a highly invasive phenotype.27 Cyclophilin H is a paralogue of Pin1, which has been shown to interfere directly with the association of beta-catenin and Apc resulting in (1) increases of beta-catenin, cyclin D1, c-Myc, and (2) a decrease in apoptosis as a consequence of ubiquitination of IkappaB.28 Pin1 has been shown to be up regulated in breast tumours. Protein tyrosine phosphatase BK is of the receptor type and does not react with serine residues.29 Its function is otherwise unknown, but the central role of phosphorylation in cell signalling indicates that it remains a candidate. Matrix Gla protein is a cell adhesion molecule that has been shown to be down regulated in colorectal cancer, although its role in tumorigenesis is not known.30 RecQ is a DNA damage repair enzyme in which missense alleles in humans have been shown to alter the risk of lung cancer by two-fold.31 ECA39/Bcat1, branched chain amino acid aminotransferase, catabolises branched chain amino acids and produces ketoacids which at high levels induce apoptosis.32 ECA39 is a c-myc target and has been implicated in c-myc regulated apoptosis.32


Pas2 QTL is located at the H-2 locus whose haplotypes correlate with the incidence and multiplicity of mouse lung tumour induction.8 The expected candidates are TNFα and β. Neither made the Z cutoff (TNFα Z=0.43, TNFβ Z=0.65). Among the oncogenic candidates passing the Z cutoff are Notch4, heterogeneous nuclear ribonucleoprotein K (protein K), and ENPP4. Candidate tumour suppressors are regulator of cullins 1 and cdc5-like. The several differentially expressed MHC genes are MHC psoriasis candidate protein, histocompatibility 2 M region locus 9, H2-K region expressed gene 6, histocompatibility 2 O region beta locus, and tapasin. Candidacy of these genes was determined as follows. Protein K can be bound by the Par2 candidate high mobility group protein 1, which can also bind p53 and another Par2 candidate methyl-CpG binding domain protein 2.33 Protein K has pluripotent function that links translation, cell cycle, and apoptosis. Protein K dependent repression of translation can be modulated through phosphorylation by ERK.34 Protein K is a target of JNK which has a role in apoptosis35 and protein K transcription can be induced by EGF.36 ENPP4 (Autotaxin), a myc target which induces metastasis in the co-presence of retinoic acid, is also a candidate.37 Regulator of cullins 1 is a member of the von Hippel-Lindau tumour suppressor complex which regulates, in part, ubiquitination of IκB and consequent activation of NFκB.38 The function of cdc5-like is not known, but by homology has a likely role in regulating the cell cycle which indicates that it should remain a candidate. MHC genes can play roles in transformation and metastasis in many ways, so those genes are all left as candidates.


Pas3 was first described by Devereux et al9 as flanked by D19MIT42 and D19MIT19 using linkage analysis of N-ethyl-N-nitrosourea treated (A/J × C57BL/6J) F1 × C57BL/6J backcross progeny. This observation was later confirmed by Festing et al16 using urethane treated (A/J × C57BL/6J) F2 mice. Pas3 oncogenic candidates include golgi specific brefeldin A resistance factor 1 and semaphorin 4G. Discordantly expressed non-candidate genes having interesting backgrounds include apoptosis protein MA-3 and cdc25 homologue A. Brefeldin A is a drug used to induce apoptosis in adenocarcinomas through a process that blocks ADP-ribosylation of proteins in the golgi apparatus. When this process is blocked, proteins awaiting ribosylation accumulate, eventually triggering an “ER stress” induced apoptosis signal involving caspases 9 and 12. The BFA resistance factor transfers ribosylation factors to proteins such that the drug effect is mitigated.39 The precise role of semaphorin 4G is not known, but other semaphorins, such as M-semaH, are implicated in metastatic potential.40 The beta 1 adrenergic receptor, along with the beta 2 receptor in Par2, is a candidate although publications on the role of these in apoptosis are mixed. Some results suggest that beta 2 receptor expression is the key to sensitising cells to cisplatin induced apoptosis,41 others that coactivation of beta 1 and beta 2 receptors correlates with NK resistance to metastasis,42 and others that they both work along with cyclooxygenase 2 to produce arachidonic acid metabolites that induce apoptosis.43 Incidentally, COX-2 (chromosome 1) showed very low or no expression in any of the tissues.

Interestingly, this locus enhances lung tumour multiplicity more significantly when K-ras has a heterozygous genotype as compared with resistant homozygotes. The oncogenic candidates primarily affect apoptosis pathways. With heterozygous Ras and susceptible Pas3, one would have a situation in which cells have the propensity to divide and be resistant to apoptosis. This could be a model for the relationship between Pas3 and Pas1.


Pas4 was mapped by Festing et al.16 The expected candidate associated with this locus in gene mapping studies is TGFβ receptor II, but was not differentially regulated in this study. Oncogene candidates are SH2/SH3 adapter protein (Nck), parathyroid hormone receptor, similar to topoisomerase II binding protein, and NAT-1. Nck, a known proto-oncogene, seems to play pluripotent roles in processes including translation and actin polymerisation.44–46 Nck has been shown to bind Wasbp, a Pas4 tumour suppressor candidate listed below, which also performs a role in actin polymerisation.45 Parathyroid hormone and parathyroid hormone receptor have both been shown in vitro to be expressed in type II alveolar epithelial cells and in an adenocarcinoma cell line and thus have been postulated as forming an autocrine loop.47 TopBP1 has been shown to be involved in DNA damage repair and checkpoint control.48 This function may reduce rates of apoptosis.48 Interestingly, another topoisomerase II binding protein found in Par4, DNA polymerase epsilon subunit 3, has a human homologue, YB-1, whose expression pattern parallels that of PCNA in adenocarcinoma.49 This provides two topoIIBPs which promote cell survival or proliferation. NAT1, a highly polymorphic N-acetyltransferase known to activate and detoxify tobacco carcinogens, has been indicated for genotyping to assess risk for adenocarcinoma.50

Pas4 tumour suppressor candidates are G protein alpha I 2, cdk5, SMARC D3, and Wiskott-Aldrich syndrome binding protein. Gαi2 does not have a clear role in cancer, although some work has been done on oncogenic cell signalling through pathways involving it. Should it have a role, these data suggest a tumour suppressive one. Cdk5 has been suggested to have a role in apoptosis of glioblastoma multiforme cells, as well as in non-neural cells through a cAMP dependent pathway.51 Wasbp was discussed above. SMARC D3, involved in chromatin remodelling, has been postulated to operate as a tumour suppressor in mice and humans.52

A number of Pas4 genes appear to be associated with cancer through cytoskeletal interactions. For example, Nck-1 connects the ras pathway with cytoskeletal signals.53 Wiskott-Aldrich syndrome involves dysregulation of cytoskeletal signalling through actin binding and WASBP is a candidate.54 Stromal antigens, also related to the cytoskeleton, can be signalling molecules during metastasis. SMARCs are actin dependent chromatin regulators of which SMARC D3 is differentially regulated in the direction favouring carcinogenesis. It has been suggested from data in Drosophila that topoisomerase II and actin can regulate chromatin remodelling and thus gene transcription.55 The theme for Pas4 appears to be a linkage between cytoskeletal signals and chromatin or gene expression. Although TGF β receptor was not seen differentially regulated here, it has been suggested that one effect of TGF during tumorigenesis is to induce cytoskeletal reorganisation.56



Par1 was first mapped by Manenti et al13 by crossing Pas1 positive A/J × M spretus F1 mice with B6 mice. Later, Pataer et al14 published results using backcrosses of ((A/J × S RI) × A/J)F1 × A/J which confirmed Par1 on chromosome 11 and mapped Par3 on chromosome 12. In Par1, A/J is the susceptible strain and SM/J the resistant strain. Par1 contains the retinoic acid receptor and its allele has been suggested to modulate the Pas1 allele.12 RARα showed no difference in expression between A/J and SM/J. Oncogene candidates are 12-lipoxygenase, zinc finger protein s11-6, granulin, and ribosomal protein L29. Inhibition of 12-lipoxygenase has been shown to produce apoptosis in prostate cancer, gastric cancer, and other cancers.57,58 It has been shown that a 12-lipoxygenase metabolite, 12-HETE, phosphorylates ERK which consequently induces proliferation of cancer cells and this is postulated as a mechanism for tumour cell proliferation in vivo.58 In Pas1, several ERK pathway modulators are present including cyclophilin H, cyclin D2, and ECA39, any or all of which could link these loci in the fashion suggested by previous genetic studies. The function of the candidate zinc finger s11-6 is unknown. However, with a nearly certain role in DNA binding, it remains a candidate. Granulin has a precursor form called PC cell derived growth factor (PCDGF) which has been shown to mediate oestradiol induced mitosis in breast cancer by activating MAPK and cyclin D1.59 Here there is a plausible link to Pas1 via cyclin D2, rather than cyclin D1. Expression of RPL29, a heparin sulphate interacting protein, has been correlated in colorectal carcinoma with metastatic status.60

The Par1 candidate tumour suppressor is speckle type POZ protein. SPOP has been suggested through bioinformatics to interact with the tumour necrosis alpha receptor and thus may play a role in apoptosis.61


In Par2, A/J is the susceptible strain and BALB/cJ the resistant. The candidate oncogenes are beta-2 adrenergic receptor, methyl-CpG binding domain protein 2, serotonin receptor, high mobility group protein 1, and interferon G induced GTPase. All except the serotonin receptor were discussed above. Serotonin receptor has been implicated as having opposing dual roles in cancer. On the one hand, 5HT stimulation produces vasoconstriction and thus limits blood supply to tumours. On the other hand, tumour cells expressing receptor produce an autocrine loop that supports aggressive proliferation.62 These data, assuming it is the Par2 gene, suggest that its role in proliferation is more important. Expression of IFN gamma induced GTPase has been shown to correlate with proliferation rate of fibroblast cells in vitro.63 This is consistent with an oncogenic role.

The major gene on distal chromosome 18 has previously been reported,15,64 though chromosome 18 markers are only occasionally deleted in mouse lung tumours.64,65 Expected candidate genes are Dcc (deleted in colorectal cancer) and homologues of DPC4 and JVI8-1 which have been shown to be deleted in a small number of non-small cell lung cancers.64 The Mcc (mutated in colon cancer) and Apc (adenomatous polyposis coli) genes, shown to have decreased expression in mouse lung tumours,66 map to a different region of chromosome 18. In this study, only Dcc passed the Z cutoff but was not considered a candidate as its expression is higher in A/J. In this study, we observed differential expression of high mobility group protein 1 which non-specifically binds DNA and also binds p53, and by computer analysis performed by others putatively binds methyl-CpG binding domain protein 2, and heterogeneous nuclear ribonucleoprotein K.33 Since some of these are activators and others repressors, this suggests that the transcriptional machinery may have a different set point in A/J versus Bc or B6.


In Par3, one candidate oncogene appears, placental growth factor (PlGF). PlGF has been shown to bind to VEGF receptor 1 and to play an important role in promoting angiogenesis in wound healing and cancer, among other settings.67 Genetic studies have shown that having the Par2 A/J allele and the Par3 SM/J allele confers increased resistance to tumorigenesis.12 The data here are consistent with this, as follows. The SM/J allele of PlGF, having lower expression, would be expected to result in a lower rate of angiogenesis, thus inhibiting tumour growth. Simultaneously, the A/J allele of Par2, providing increased expression of serotonin receptor, would augment vasoconstriction, further inhibiting tumour growth in spite of other factors supporting tumour proliferation.


In Par4, BALB/cJ is the susceptible strain and A/J the resistant, since, in a backcross, hybrid mice containing the A/J allele for Par4 are more resistant to tumours than those carrying the BALB/cJ allele.16 The increased resistance produced by the A/J allele is pronounced in males. Interestingly, this locus is closely linked to D4Mit77 and therefore also to the Cdkn2a (p16INK4a) locus.68 Chromosome 4 markers in this region are often deleted in mouse lung adenocarcinomas,17,65,69 skin carcinomas,70 and hepatocellular carcinomas,71 and the human homologue on chromosome 9p21 is similarly deleted in some human tumours.72 Our results are consistent with this in that p16INK4a shows higher expression in A/J. Candidate tumour suppressors are EGF-like domain multiple 5, bikunin, DNA polymerase epsilon subunit 3 (pole3), tyrosinase related protein 1, interferon alpha gene B, and IGFBP-like protein. Once again, candidacy was determined as follows. EGF-like domain multiple 5 is an uncharacterised gene, but it may play a role in growth factor signalling, so it remains a candidate. Bikunin has been shown to inhibit metastatic processes when overexpressed.73 However, in colorectal carcinoma, expression was observed in both tumour and normal tissue with no discernible difference.74 Pole3 was discussed above. Tyrosinase related protein 1 is involved in melanogenesis and coat colour. Interferon alpha induces apoptosis, and some data suggest that this is mediated by a c-myc dependent pathway.75 Insulin-like growth factor binding proteins are known to inhibit cell growth and promote apoptosis by binding the growth factors and thus blocking them from interacting with their receptors.76 This role is consistent with the results here.

Candidate oncogenes are T complex protein 1 alpha and stathmin. TCP-1, a chaperonin involved in cytoskeletal protein folding, has been shown to be upregulated in colon cancer and to be a member of the complex that includes von Hippel-Lindau (VHL) protein.77,78 Stathmin overexpression has been reported to correlate with proliferation in ovarian cancer.79 Overall, the candidates observed in this data set are involved in cell cycling or apoptosis and seem to be clustered near the position of D4Mit77.
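The direction-of-expression reasoning applied to each locus above can be illustrated with a minimal Python sketch. It assumes a simplified rule: a differentially expressed gene with higher expression in the susceptible strain is treated as a candidate oncogene, and one with higher expression in the resistant strain as a candidate tumour suppressor. The `Locus` class and `classify` function are hypothetical names introduced only for illustration and are not the authors' software; the strain assignments for Par2 and Par4 are those stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Locus:
    """A QTL with its susceptible and resistant strains (hypothetical helper)."""
    name: str
    susceptible: str
    resistant: str


def classify(locus: Locus, higher_in: str) -> Optional[str]:
    """Return the role implied by the strain showing higher expression.

    Simplified rule assumed here: higher expression in the susceptible
    strain suggests an oncogene, higher expression in the resistant
    strain suggests a tumour suppressor. Other strains give no call.
    """
    if higher_in == locus.susceptible:
        return "candidate oncogene"
    if higher_in == locus.resistant:
        return "candidate tumour suppressor"
    return None


# Strain assignments as stated in the text for these two loci.
par2 = Locus("Par2", susceptible="A/J", resistant="BALB/cJ")
par4 = Locus("Par4", susceptible="BALB/cJ", resistant="A/J")

if __name__ == "__main__":
    # Serotonin receptor: higher expression in A/J within Par2.
    print(classify(par2, "A/J"))
    # p16INK4a: higher expression in A/J within Par4.
    print(classify(par4, "A/J"))
```

In practice the implied role would still have to be compared with what is reported for the gene in the literature, as was done for each candidate above; the sketch covers only the direction-of-expression step.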


The most efficient and effective way to look for differences in expression of the genes in a given QTL is through microarray technology. It can be speculated that at least some causative genes will be differentially expressed, so candidates can be sought among the genes found to be differentially expressed. This dataset illustrates that candidates come from several regulatory areas including apoptosis, cytoskeletal organisation, chromatin modelling, and cell cycle. As such, susceptibility and resistance to cancer involve a constellation of factors, some interacting, and failure of any one can lead to carcinogenesis. This observation may have an impact on both treatment and further research. It may be that the next stage in personalised medicine for cancer will initially involve a tumour work up to establish whether the primary aberration involves the cell cycle, apoptosis, or the cytoskeleton, then which gene is affected, and finally how best to intervene. In other words, start with functional assays, then follow up with genotyping and intervention strategies. The concentration, diversity, and multiplicity of genes within these QTLs that other studies have reported as aberrant in cancerous samples suggest that fine genetic mapping may be a problematic approach. Fine mapping may be most appropriate for conditions with a strong suggestion of monogenicity.

Our statistical threshold was selected to reduce the false negative rate, since omission of a candidate is more problematic than carrying forward non-candidates. Just as methods that reduce the false positive rate, such as Bonferroni correction, increase the false negative rate, our choice to reduce the false negative rate necessarily increases the false positive rate. We managed false positives in two ways. First, we focused the analysis on genetically determined QTLs. Second, we drew on published reports to apply what has already been observed about specific genes in cancer. This approach has the potential pitfall that our analysis may be restricted by current thinking. However, we have noted all the genes which have been observed to be associated with cancer, even when expression here appears discordant with reported observations. Some of these, such as ATFa modulator in Pas1 and PDCD4 in Pas3, would be interesting to follow up.
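The trade-off between a lenient cutoff and a Bonferroni corrected one can be made concrete with a short Python sketch. It rests on assumptions that are not taken from this study: that Z scores follow a standard normal distribution when there is no true difference, that tests are independent, and that the nominal significance level is 0.05. The function names are hypothetical, and the actual Z cutoff used here is not reproduced.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def two_sided_z(alpha: float) -> float:
    """Z cutoff whose two-sided tail probability equals alpha."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


def bonferroni_z(alpha: float, n_tests: int) -> float:
    """Z cutoff after Bonferroni correction (per-test level alpha / n_tests)."""
    return two_sided_z(alpha / n_tests)


def expected_false_positives(z_cutoff: float, n_tests: int) -> float:
    """Expected number of unchanged genes with |Z| >= z_cutoff by chance."""
    tail = 2.0 * (1.0 - STD_NORMAL.cdf(z_cutoff))
    return n_tests * tail


if __name__ == "__main__":
    n_genes = 36_000  # approximate number of genes and ESTs on the arrays
    alpha = 0.05      # hypothetical level, NOT the threshold used in this study

    lenient = two_sided_z(alpha)
    strict = bonferroni_z(alpha, n_genes)

    print(f"uncorrected cutoff |Z| >= {lenient:.2f}, "
          f"expected chance hits: {expected_false_positives(lenient, n_genes):.0f}")
    print(f"Bonferroni cutoff  |Z| >= {strict:.2f}, "
          f"expected chance hits: {expected_false_positives(strict, n_genes):.2f}")
```

Under these assumptions the corrected cutoff is far stricter, so true but modest expression differences are more likely to be missed, which is the false negative cost described above.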

The apparent concentration of highly cancer associated genes within these QTLs, regardless of their concordant or discordant expression in this study, suggests a model for carcinogenesis in which genomic position plays a major role. Suppose the genomic positions of these QTLs were used during mitosis for both DNA replication and another DNA involving process, such as transcription or segregation, in a fashion that permitted local replication errors at a rate much higher than that for more distant positions. This would produce a situation in which mutations are regionally concentrated at these foci and thus could affect any number of proximate genes. Any such region containing genes involved in sensitive, necessary pathways would be associated with cancer, but analysis of any given tumour would often show several mutated genes. Carcinogenesis per se would still involve failure of cell cycle regulation or apoptosis, but the mechanism for mutation would involve interference of one DNA involving process, such as transcription or segregation, with DNA replication. These candidate genes are currently being verified using RT-PCR or northern analysis and the results will be reported in the near future.


In summary, we have identified a number of candidates for lung cancer susceptibility based on their concordant allele specific differential gene expression. In addition, this study shows the usefulness of genome wide expression profiling using microarrays in conjunction with QTL mapping in the identification of genes responsible for genetic traits. We believe that some of the identified candidates are functionally relevant and that their expression is concordant with genetic studies, and thus that they should be selected for further examination. Accordingly, a series of experiments based on the information from the present study is under way to determine allelic variations and allele specific functional differences in lung tumorigenesis.


We are grateful to F Wright, A de la Chapelle, and G Stoner for their critical reading of this manuscript and helpful discussions. This work was supported by NIH grants R01CA58554 (MY), R01CA78797 (YW), and P30CA16058.



  • * These two authors contributed equally to this work.