CpG islands in vertebrate genomes☆

https://doi.org/10.1016/0022-2836(87)90689-9

Abstract

Although vertebrate DNA is generally depleted in the dinucleotide CpG, it has recently been shown that some vertebrate genes contain CpG islands, regions of DNA with a high G + C content and a high frequency of CpG dinucleotides relative to the bulk genome. In this study, a large number of sequences of vertebrate genes were screened for the presence of CpG islands. Each CpG island was then analysed in terms of length, nucleotide composition, frequency of CpG dinucleotides, and location relative to the transcription unit of the associated gene. CpG islands were associated with the 5′ ends of all housekeeping genes and many tissue-specific genes, and with the 3′ ends of some tissue-specific genes. A few genes contained both 5′ and 3′ CpG islands, separated by several thousand base-pairs of CpG-depleted DNA. The 5′ CpG islands extended through 5′-flanking DNA, exons and introns, whereas most of the 3′ CpG islands appeared to be associated with exons. CpG islands were generally found in the same position relative to the transcription unit of equivalent genes in different species, with some notable exceptions.
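
As an illustration of the two quantities named above (G + C content and CpG frequency), the following minimal Python sketch computes both over sliding windows. It is not the authors' screening procedure: the function names, the window size of 200 and the step of 50 are assumptions chosen for illustration, and the abstract states no thresholds. The observed/expected ratio used here is the common normalisation (number of CpG × window length) / (number of C × number of G).

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # ratio is undefined without both C and G; report 0.0
    return seq.count("CG") * len(seq) / (c * g)


def scan_windows(seq: str, window: int = 200, step: int = 50):
    """Yield (start, G+C fraction, CpG obs/exp) for each full window.

    window and step are illustrative defaults, not values from the source.
    """
    for start in range(0, max(len(seq) - window + 1, 0), step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk), cpg_obs_exp(chunk)
```

Windows with both a high G + C fraction and a high observed/expected ratio would be candidate islands; what counts as "high" is left to the caller because the abstract does not specify cut-offs.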

The locations of GC boxes, composed of the sequence GGGCGG or its reverse complement CCGCCC, were investigated relative to the location of CpG islands. GC boxes were found to be rare in CpG-depleted DNA and plentiful in CpG islands, where they occurred in 3′ CpG islands, as well as in 5′ CpG islands associated with tissue-specific and housekeeping genes. GC boxes were located both upstream and downstream from the transcription start site of genes with 5′ CpG islands. Thus, GC boxes appeared to be a feature of CpG islands in general, rather than a feature of the promoter region of housekeeping genes.
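
Locating GC boxes as defined above amounts to a motif search on both strands. The sketch below is one possible way to do it, assuming a plain DNA string as input; the function name and the strand labels are hypothetical. A lookahead pattern is used so that overlapping occurrences are all reported.

```python
import re

# Zero-width lookahead so overlapping matches (e.g. GGGCGGGCGG) are not skipped.
GC_BOX = re.compile(r"(?=(GGGCGG|CCGCCC))")


def find_gc_boxes(seq: str):
    """Return (0-based start, strand) for each GC box in seq.

    '+' marks GGGCGG and '-' marks its reverse complement CCGCCC
    as read on the given strand.
    """
    return [
        (m.start(), "+" if m.group(1) == "GGGCGG" else "-")
        for m in GC_BOX.finditer(seq.upper())
    ]
```

The returned positions can then be compared with island coordinates or with a transcription start site to ask whether a box lies upstream or downstream of it.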

Two theories for the maintenance of a high frequency of CpG dinucleotides in CpG islands were tested: that CpG islands in methylated genomes are maintained, despite a tendency for 5mCpG to mutate by deamination to TpG + CpA, by the structural stability of a high G + C content alone, and that CpG islands associated with exons result from some selective importance of the arginine codon CGX. Neither of these theories could account for the distribution of CpG dinucleotides in the sequences analysed. Possible functions of CpG islands in transcriptional and post-transcriptional regulation of gene expression were discussed, and were related to theories for the maintenance of CpG islands as “methylation-free zones” in germline DNA.
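
One way to picture the arginine-codon idea: if exonic CpGs were retained mainly because of CGX codons, they should fall predominantly at codon positions 1-2 rather than at positions 2-3 or across codon boundaries. The sketch below tallies CpGs by codon phase. It is an illustration only, not the test used in the paper; it assumes the input is a coding sequence that starts in frame, and the function name and phase labels are hypothetical.

```python
def cpg_by_codon_phase(cds: str) -> dict:
    """Count CpG dinucleotides by their position within the reading frame.

    '1-2': C at codon position 1 (the CGX arginine pattern)
    '2-3': C at codon position 2
    '3-1': C at codon position 3, G at position 1 of the next codon
    Assumes cds begins at codon position 1 (an assumption of this sketch).
    """
    cds = cds.upper()
    labels = ("1-2", "2-3", "3-1")
    counts = {label: 0 for label in labels}
    for i in range(len(cds) - 1):
        if cds[i:i + 2] == "CG":
            counts[labels[i % 3]] += 1
    return counts
```

For example, in the in-frame sequence CGA ACG GCC the first CpG is part of an arginine codon (phase 1-2) while the second sits at positions 2-3 of a threonine codon.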


    ☆

    This work was supported by a project grant (860266) from the National Health and Medical Research Council of Australia.
