Cohesin: genomic insights into controlling gene transcription and development

https://doi.org/10.1016/j.gde.2011.01.018

Over the past decade it has emerged that the cohesin protein complex, which functions in sister chromatid cohesion, chromosome segregation, and DNA repair, also regulates gene expression and development. Even minor changes in cohesin activity alter several aspects of development. Genome-wide analysis indicates that cohesin directly regulates transcription of genes involved in cell proliferation, pluripotency, and differentiation through multiple mechanisms. These mechanisms are poorly understood, but involve both partial gene repression in concert with Polycomb group proteins, and facilitating long-range looping, both between enhancers and promoters, and between CTCF protein binding sites.

Introduction

Structural Maintenance of Chromosome (SMC) protein complexes play essential roles in chromosome mechanics, including chromosome segregation, condensation, and DNA repair, in both prokaryotes and eukaryotes [1]. Eukaryotes have three SMC complexes: condensin, Smc5/6, and cohesin. Cohesin mediates sister chromatid cohesion to ensure proper chromosome segregation in mitosis and meiosis, and condensin is required for chromosome condensation. Both Smc5/Smc6 and cohesin play roles in DNA repair. Cohesin and specialized condensin complexes also regulate gene expression, with consequences for development [2]. Here we review the roles of cohesin in gene regulation and development in higher organisms, and how recent genome-wide analyses have shed light on possible mechanisms.

The cohesin complex consists of the Smc1, Smc3, Rad21, and Stromalin/Stag1/2 proteins, which form a ring-like structure (Figure 1) [3]. The Nipped-B–Mau-2 complex and ATPase activities of the Smc1 and Smc3 head domains are required for cohesin to bind to chromosomes [3]. There is evidence that cohesin encircles DNA topologically, but uncertainty about how cohesin actually holds sister chromatids together [3, 4].

The first observations linking cohesin to gene regulation during development were the recovery of Nipped-B mutations in a genetic screen for factors that facilitate activation of the Drosophila cut and Ultrabithorax homeobox genes by distant transcriptional enhancers [5], and identification of the human ortholog, Nipped-B-Like (NIPBL), as the gene mutated in many cases of Cornelia de Lange syndrome (CdLS) [6, 7]. CdLS displays diverse structural and intellectual deficits, including slow growth, upper limb and organ dysmorphologies, distinctive facial features, mental retardation, and autism spectrum disorders [8].

CdLS is caused by heterozygous loss-of-function NIPBL mutations that reduce expression by at most 30%. As little as a 15% decrease in NIPBL expression is sufficient to cause CdLS [9]. Heterozygous Drosophila Nipped-B null mutations reduce Nipped-B mRNA levels by less than 30%, and decrease expression of mutant and wild-type cut alleles in the developing wing without causing cohesion or chromosome segregation defects in mitosis or meiosis [10, 11]. Heterozygous Nipbl mutant mice show a 30% decrease in Nipbl mRNA and display several developmental defects overlapping those seen in CdLS, and many changes in gene expression [12]. Some 75% of Nipbl(+/−) mice perish perinatally, but there are no effects on cohesion or chromosome segregation.

The lack of overt defects in chromatid cohesion or chromosome segregation with reduction of Nipped-B/NIPBL activity in multiple organisms argues that the developmental deficits derive primarily from altered gene expression. More evidence favoring this idea came from experiments in which cohesin function was severely disrupted in the non-dividing mushroom body γ neuron in Drosophila, which blocked axon pruning by reducing expression of the ecdysone receptor gene (EcR) [13, 14].

It is unlikely that Nipped-B/NIPBL has developmental functions separate from cohesin. Genome-wide analysis reveals that knockdown of Nipped-B and cohesin in Drosophila cells alters expression of the same several hundred genes [15]. Moreover, 5% of CdLS cases are caused by mutations affecting the SMC1A cohesin subunit, and one by an SMC3 mutation [16, 17]. These cases are milder than those caused by NIPBL mutations, showing primarily intellectual deficits and the characteristic facial features. All the SMC1A mutations and the SMC3 mutation maintain the open reading frame, and cause amino acid substitutions and/or small deletions in the protein. SMC1A is X-linked, and both hemizygous male and heterozygous female individuals have been identified, indicating that the mutant proteins are functional and dominantly interfere with development. Like heterozygous NIPBL mutations [7, 18], the SMC1A mutations do not affect chromatid cohesion or chromosome segregation [19].

The finding that mild disruption of Nipped-B/NIPBL and cohesin activity alters gene expression and development raises the question of how these disruptions affect cohesin binding to chromosomes. One possibility is that only low cohesin binding is needed for cohesion and segregation, and higher levels are needed for gene regulation. Consistent with this idea, knockdown of Nipped-B or cohesin by 80% in Drosophila cells causes no significant cohesion or segregation defects but alters expression of some genes several-fold [15], and systematic reduction of cohesin in yeast reveals that 13% of normal levels is sufficient for cohesion and chromosome segregation, although chromosome condensation and DNA repair are affected [20].

In vivo fluorescence recovery after photobleaching (FRAP) experiments in Drosophila salivary glands reveal that cohesin binds chromosomes in two modes: a weak binding mode with a chromosomal half-life of some 20 s, and a stable mode with a half-life of 6 min or so [21]. A heterozygous null Nipped-B mutation reduces the fraction of stable-binding cohesin by a third, suggesting that the stable binding mode is crucial for gene regulation. FRAP experiments, however, do not address whether or not stable binding, which is postulated to be topological, is affected more at some genes than others.
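
The two binding modes can be illustrated with a simple two-population recovery curve. The following Python sketch is not the analysis used in [21]; it assumes that each mode dissociates with first-order kinetics, that recovery is limited only by dissociation, and that all bleached cohesin is chromosome-bound. The half-lives (20 s and 360 s) come from the text, whereas the wild-type stable fraction of 0.3 is a hypothetical value chosen only for illustration, as are the function names.

```python
import math


def rate_from_half_life(half_life_s):
    """Convert a half-life in seconds to a first-order rate constant (1/s)."""
    return math.log(2) / half_life_s


def frap_recovery(t_s, stable_fraction, weak_half_life_s=20.0, stable_half_life_s=360.0):
    """Fraction of bleached signal recovered at time t_s (seconds).

    Assumes two independent bound populations, each exchanging with
    first-order kinetics; diffusion and any unbound pool are ignored.
    """
    k_weak = rate_from_half_life(weak_half_life_s)
    k_stable = rate_from_half_life(stable_half_life_s)
    weak_fraction = 1.0 - stable_fraction
    return (weak_fraction * (1.0 - math.exp(-k_weak * t_s))
            + stable_fraction * (1.0 - math.exp(-k_stable * t_s)))


# Hypothetical stable fractions: 0.3 in wild type (an assumption, not a
# measured value), reduced by a third in the heterozygous Nipped-B mutant.
WILD_TYPE_STABLE = 0.3
MUTANT_STABLE = WILD_TYPE_STABLE * (2.0 / 3.0)

if __name__ == "__main__":
    for t in (0, 20, 60, 180, 360):
        wt = frap_recovery(t, WILD_TYPE_STABLE)
        mut = frap_recovery(t, MUTANT_STABLE)
        print(f"t = {t:3d} s  wild type: {wt:.2f}  mutant: {mut:.2f}")
```

Under these assumptions, a smaller stable fraction gives faster overall recovery at intermediate times, which is how a shift between the two modes would show up in a curve averaged over the whole nucleus.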

Drosophila Nipped-B and the yeast Scc2 ortholog both show chromosome-binding dynamics nearly identical to those of cohesin, suggesting that a significant fraction interacts tightly with cohesin on chromosomes, although Nipped-B's stoichiometry to cohesin varies dramatically between tissues [21, 22]. More precise knowledge of how cohesin binding is affected by reduced Nipped-B/NIPBL levels and SMC1A missense mutations is needed to fully understand how cohesin regulates transcription.

Cohesin binding has been mapped genome-wide by ChIP-chip, DamID, or ChIP-seq in Drosophila cell lines [23], Drosophila salivary gland [24], HeLa cells [25], human lymphocytes [26], MCF-7 and HepG2 tumor cell lines [27] and mouse embryonic stem cells [28], and in part of the genome of mouse pre-B cells and thymocytes [29]. In Drosophila, Nipped-B and cohesin completely co-localize and bind preferentially to active genes, with peaks near the transcription start sites [23]. Cohesin prefers active genes and peaks near transcription start sites in mammalian cells, but also co-localizes with a large fraction of the CCCTC-binding factor (CTCF) binding sites [25•, 26•, 27•, 28•, 29•, 30, 31]. CTCF has multiple functions in gene regulation, but its insulator role in blocking enhancer–promoter interactions is the best characterized [32].

Remarkably, Nipbl co-localizes with cohesin at transcription start sites and transcriptional enhancers in mouse embryonic stem cells, but not at CTCF sites [28]. Given that Nipbl is required to load cohesin onto chromosomes, and barring potential issues such as weak crosslinking or epitope masking of Nipbl at CTCF sites, this implies either that cohesin reaches CTCF sites by sliding from Nipbl sites, or that cohesin binds differently at CTCF sites. CTCF knockdown does not cause cohesion defects or reduce overall cohesin binding, indicating that CTCF sites are not crucial for cohesion [25•, 29•]. In Drosophila, cohesin does not co-localize with CTCF, but cohesin (and Nipped-B) co-localize with a fraction of sites binding the CP190 co-insulator protein [33]. Thus a relationship between cohesin and insulators may be evolutionarily conserved, even if localization with CTCF is not.

In Drosophila, cohesin and Nipped-B are excluded from regions marked by the histone H3 lysine 27 trimethylation modification made by the Enhancer of zeste [E(z)] Polycomb group (PcG) silencing protein [23]. Most PcG-targeted genes are silenced, and genes that are silenced in one cell line can bind Nipped-B and cohesin in other cells in which they are transcribed. As described below, there are important rare exceptions in which genes bind cohesin and PcG proteins simultaneously, and are hypersensitive to Nipped-B and cohesin levels [15].

Hundreds of genes are altered in expression in lymphocytes derived from CdLS individuals with NIPBL or SMC1A mutations [26], in Nipbl(+/−) mouse tissues [12], in Drosophila cells [15] and mouse embryonic stem cells [28] when Nipped-B/NIPBL or cohesin are knocked down by RNAi, and in Drosophila salivary glands when cohesin is proteolytically removed from chromosomes [24]. Roughly equal numbers of genes increase and decrease in expression, and most effects are less than 2-fold. In Drosophila cells and salivary glands, however, large effects on the order of 100-fold also occur.

Comparison of genome-wide binding of cohesin to the genome-wide effects of reducing Nipped-B/NIPBL or cohesin activity on transcript levels reveals that the genes that change in expression with altered cohesin activity are highly enriched for cohesin-binding genes [15•, 24•, 26•, 28•]. In Drosophila cells and mouse embryonic stem cells, there is a strong correlation between the effects of Nipped-B/Nipbl and cohesin knockdown. This is compelling evidence that Nipped-B/NIPBL regulates genes through control of cohesin chromosome binding, and that cohesin directly regulates gene transcription. Potential indirect effects, however, make it difficult to be sure how much of the effect on any individual gene is direct.
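
The kind of enrichment described above can be made concrete with a one-sided hypergeometric test. The Python sketch below is only an illustration and is not claimed to be the statistical method used in the cited studies; the function names and all gene counts are hypothetical. It asks how likely it would be to see at least the observed overlap between cohesin-binding genes and genes that change in expression if the two sets were independent.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, binding_genes, changed_genes, changed_and_binding):
    """One-sided hypergeometric p-value, P(overlap >= changed_and_binding).

    total_genes: all genes assayed
    binding_genes: genes that bind cohesin
    changed_genes: genes whose expression changes on knockdown
    changed_and_binding: genes in both sets (the observed overlap)
    """
    upper = min(binding_genes, changed_genes)
    tail = sum(
        comb(binding_genes, k) * comb(total_genes - binding_genes, changed_genes - k)
        for k in range(changed_and_binding, upper + 1)
    )
    return tail / comb(total_genes, changed_genes)


def fold_enrichment(total_genes, binding_genes, changed_genes, changed_and_binding):
    """Observed fraction of changed genes that bind cohesin, over the genome-wide fraction."""
    observed = changed_and_binding / changed_genes
    expected = binding_genes / total_genes
    return observed / expected


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show how the functions are called.
    counts = dict(total_genes=1000, binding_genes=100,
                  changed_genes=50, changed_and_binding=20)
    print("fold enrichment:", fold_enrichment(**counts))
    print("p-value:", hypergeom_enrichment_p(**counts))
```

A test of this kind addresses only co-occurrence of binding and expression change across the gene set; as the text notes, it cannot by itself separate direct from indirect effects on any individual gene.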

Recent experiments provide strong evidence for direct regulation of specific genes by cohesin in Drosophila salivary glands [24]. By replacing the native Rad21 subunit with a form that can be cleaved by TEV protease to remove cohesin from chromosomes, and using a temperature-sensitive system to induce TEV protease, decreases in expression of some genes in the ecdysone steroid signaling pathway were detected within 4 h, and in ecdysone-induced chromosome puffing within 2 h. By immunostaining, the EcR activator remains bound to the Eip74EF gene while it decreases 10-fold in expression, although EcR expression is also reduced.

Cohesin preferentially regulates genes important for development and cell proliferation (Figure 1). The top ontology categories for genes that increase in expression with cohesin or Nipped-B/Nipbl knockdown involve development in Drosophila and mouse embryonic stem cells [15•, 28•]. The top categories for genes that increase in expression with proteolysis of Rad21 in Drosophila salivary glands are metabolic processes, followed by development, but in these experiments, cohesin removal was nearly complete, triggering acute changes in cellular physiology that probably affect metabolism [24].

As described above, cohesin facilitates expression of genes in the ecdysone steroid hormone signaling pathway in the salivary gland and the mushroom body γ neuron (Figure 1) [13, 14, 24•]. Ecdysone is a key regulator of morphogenesis and molting in Drosophila, raising the possibility that cohesin might also play a vital role in these developmental processes. Cohesin also facilitates steroid hormone signaling in human MCF-7 breast cancer cells, where it co-localizes with the estrogen receptor on target genes (Figure 1) [27]. Cohesin knockdown decreases re-entry of these cells into the cell cycle in response to estrogen treatment, indicating that cohesin regulates estrogen function. The finding that cohesin influences both ecdysone and estrogen signaling raises the question of whether it also regulates other steroid hormone pathways.

A remarkably consistent finding is that cohesin binds and facilitates expression of the myc gene in all species examined, including Drosophila, zebrafish, mouse, and human (Figure 1) [12•, 15•, 26•, 28•, 30, 31, 34•]. Myc is a key regulator of protein synthesis and cell proliferation, and this may explain why protein translation is the top ontology category for genes that decrease in expression in Drosophila cells when Nipped-B or cohesin is knocked down [15]. Many genes directly activated by Myc that do not bind cohesin also decrease in expression with Nipped-B or cohesin knockdown in Drosophila cells, and the list of affected genes is nearly identical to that determined for myc mutant larvae [34•, 35].

In addition to myc, Nipbl and cohesin also promote expression of genes required for pluripotency in mouse embryonic stem cells (Figure 1) [28]. Nipbl, cohesin, and the Mediator transcriptional coactivator complex were identified in an RNAi screen for factors required for stem cell maintenance, and found to directly facilitate expression of the Oct4 and Nanog pluripotency genes. Upregulation of myc and pluripotency genes by cohesin, combined with downregulation of several differentiation genes, makes it tempting to speculate that cohesin provides input into the decision between proliferation and differentiation.

The mechanisms by which Nipped-B/NIPBL and cohesin regulate gene expression are not well understood, but there is growing evidence that cohesin facilitates long-distance DNA looping over several kilobases (Figure 2). This idea was put forth to explain genetic effects of Nipped-B mutations on Drosophila cut expression in the developing wing margin [5], but ironically, new evidence described below suggests that partial repression by cohesin might also be involved. Direct evidence for a role for cohesin in looping comes from the finding that cohesin knockdown reduces long-range interactions between CTCF sites in the IFNG, apolipoprotein, Igf2-H19, and β globin genes measured by chromosome conformation capture (3C) (Figure 2) [36•, 37•, 38•, 39•]. In most cases, the reduced long-range interactions are accompanied by modest changes in gene expression. The effects on looping range from modest reduction to complete loss. Given the experimental variables, which include potential effects of reduced protein binding on formaldehyde crosslinking in 3C experiments, and incomplete cohesin knockdown, these results must be interpreted cautiously. They nonetheless suggest that cohesin performs a secondary stabilization role in some cases and is essential for looping in others.

Cohesin binding predicts enhancer–promoter loops in several active genes in mouse embryonic stem cells (Figure 2) [28]. In these cases there are overlapping peaks of the Mediator coactivator, Nipbl and cohesin on both the enhancer and promoter, and an interaction between them is detected by 3C. The cohesin/Mediator peaks and loops are absent in embryonic fibroblasts in which these genes are inactive. Cohesin knockdown reduces the enhancer–promoter interaction in the Nanog gene in stem cells, correlating with a substantial decrease in Nanog expression.

The mechanisms by which cohesin facilitates the long-range interactions between CTCF sites and between enhancers and promoters are unknown, but it has been speculated that cohesin might mediate ‘intrachromosomal cohesion’ to stabilize loops (Figure 2) [40]. The finding that Nipbl is not present at the cohesin/CTCF peaks in mouse embryonic stem cells [28], however, suggests that the mechanism may be different for CTCF and enhancer–promoter loops.

In a Drosophila cell line derived from central nervous system, genes that increase in expression with Nipped-B or cohesin knockdown are more enriched for cohesin-binding than the genes that decrease [15]. Thus, cohesin has more direct repressive than activating effects. While most effects are modest, some genes increase dramatically in expression with cohesin knockdown, up to a hundred-fold. Most strongly affected genes, which include the Enhancer of split [E(spl)-C] and invected-engrailed complexes, are rare exceptions where a cohesin-binding domain overlaps a region targeted by Polycomb group (PcG) silencing proteins (Figure 3). In contrast to genes that are targeted only by PcG proteins, the co-targeted genes are expressed at modest to moderate levels. Knockdown of Polycomb also increases E(spl)-C expression, indicating that PcG proteins and cohesin are both needed to restrain transcription. It is unlikely that cohesin and PcG proteins target the genes in different cells in the population because genes that are targeted only by PcG proteins are unaffected by cohesin knockdown, and the E(spl)-C is targeted only by cohesin in another cell line and does not respond to cohesin knockdown. It is unknown if this partial repression affects activator binding, or another step in transcription.

Some cohesin-PcG co-targeted genes show a biphasic response to cohesin levels: when cohesin is reduced by only 30%, E(spl)-C expression decreases, but when cohesin is reduced by 50–80%, expression increases [15]. This led to the hypothesis that cohesin-PcG co-targeted domains have a unique structure that depends on a balance between cohesin and PcG proteins. With a slight cohesin reduction, PcG silencing activity may become stronger, decreasing transcription, but when cohesin is strongly reduced, the structure is lost, leading to unrestrained transcription.

Remarkably, genetic effects reminiscent of this biphasic effect are seen in vivo, with partial Nipped-B and cohesin reduction having opposite effects on expression of cut in the wing margin, and on an E(spl)-C dependent mutant eye phenotype [10, 15•, 41]. Even more remarkably, cut is active and bound by cohesin over a domain extending from the wing margin enhancer to the end of the transcribed region in one Drosophila cell line, but in contrast to the developing wing margin, Nipped-B or cohesin knockdown has no effect on cut transcription in these cells [15]. Thus cohesin binding alone is insufficient for cut to be sensitive to cohesin dosage.

Genetic experiments suggest that PcG proteins regulate but do not silence cut in the developing wing margin [42], and PcG proteins fully silence cut in some Drosophila cell lines [23]. Combined with the opposite in vivo effects of Nipped-B and cohesin mutations, these findings raise the possibility that cohesin acts in concert with PcG proteins to restrain, but not silence, cut expression in the wing margin. This does not exclude the possibility that cohesin also facilitates long-range activation by the wing margin enhancer.

Nearly all genes co-targeted by cohesin and PcG proteins in Drosophila cells are bivalent, having both histone H3 lysine 4 trimethylation (H3K4me3) characteristic of active genes, and the histone H3 lysine 27 trimethylation (H3K27me3) modification made by the E(z) PcG protein [15]. Bivalent genes are common in embryonic stem cells, and like the Drosophila E(spl)-C and engrailed genes, many encode transcription factors, and are expressed at low levels [43]. Current thought is that they represent an uncommitted multipotent state, but it is interesting to speculate that the bivalent state can also ensure that a gene is expressed at an appropriate level that is not too low or too high. Notably, more than half of the 200 genes that increase the most in expression with cohesin knockdown in mouse embryonic stem cells, many of which bind cohesin [28], are bivalent [44]. Thus partial repression of select PcG-targeted genes may also be an important mechanism for gene regulation by cohesin in embryonic stem cells, although it remains possible some of these effects reflect reduced pluripotency.

Key questions

The evidence summarized above raises many intriguing questions: What are the molecular mechanisms by which cohesin contributes to repression of selected PcG-targeted genes, and how is it determined that a gene is targeted only by PcG proteins in one cell type, and by both cohesin and PcG proteins in another? What are the molecular mechanisms by which cohesin facilitates long-range looping, and do they differ for enhancer–promoter loops and loops between CTCF sites? Is stable topologically bound

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

Work on the role of cohesin in gene expression and development in the author's laboratory is supported by grants from the NIH (R01 GM055683, P01 HD052860). The author thanks former and current members of his laboratory, including Robert Rollins, Patrick Morcillo, Ziva Misulovin, Maria Gause, Cheri Schaaf, and Avery Fay, and many colleagues, including Ian Krantz, Arthur Lander, Anne Calof, Antonio Musio, Julia Horsfield, Tom Strachan, Matthias Merkenschlager, Jennifer Gerton, Andrea Pauli, Kim

References (44)

  • K. Nasmyth et al.

    Cohesin: its roles and mechanisms

    Annu Rev Genet

    (2009)
  • R.A. Rollins et al.

    Nipped-B, a Drosophila homologue of chromosomal adherins, participates in activation by remote enhancers in the cut and Ultrabithorax genes

    Genetics

    (1999)
  • I.D. Krantz et al.

    Cornelia de Lange syndrome is caused by mutations in NIPBL, the human homolog of Drosophila melanogaster Nipped-B

    Nat Genet

    (2004)
  • E.T. Tonkin et al.

    NIPBL, encoding a homolog of fungal Scc2-type sister chromatid cohesion proteins and fly Nipped-B, is mutated in Cornelia de Lange syndrome

    Nat Genet

    (2004)
  • D. Dorsett et al.

    On the molecular etiology of Cornelia de Lange syndrome

    Ann N Y Acad Sci

    (2009)
  • G. Borck et al.

    Father-to-daughter transmission of Cornelia de Lange syndrome caused by a mutation in the 5’ untranslated region of the NIPBL gene

    Hum Mutat

    (2006)
  • R.A. Rollins et al.

    Drosophila Nipped-B protein supports sister chromatid cohesion and opposes the stromalin/Scc3 cohesion factor to facilitate long-range activation of the cut gene

    Mol Cell Biol

    (2004)
  • M. Gause et al.

    Functional links between Drosophila Nipped-B and cohesin in somatic and meiotic cells

    Chromosoma

    (2008)
  • S. Kawauchi et al.

    Multiple organ system defects and transcriptional dysregulation in the Nipbl(+/-) mouse, a model of Cornelia de Lange Syndrome

    PLoS Genet

    (2009)
  • A. Pauli et al.

    Cell-type-specific TEV protease cleavage reveals cohesin functions in Drosophila neurons

    Dev Cell

    (2008)
  • A. Musio et al.

    X-linked Cornelia de Lange syndrome owing to SMC1L1 mutations

    Nat Genet

    (2006)
  • M.A. Deardorff et al.

    Mutations in cohesin complex members SMC3 and SMC1A cause a mild variant of Cornelia de Lange syndrome with predominant mental retardation

    Am J Hum Genet

    (2007)