Long-range regulation at the SOX9 locus in development and disease
C T Gordon1, T Y Tan1,2,3, S Benko4, D FitzPatrick5, S Lyonnet4,6, P G Farlie1

1 Craniofacial Development Laboratory, Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, Australia
2 Department of Paediatrics, University of Melbourne, Parkville, Australia
3 Genetic Health Services Victoria, Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, Australia
4 INSERM U-781, Hôpital Necker-Enfants Malades, Paris, France
5 MRC Human Genetics Unit, Institute of Genetic and Molecular Medicine, Edinburgh, UK
6 Université Paris Descartes et Assistance Publique-Hôpitaux de Paris, Hôpital Necker-Enfants Malades, Paris, France

Correspondence to Dr P Farlie, Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, VIC 3052, Australia; peter.farlie{at}


The involvement of SOX9 in congenital skeletal malformation was demonstrated 15 years ago with the identification of mutations in and around the gene in patients with campomelic dysplasia (CD). Translocations upstream of the coding sequence suggested that altered expression of SOX9 was capable of severely impacting on skeletal development. Subsequent studies in humans and animal models pointed towards a complex regulatory region controlling SOX9 transcription, involving ∼1 Mb of upstream sequence. Recent data indicate that this regulatory domain may extend substantially further, with identification of several disruptions greater than 1 Mb upstream of SOX9 associated with isolated Pierre Robin sequence (PRS), a craniofacial disorder that is frequently a component of CD. The translocation breakpoints upstream of SOX9 can now be clustered into three groups, with a trend towards less severe skeletal phenotypes as the distance of each cluster from SOX9 increases. In this review we discuss how the identification of novel lesions surrounding SOX9 supports the existence of tissue specific enhancers acting over a large distance to regulate expression of the gene during craniofacial development, and we highlight the potential for discovery of additional regulatory elements within the extended SOX9 control region.

SOX proteins are a group of transcription factors that are characterised by the presence of a DNA binding high mobility group (HMG) domain. Members of the family play diverse roles during embryonic development, and several have been associated with human disease, including SRY, SOX2, SOX10 and SOX9.1 SOX9 was first characterised as a positional candidate gene for campomelic dysplasia (CD; OMIM 114290), a syndrome predominantly affecting skeletal and testis development, based on a series of chromosomal translocations upstream of the open reading frame.2 3 Coding sequence mutations in SOX9 in cytogenetically normal CD patients subsequently indicated haploinsufficiency as the most likely cause of the disorder, with most mutations assumed to impair the function of the HMG, dimerisation, or C-terminal transactivation domains.2 3 4 5 6 7 8 9 Studies in mice have confirmed an essential role in chondrogenesis. Sox9 is required at successive stages of cartilage differentiation—both during early mesenchymal condensation and subsequently during expression of cartilage matrix genes.10 11 12 13 In addition to the skeletal system, Sox9 is expressed in several other developing organs,14 15 16 and targeted deletion has revealed that Sox9 is required for differentiation of specific cell types in the heart, central nervous system (CNS), notochord, testis, pancreas, gut and inner ear.17 18 19 20 21 22 23 Sox9 also functions during development of the neural crest,24 25 26 27 a transient and migratory cell population that at cranial levels fills the branchial arches, giving rise to skeletal elements of the face. As for many key developmental genes, the diversity of tissue and stage specific SOX9 functions suggests a requirement for a complex regulatory mechanism, possessing sufficient flexibility to accurately control expression in cell types as disparate as chondrogenic precursors and testicular Sertoli cells.

Characterisation of a range of translocation breakpoints throughout a ∼1 Mb region upstream of SOX9 has previously hinted at the complexity of the genomic environment controlling SOX9 expression. In CD patients, the eponymous feature of the disorder, campomelia (bowing of the long bones), varies in severity depending on the distance of the chromosomal lesion from the SOX9 open reading frame. For the most distant lesions, this feature is absent, and the disorder is described as acampomelic campomelic dysplasia (ACD). It is thought that in these cases, a larger proportion of the SOX9 control region remains intact, allowing some aspects of skeletal development to occur normally. Recent data suggest that the regulatory domain controlling tissue specific expression extends over an even greater distance, with the identification of a number of translocations and microdeletions further than 1 Mb upstream of SOX9 in patients with isolated Pierre Robin sequence (PRS; OMIM 261800), a disorder affecting the craniofacial skeleton.28 29 Lesions have also been described ∼1.3–1.5 Mb downstream of SOX9,29 30 suggesting the total genomic domain regulating SOX9 expression may span as much as 3 Mb. As a result of these findings, a stronger correlation now appears to be emerging between the distance of the translocation breakpoint from SOX9 and the overall severity of the phenotype. In this review, we discuss the spectrum of lesions surrounding SOX9, and the evidence suggesting SOX9 expression is regulated by a number of tissue specific enhancers during development.

Long-range regulatory elements

The use of multiple, discrete regulatory elements distributed over a large range appears to be a means to achieve tissue specific and stage specific transcriptional control for a number of genes. The mechanisms by which distant elements interact with the proximal promoter of a target gene remain poorly defined. The identification of many putative regulatory elements in the human genome has been facilitated by comparison of non-coding sequence with that of genomes from distantly related vertebrates such as pufferfish or chicken.31 The assumption underlying such comparisons is that sequence conservation throughout evolution is a strong indicator of essential function. An enrichment for conserved non-coding elements (CNEs) has been demonstrated within “gene deserts” near genes known or suspected to be important developmental regulators,32 implying that CNEs might be required for tight control of these genes during development. Large scale screening of CNEs driving reporter gene expression in transgenic mice has demonstrated that a high proportion of predicted elements can indeed act as tissue specific enhancers.33 34

Some characteristics of the distribution of cis regulatory elements in vertebrate genomes and the roles they play in evolution and development have begun to emerge. Recent work has suggested that small scale sequence changes in regulatory elements have influenced the evolution of limb morphology in different mammalian species.35 36 The maximum distance over which long-range regulatory elements can act is unknown; however, an enhancer for Shh is located at ∼1 Mb upstream, within an intron of a neighbouring gene. Mutations within this element result in polydactylous phenotypes in mice and humans.37 By analysing the distribution of CNEs co-duplicated with paralogous members of gene families, it has been estimated that a proportion of CNEs may function as regulatory elements at a distance >1 Mb from their target gene.38 Reporter assays have indicated that expression of a given gene may be regulated by multiple, independent enhancers active in the same tissue at the same time. For example, several discrete CNEs surrounding Sox10 were each capable of driving reporter expression in ganglia of the peripheral nervous system, in a spatially overlapping fashion.39 40 Data such as these have suggested the existence of redundancy between some enhancers, and in several cases targeted deletion of CNEs in mice has resulted in no obvious phenotype.35 41 On the other hand, as described above for the Shh locus, spontaneous mouse mutants have revealed essential functions for several regulatory domains acting at a distance, particularly during patterning of the early limb bud.42 Also, an expanding number of human diseases may be caused by disruption of long-range regulatory elements.43 In several recent examples, an association between a human developmental disorder and a point mutation in a candidate regulatory element has been demonstrated, with the substitution altering binding of a transcriptional regulator in each case.29 44 45

The SOX9 control region

Campomelic dysplasia and acampomelic campomelic dysplasia

In addition to campomelia, CD patients typically exhibit hypoplastic scapulae and pelvis, hypomineralised thoracic pedicles, a missing pair of ribs, scoliosis, talipes equinovarus (clubfeet), XY sex reversal, and PRS. The majority of mutations underlying CD have been identified within the SOX9 coding sequence, and most of these are predicted to result in loss-of-function alleles, as described above. A range of translocations, inversions and deletions that fall within a ∼1 Mb interval upstream of SOX9 have also been described, resulting in CD of varying severity.2 3 30 46 47 48 49 50 51 52 Leipoldt et al52 have classified the breakpoints into proximal and distal clusters, at 50–375 kb and 789–932 kb upstream of SOX9, respectively (fig 1). There is some correlation between the degree of campomelia and the position of the breakpoint such that most mutations within the proximal cluster result in mild to severe bowing of long bones, while all mutations within the distal cluster result in straight long bones—that is, ACD. Similarly, male sexual development appears less likely to be compromised by translocations in the distal cluster, although one patient displayed XY sex reversal.52 However, thoracic skeletal defects and PRS are still typically observed in ACD. Although the skeletal phenotype in patients with disruptions within the distal cluster appears less severe, severe complications can still occur as a result of respiratory compromise.52 In addition to the upstream breakpoint clusters described above, a breakpoint at ∼1.3 Mb downstream of SOX9 resulted in ACD with male-to-female sex reversal30 (fig 1). It has been suggested that these up- and downstream translocations result in alterations in SOX9 expression, possibly by removing long-range tissue specific regulatory elements. Accordingly, the range of phenotypes observed would result from removal of variable amounts of cis regulatory sequence. 
One further example from the human genetics literature supporting this model is a sex reversed ACD patient with a deletion spanning 0.38–1.87 Mb upstream of SOX9.51 This deletion is important evidence for the loss of upstream regulatory elements because, in translocation cases, it remains possible that SOX9 expression is compromised by the ectopic chromosomal environment rather than by the loss of regulatory sequence, whereas a deletion leaves SOX9 in its native chromosomal context.51

Figure 1

Schematic diagram of the genomic environment surrounding SOX9. The upper part depicts the ∼2 Mb region upstream of SOX9, and the lower part depicts ∼2 Mb downstream. Grey boxes indicate approximate boundaries of translocation breakpoint clusters, and the number of breakpoints within each is listed in brackets. The proximal and distal clusters are described in Leipoldt et al52 and the Pierre Robin sequence (PRS) cluster is based on data from Jakobsen et al28 and Benko et al.29 Note the position of the telomeric breakpoint described by Velagaleti et al30 is approximate only. Black boxes indicate microdeletions identified by Benko et al29 in isolated PRS patients; Sp4, F1 and Sp2 (Sp denotes a de novo case and F denotes inherited). The centromeric limit of the Sp4 deletion is unknown. Enhancers driving reporter expression in transgenic mice are depicted as white ovals. The two mandibular mesenchyme enhancers were identified by Benko et al29; the more centromeric of these contains a point mutation associated with isolated PRS (F2). Note that the complete range of tissues in which these two enhancers are active has not been fully characterised. E1, E3, E7; enhancer elements characterised by Bagheri-Fam et al.58 The E1 enhancer drives expression in node, notochord, gut, bronchial epithelium and pancreas; the E3 enhancer drives expression in migrating cranial neural crest cells, branchial arch mesenchyme and the otocyst; the E7 enhancer drives expression in fore- and midbrain. The testis enhancer was identified by Sekido and Lovell-Badge.59

Tissue specific enhancers

In addition to the analysis of chromosomal rearrangements in humans, several other lines of experimental evidence support long-range regulation at the SOX9 locus. Physical association of the SOX9 locus with genomic regions ∼1.1 Mb upstream or ∼1.3 Mb downstream of SOX9 has been reported,30 suggesting the possibility of juxtaposition of chromatin domains by long-range looping, as has been demonstrated at the β-globin locus.53 54 Analysis of the Odd Sex mouse mutant has provided evidence for long-range activation of Sox9 transcription. In this mouse, insertion of a transgene containing an internal promoter 0.98 Mb upstream of Sox9 results in ectopic upregulation of Sox9 in the embryonic female gonad and female-to-male sex reversal.55 56 In vivo reporter assays have demonstrated the existence of tissue specific regulatory elements in the region surrounding SOX9. Transgenic mice harbouring 350 kb of upstream SOX9 genomic sequence linked to a lacZ reporter partly recapitulated the endogenous Sox9 expression pattern, while this was not the case in mice with only 75 kb of upstream sequence.48 Interestingly, Wunderle et al48 demonstrated that while the larger transgene reported expression in the mandible and maxilla, expression in the mandible (but not the maxilla) was lost from the smaller transgene, suggesting that regional expression of SOX9 within the branchial arches is regulated by separate control elements. Bioinformatic analysis has identified several candidate cis regulatory elements surrounding SOX9 based on conservation across vertebrate species.30 51 52 55 57 Bagheri-Fam et al58 showed that several of these small elements, derived from sequences between 290 kb upstream and 95 kb downstream of SOX9, were individually able to drive reporter gene expression in discrete embryonic sites (fig 1). 
One of these, an enhancer at 251 kb upstream of SOX9 (E3 in fig 1), drove lacZ expression specifically within migratory cranial neural crest cells, branchial arch mesenchyme derived from the crest, and the otic vesicle, while the addition of two further upstream enhancers appeared to increase the level of this tissue specific expression.58 These results are consistent with the idea that some regulatory elements in the SOX9 control region are tissue specific enhancers, while others are perhaps “global” enhancers that act to raise transcription levels in all cell types permissive for SOX9 expression. It has also been demonstrated that the 70 kb region immediately upstream of Sox9 is able to drive reporter expression in both cranial and trunk neural crest cells.59 These data, in combination with the cranially restricted activity of the E3 enhancer,58 suggest that Sox9 expression in early neural crest cells can be regulated by different upstream control elements at different axial levels. Despite the demonstration by Wunderle et al48 that several hundred kilobase pairs of sequence surrounding SOX9 can drive reporter expression in chondrogenic tissue in the limb buds and trunk, the precise location of discrete skeletal enhancers has remained elusive. In vitro assays have suggested that expression within cartilage may be at least partly regulated by the bone morphogenetic protein signalling pathway acting via the SOX9 proximal promoter.60


One putative regulatory domain, labelled SOX9cre1, was identified at ∼1.1 Mb upstream of SOX9 by Velagaleti et al.30 These authors claimed SOX9cre1 was conserved across several vertebrate genomes, and therefore potentially of functional significance. Recently, the characteristics of SOX9cre1 were examined in a range of in vitro assays; SOX9cre1 enhanced reporter activity in a chondrosarcoma cell line, binding of GLI1 (a transcriptional effector of the sonic hedgehog pathway) to a site within SOX9cre1 was demonstrated, and this GLI1 binding site was necessary for GLI1 induced activity of the SOX9cre1 reporter.61 However, we believe that SOX9cre1 is not a highly conserved element. At the UCSC genome browser, the Human Chained Self Alignments track suggests that the SOX9cre1 sequence is probably a pseudogene derived from the gene encoding SERPINE1 mRNA binding protein 1 (SERBP1), on chromosome 1p31.3. In the 44 species Vertebrate Multiz Alignment and Conservation track (March 2006 human genome assembly) at the UCSC genome browser, the SOX9cre1 element only appears to align with primate genomes and a few poorly annotated mammalian genomes. Although a sequence from Xenopus and fish genomes is aligned in this track, it appears to be the SERBP1 orthologue of each genome, or a SERBP1 pseudogene in a location that is not syntenic with SOX9. It remains possible that SOX9cre1 has a role in hedgehog mediated regulation of SOX9 in primates, and its disruption could therefore play a role in human disease. In primates and other vertebrates, the hedgehog pathway may influence SOX9 expression through one or more of the many other putative upstream GLI1 binding sites identified by Bien-Willner et al.61

Gonad specific regulation

Insight into the complexity of the tissue specific transcriptional regulation existing at the SOX9 locus comes from a recent study in which an enhancer identified ∼10 kb upstream of Sox9 was able to drive reporter expression in the male but not the female gonad of transgenic mice (hence recapitulating endogenous Sox9 expression)59 (fig 1). The existence of a testis specific enhancer had been speculated, given the XY sex reversal displayed by some patients with upstream translocations, but had been difficult to pinpoint. The enhancer was shown to bind Sry (the male sex determination factor), Sf1 (a factor required for gonad development), and Sox9 itself, with each protein occupying multiple sites on the enhancer in vivo. The data suggested that Sox9 transcription is dependent on a range of inputs that vary during gonad development, with Sry and Sf1 acting during early phases, followed by maintenance of expression via Sox9 acting in a positive feedback loop.59 Surprisingly, the translocation breakpoint cases associated with male sex reversal in the SOX9 upstream region would leave this testis specific enhancer intact. Indeed, the most distant sex reversed translocations are 789 kb upstream and ∼1.3 Mb downstream.30 52 It is therefore possible that one or more global enhancers positioned farther upstream are required to maintain sufficient levels of transcription in concert with the testis specific element, or that multiple testis specific enhancers distributed far apart act in synergy.

Pierre Robin sequence

One of the most consistent features of both CD and ACD is PRS, a craniofacial anomaly consisting of a hypoplastic or retropositioned mandible, cleft secondary palate and glossoptosis (retropositioned tongue) leading to obstructive apnoea and feeding difficulties. Feeding and respiratory complications frequently continue well into the first year of life, impacting on normal growth and development and requiring significant medical and surgical intervention. Although the causes of PRS are heterogeneous,62 the primary defect in a proportion of PRS cases is likely to be mandibular hypoplasia before palatal closure, leading to physical obstruction of the paired palatal shelves by the posteriorly placed tongue, and subsequent failure of palatal fusion. During embryogenesis, the first branchial arch is populated by mid- and hindbrain derived cranial neural crest cells, from which the skeletal elements of the mandible and maxilla develop. Genetic or environmental insults that impact on neural crest cell induction, migration or proliferation, or on subsequent chondrogenesis within the developing mandible, are therefore potential causes of the PRS phenotype.

A review of the literature and cytogenetic databases suggested that PRS can be associated with chromosomal anomalies at 2q24.1–33.3, 4q32–qter, 11q21–23.1 and 17q21–24.3.63 In most of these cases PRS is associated with other defects. Other than CD/ACD, some syndromes in which PRS is consistently a feature include Stickler syndrome and related collagenopathies, Treacher Collins syndrome, and velocardiofacial syndrome.62 64 65 66 Few reports have linked isolated (that is, non-syndromic) PRS with a specific genetic lesion. An unbalanced t(2;21) translocation with a 2q32.3–33.2 deletion has been identified in a patient with PRS plus mild facial dysmorphology.67 In addition, mutations in the COL2A1, COL11A1 and COL11A2 genes, which encode components of the cartilage matrix, have been associated with both isolated PRS and syndromic collagenopathies.68

Isolated PRS at the SOX9 locus

Several isolated PRS cases have previously been mapped to 17q23–24,63 69 70 suggesting the existence of a PRS locus in that region. Jakobsen et al28 have fine-mapped a balanced translocation to within 5 kb of a site 1.13 Mb upstream of SOX9 on 17q24.3. Recently, translocation breakpoints at 1.23, ∼1.18 and ∼1.07 Mb centromeric to SOX9 that segregate with non-syndromic PRS in three families have been identified.29 These data establish an extraordinary clustering of breakpoints within a ∼160 kb domain (PRS breakpoint cluster in fig 1) in the 1.9 Mb gene desert between SOX9 and the nearest centromeric gene, KCNJ2. Array comparative genomic hybridisation (CGH) has previously been successful for the identification of large deletions in the region upstream of or including SOX9.51 71 Benko et al29 employed high density array CGH to screen non-syndromic PRS patients with normal karyotype for microdeletions in the 17q24.3 region. Of 12 unrelated patients examined, three harboured heterozygous deletions: two greater than 1.38 Mb centromeric to SOX9 (Sp4 and F1 in fig 1; >319 kb and 75 kb deleted, respectively), and one 1.52 Mb telomeric to SOX9 (Sp2 in fig 1; 36 kb deleted). In a fourth patient (of the 12), a non-polymorphic point mutation (F2 in fig 1) was detected by sequencing candidate evolutionarily conserved segments from within the region deleted in the F1 case. Collectively, these findings strongly implicate a disruption of the genomic region surrounding SOX9 as a frequent cause of non-syndromic PRS.

With the hypothesis that the lesions disrupt one or more DNA elements required for long-range regulation of SOX9 transcription in tissues affected in PRS, Benko et al29 tested the in vivo enhancer activity of two separate conserved elements (fig 1) from the region upstream of the clustered breakpoints. One of these CNEs was the wildtype version of the element mutated in family F2. Indeed, each segment tested was active within the mandible of transgenic mice. Chromatin modifications over an extended range surrounding the Sox9 locus, specifically in mandibular cells expressing Sox9 in vivo, were also demonstrated by interphase fluorescence in situ hybridisation (FISH).29 Upon conditional deletion of Sox9 in cranial neural crest cells, mice exhibit cleft palate and a complete failure of chondrogenesis of the craniofacial elements normally derived from the crest.72 Mice heterozygous for Sox9 in all tissues or specifically in the cranial neural crest also develop cleft palate and mandibular hypoplasia, indicating that the neural crest or the cartilage derived from it is sensitive to the dose of Sox9.13 72 73 Also, Sox9 overexpression in the cranial neural crest in chick embryos induces over-production of cartilage in branchial arch derivatives,74 while overexpression of Sox9 in chondrocytes in mice results in a phenotype that includes cleft palate and a short snout,75 arguing that Sox9 expression levels must be tightly regulated for normal mandibular development. In combination with these studies in animal models, the finding of distant mandibular enhancers that are removed from the SOX9 locus in isolated PRS suggests that the human phenotype may be caused by SOX9 haploinsufficiency within chondrogenic mesenchyme of the first branchial arch. 
In vivo and in vitro reporter assays have suggested that the collagen genes COL2A1 and COL11A2, mutations in which are associated with isolated PRS and Stickler syndrome, are direct targets of SOX9 transcriptional activation.76 77 78 79 80 Therefore, mutations at multiple levels of the chondrogenic pathway may result in PRS.

Interestingly, the developmental transcription factor MSX1 demonstrated enhanced binding to the F2 family point mutation, relative to the wildtype sequence, in vitro.29 MSX1 is a good candidate as a regulator of SOX9 expression in the craniofacial primordia, given that cleft palate and mandibular reduction are phenotypes of Msx1-null mice,81 and that MSX1 is mutated in humans with orofacial clefting.82 In mandible derived cells in vitro, the wildtype version of the F2 CNE activated transcription of a reporter, while the mutated CNE did not. Given that Msx1 typically acts as a repressor of transcription in vitro, these results raise the possibility that PRS in the affected family is due to increased repression of SOX9 by MSX1. In many embryonic contexts, Msx1 is expressed in a similar fashion to its paralogue, Msx2, and each can bind the same DNA sequence.83 84 Notably, overexpression experiments have suggested an antagonistic relationship between Msx2 and Sox9 during mandibular chondrogenesis, whereby Msx2 regionally represses cartilage development.85 86 Also, Msx factors are closely related to the Dlx homeodomain proteins, which themselves are required for mandibular development.87 Dlx and Msx factors can compete for, or sequentially occupy, regulatory elements in vitro.84 88 Given their partly overlapping expression patterns within the branchial arches, an intriguing possibility is that during normal development, the balance between DLX and MSX factors within mandibular mesenchyme influences expression of SOX9 via the CNE identified in the F2 family. The chondrogenic domain within the mandible may ultimately be defined by many pathways converging on the SOX9 control region.

Despite the demonstration of enhancers upstream of the PRS breakpoint cluster capable of driving expression in the mandible, these results do not preclude the possibility of dysregulation of SOX9 expression in other sites and stages contributing to the PRS phenotype. Given the role for Sox9 in early neural crest cell development, a deficit in SOX9 expression during this stage may result in a defect similar to that underlying Treacher Collins syndrome. In this disorder, mutation of the TCOF1 gene results in various craniofacial defects including PRS, and loss-of-function studies in mice have demonstrated that Tcof1 is required for production and proliferation of cranial neural crest cells at a stage before skeletal differentiation.89 Also, Sox9 expression has been detected within the subepithelial mesenchyme of the palatal shelves before and during fusion,90 91 raising the possibility of a morphogenetic role in this tissue (distinct from that during skeletogenesis), which may be disrupted in PRS. Further characterisation of enhancer elements centromeric to the PRS breakpoint cluster, using reporter assays in animal models, may shed light on the range of sites in which SOX9 dysregulation could result in PRS.

Although the evidence presented above strongly supports dysregulation of SOX9 as a cause of isolated PRS, it should be noted that altered expression of other genes in the 17q24.3 region may play a role in the aetiology of the disorder. Eight genes reside between SOX9 and the Sp2 deletion at 1.52 Mb downstream (fig 1), and with no functional information available for several of these, their possible dysregulation as a factor contributing to the phenotype cannot be discounted. The closest gene on the centromeric side of the PRS breakpoint cluster is KCNJ2, coding for an inward rectifying potassium channel that has a role in maintaining resting membrane potential in skeletal and cardiac muscle.92 Coding sequence mutations in KCNJ2 are responsible for Andersen syndrome (OMIM 170390), which involves cardiac arrhythmias, periodic paralysis and dysmorphic features. The latter, of unknown aetiology, can include micrognathia, and rarely, cleft palate.93 94 95 96 Interestingly, homozygous (but not heterozygous) deletion of Kcnj2 in mice results in a cleft secondary palate.97 However, mutations associated with Andersen syndrome typically act in a dominant negative manner.98 It is therefore unclear whether distant genomic lesions influencing KCNJ2 expression levels could result in isolated PRS. Comparison of KCNJ2 and SOX9 mRNA levels indicated reduced expression of both genes associated with non-syndromic PRS.28 The significance of these data is unclear since the sample size was very small and the analysis was carried out on cultured lymphoblasts, a cell type not known to be affected in PRS. Removal of a putative enhancer region in the Kcnj2-Sox9 interval by targeted deletion in mice would potentially provide a model of human PRS, and examination of resultant gene expression changes may shed light on whether SOX9 dysregulation alone is responsible for PRS.

The CD-ACD-PRS continuum

The proximal and distal clusters of breakpoints at 50–375 kb and 789–932 kb upstream of SOX9 established the concept that the distance of the genomic lesion from the transcription start site could correlate to some degree with the severity of the resulting phenotype. Clearly, there are exceptions to this trend: for example, ∼10% of SOX9 coding sequence mutations result in ACD, and poor clinical outcomes were reported for some ACD patients in the distal breakpoint cluster.52 Modifying alleles or environmental factors may impact on the phenotype displayed for a given lesion at the SOX9 locus. However, the identification of translocations and deletions greater than 1 Mb upstream in isolated PRS cases now suggests the existence of a third grouping of lesions, even further upstream from SOX9, that correlates with a particular phenotype. In contrast to the features of ACD frequently observed in the previously most distal breakpoint cluster (that is, PRS plus axial skeletal defects), the third group of lesions defined here is associated with PRS only. The isolated PRS in these cases would therefore represent the mildest phenotype of the spectrum of disorders assumed to be caused by transcriptional dysregulation of SOX9. Given this spectrum, including CD, ACD and now PRS, where the relationship between these disorders is not immediately obvious, we propose here a more inclusive term: the SOX9 Spectrum Disorders (SSDs). This term applies to all syndromes caused by lesions at the SOX9 locus, whether influencing gene expression or protein function.
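
The three-cluster grouping described above can be summarised as a simple lookup, sketched here in Python. This is an illustration only, not an analysis performed in any of the cited studies. The proximal and distal boundaries are the values quoted from Leipoldt et al; the PRS boundaries (1070–1230 kb) are an assumption taken from the outermost translocation breakpoints quoted in this review, and the names UPSTREAM_CLUSTERS and classify_upstream_breakpoint are hypothetical. The phenotype labels describe the typical trend only, since exceptions occur within each cluster.

```python
# Illustrative sketch: breakpoint clusters upstream of SOX9, in kb from the gene.
# Proximal and distal boundaries are those quoted from Leipoldt et al.
# The PRS boundaries are an ASSUMPTION based on the outermost translocation
# breakpoints quoted in the text (~1.07 Mb and 1.23 Mb upstream).
UPSTREAM_CLUSTERS = [
    # (cluster name, start kb, end kb, typical phenotype)
    ("proximal", 50, 375, "CD, mild to severe campomelia"),
    ("distal", 789, 932, "ACD, straight long bones"),
    ("PRS", 1070, 1230, "isolated PRS"),
]


def classify_upstream_breakpoint(distance_kb):
    """Return (cluster name, typical phenotype) for a breakpoint located
    distance_kb upstream of SOX9, or None if it lies outside all clusters."""
    for name, start_kb, end_kb, phenotype in UPSTREAM_CLUSTERS:
        if start_kb <= distance_kb <= end_kb:
            return name, phenotype
    return None


def cluster_span_kb(name):
    """Return the width in kb of the named cluster."""
    for cluster_name, start_kb, end_kb, _ in UPSTREAM_CLUSTERS:
        if cluster_name == name:
            return end_kb - start_kb
    raise KeyError(name)
```

For example, the translocation fine-mapped by Jakobsen et al at 1.13 Mb upstream would be passed as classify_upstream_breakpoint(1130) and falls in the PRS cluster, while cluster_span_kb("PRS") gives the ∼160 kb width of that cluster. A position between clusters returns None, reflecting that no breakpoints have been reported there rather than any prediction about phenotype.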

One possibility following the discovery of isolated PRS lesions >1 Mb upstream of SOX9 is the existence of a regulatory domain that is required for correct development of the skeletal structures typically affected in ACD but not in isolated PRS (that is, scapulae, ribs, vertebrae). This putative regulatory element(s) would reside between the isolated PRS breakpoint cluster and the ACD associated distal breakpoint cluster, and function independently of the putative craniofacial regulatory domain further upstream than 1.23 Mb that is disrupted in the isolated PRS cases. A recent study of regulatory elements controlling expression of Bmp5, a signalling molecule involved in cartilage morphogenesis, indicated that an extraordinary degree of spatial control of transcription is possible within the developing skeleton, with different enhancers from the Bmp5 locus promoting reporter expression in different subdomains of the same skeletal element.99 It is conceivable that SOX9 expression in a particular cell population is dependent on a series of enhancers dedicated to that tissue, spread throughout the SOX9 control region. Perhaps PRS is the only feature to result from the distant lesions because of a higher density of craniofacial enhancers, compared to enhancers for other tissues, in the region removed.

Parallels may exist between the complexity of the regulatory mechanisms at the SOX9 locus and those coordinating expression of the related genes Sox10 and Sox2,39 40 100 where reporter assays have revealed spatially overlapping enhancer activity. This may be an indication of redundancy in enhancer function, or may suggest that a series of enhancers with similar activity can have an additive effect on transcription. Indeed, it is clear that craniofacial enhancers are not uniquely located further upstream than ∼1.2 Mb; the existence of several within a region <350 kb upstream has already been demonstrated or implied.48 58 59 One possibility is that some structures require a higher threshold level of summed SOX9 enhancer function in order to develop correctly. For example, the mandible may be particularly sensitive to SOX9 expression levels, perhaps requiring the full complement of enhancers active in that tissue. In this regard, it is intriguing that some mutations in cartilage specific collagen genes have been associated with isolated PRS, while other mutations in the same genes result in more widespread chondrodysplasias. This may be an indication of a higher sensitivity of craniofacial structures to defects in chondrogenesis. If this were true, then removal of one or more global enhancers (that is, non-tissue specific) from the SOX9 control region by the distant lesions could result in a modest, generalised reduction in SOX9 expression sufficient to impact on development of craniofacial cartilage only and therefore produce an apparently tissue specific defect. While this is a purely theoretical consideration at present, it may be important for future detailed analysis of the SOX9 regulatory environment.

Finally, two lesions far downstream of SOX9 have now been described: a deletion at 1.52 Mb resulting in isolated PRS,29 and a translocation at ∼1.3 Mb resulting in ACD, XY sex reversal and PRS30 (fig 1). Although in the latter case a complex karyotype was described, these findings raise the possibility of further tissue specific, long-range regulatory elements telomeric to SOX9; however, these would exist in a more gene-rich environment than the centromeric elements. How these elements would influence expression of SOX9 but not the intervening genes remains an intriguing question; it is possible that insulators protect the intervening loci.


Our understanding of the complexity of the SOX9 regulatory region is expanding and this growth has been predominantly fuelled by identification of novel genomic lesions in individuals with a range of related disorders. The regulatory domain surrounding SOX9 may span >3 Mb with multiple enhancers fine tuning expression during development. Clearly, the size and complexity of this domain present a major challenge for mapping small scale mutations in regulatory elements that may underlie human disorders associated with the locus. Although conservation across species has been a major criterion when defining candidate regulatory elements, a further complication may be the existence of functional elements with limited conservation at the sequence level, as has been described at the RET locus.101 However, the recent identification of microdeletions surrounding SOX9 suggests that screening of this interval using tools such as high density array CGH may be fruitful in other non-translocated, isolated PRS patients. In addition, mapping the full range of craniofacial regulatory elements at the SOX9 locus by testing reporter constructs in an animal model may yield further candidate elements for sequencing in these patients. Although the identification of translocation breakpoints has been instrumental in our appreciation of the extent of the SOX9 control region, the ability to search for small scale deletions in PRS, ACD or CD patients may herald a new phase in our understanding of the locus, with the potential for discovery of novel cis regulatory elements, such as those predicted to drive expression in cartilage of the limbs. Study of the SOX9 regulatory environment has produced a number of insights into the control of mammalian gene expression and, given the complexity uncovered thus far, is likely to provide many more surprises as detailed analysis continues in the coming years.



  • Funding PGF is supported by NHMRC grants 284522 and 491229 and an NHMRC research fellowship. SL is supported by the ANR (CraniRare grant) and the Fondation pour la Recherche Médicale (FRM).

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
