Abstract
BACKGROUND AND AIMS Genetic predisposition for inflammatory bowel disease (IBD) has been demonstrated by epidemiological and genetic linkage studies. Genetic linkage of IBD to chromosome 3 has been observed previously. A high density analysis of chromosome 3p was performed to confirm prior linkages and elucidate potential genetic associations.
METHODS Forty three microsatellite markers on chromosome 3 were genotyped in 353 affected sibling pairs of North European Caucasian extraction (average marker density 2 cM in the linkage interval). Marker order was defined by genetic and radiation hybrid techniques.
RESULTS The maximum single point logarithm of odds (LOD) score was observed for Crohn's disease at D3S3591. Peak multipoint LOD scores of 1.65 and 1.40 for the IBD phenotype were observed near D3S1304 (distal 3p) and near D3S1283 in the linkage region previously reported. Crohn's disease contributed predominantly to the linkage. The transmission disequilibrium test showed significant evidence of association (p=0.009) between allele 4 of D3S1076 and the IBD phenotype (51 transmitted v 28 non-transmitted). Two known polymorphisms in the CCR2 and CCR5 genes were analysed, neither of which showed significant association with IBD. Additional haplotype associations were observed in the vicinity of D3S1076.
CONCLUSIONS This study provides confirmatory linkage evidence for an IBD susceptibility locus on chromosome 3p and suggests that CCR2 and CCR5 are unlikely to be major susceptibility loci for IBD. The association findings in this region warrant further investigation.
- inflammatory bowel disease
- fine mapping
- chromosome 3
Abbreviations used in this paper
- IBD: inflammatory bowel disease
- CD: Crohn's disease
- UC: ulcerative colitis
- HIV: human immunodeficiency virus
- PCR: polymerase chain reaction
- LOD: logarithm of odds
- TDT: transmission disequilibrium test
- IL5RA: interleukin 5 receptor alpha
A genetic component in the aetiology of inflammatory bowel disease (IBD) has been clearly demonstrated by epidemiological and genetic linkage studies. Epidemiological investigations have consistently shown familial clustering1 and an increased concordance of the IBD phenotype in monozygotic twins.2 ,3 Prevalence rates of IBD differ significantly between ethnic and geographic isolates. This is possibly due to a combination of different genetic and environmental factors operating in different populations. A genetic component underlying a portion of these differences is indicated by prevalence differences between ethnic groups living in the same geographic region, for example the 2–8-fold higher prevalence of IBD observed in Ashkenazi Jews versus non-Jews.4 Linkage and association analyses for several IBD predisposition regions have produced different results in independent family samples.5-7 The variance in linkage results may be attributed to several factors including: (1) differences in the genetic causation of IBD (that is, multiple disease genes); (2) use of different diagnostic criteria; (3) inability to recruit sufficient numbers of patients in some populations; (4) methodological limitations of linkage and association analyses for complex genetic disorders8; and (5) use of different genetic markers. To determine the significance of a given linkage result, replication of a genetic linkage observation using large independent sample sets is required. Several groups investigating the genetic components of IBD have recently completed thorough genome wide linkage analyses in large patient cohorts.5 ,6 ,9 ,10 These studies have defined three major IBD susceptibility loci located on chromosomes 6, 12, and 16. These findings are also supported by replication studies in a number of independent patient collections.5-7 11-17
In addition to the well supported linkages to chromosomes 6, 12, and 16, a number of interesting secondary linkages have been defined.5 ,9 ,10 The linkage region on chromosome 3 was initially identified by Satsangi and colleagues5 who established linkage with supporting p value statistics of 0.0026 and 0.00021 in primary and follow on family sets. The minimum p value of 0.00021 corresponds to a logarithm of odds (LOD) score of 3.4 (as transformed according to the equation: χ² = LOD × 2 ln(10)).18 This region encompasses the proximal segment of chromosome 3p and is defined by markers D3S1076 and D3S1573. Subsequent linkage studies have not supported the chromosome 3 linkage. A recent study in 161 IBD families of Canadian descent yielded LOD scores of 0.07–0.25.19 A second study in 58 families of Italian descent also failed to show evidence of linkage in this region.16 The genome wide study by Cho et al provided a peak multipoint LOD score of 1.0 in this region.9
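The transformation quoted above, χ² = LOD × 2 ln(10), can be illustrated with a minimal sketch. The function names `lod_to_chi2` and `chi2_to_lod` are hypothetical and are not taken from any linkage package. The sketch only converts between the two scales; turning χ² into a p value depends on the degrees of freedom and sidedness of the test, which are not specified here and are therefore not assumed.

```python
import math

# Conversion factor between the LOD and chi-square scales: 2 ln(10), about 4.605.
LOD_TO_CHI2 = 2.0 * math.log(10.0)


def lod_to_chi2(lod: float) -> float:
    """Return the chi-square statistic equivalent to a LOD score."""
    return lod * LOD_TO_CHI2


def chi2_to_lod(chi2: float) -> float:
    """Return the LOD score equivalent to a chi-square statistic."""
    return chi2 / LOD_TO_CHI2


if __name__ == "__main__":
    # LOD scores mentioned in the text, shown on the chi-square scale.
    for lod in (1.0, 1.4, 1.65, 3.4):
        print(f"LOD {lod:.2f} -> chi-square {lod_to_chi2(lod):.2f}")
```

On this scale, one LOD unit corresponds to roughly 4.6 χ² units, so a LOD score of 3.4 is equivalent to a χ² of about 15.7.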
In our previous genome wide linkage study, we reported a suggestive multipoint LOD score of 1.2 for this susceptibility region.10 The variable LOD scores observed for this interval are not unexpected for a complex disorder such as IBD8 ,20 and may be due to a reduced penetrance of this locus relative to others and methodological issues of non-parametric linkage analyses. Despite disparate genetic results for linkage of IBD to chromosome 3,16 ,19 the region is of significant interest. Two autoimmune disorders, multiple sclerosis and inflammatory arthritis, have been genetically linked to chromosome 3p. These independent observations suggest that a gene, or perhaps multiple genes, involved in regulating immune function and inflammatory response reside in this region of the genome. Additional interest in the 3p region stems from the observation that numerous candidate genes, including the chemokine receptor cluster and interleukin 5 receptor α, are located on chromosome 3p.21-23
We have therefore performed a high resolution linkage and association study on chromosome 3p. Our goal was to better define the significance of the chromosome 3p locus and to directly test several of the candidate genes from the region of interest for involvement in IBD. Highly polymorphic microsatellite markers saturating chromosome 3p with an average spacing of 2 cM were genotyped in our large family cohort. Genetic linkage analysis and a systematic association analysis using both single point and two point transmission disequilibrium test (TDT) statistics were then performed. In the vicinity of the single most significant association result, we selected two candidate genes and tested known variations that were previously associated with resistance to human immunodeficiency virus (HIV) infection.21 ,22 We demonstrate that these variants are not likely to be involved in susceptibility to IBD and suggest a more proximal localisation of the chromosome 3p susceptibility gene for IBD.
Materials and methods
FAMILY ASCERTAINMENT AND PHENOTYPES
The population studied here has been described previously.10 ,12 ,15 ,24 ,25 The cohort was recruited in Europe by several cooperative centres including: Charite University Hospital (Berlin, Germany), I Department of Medicine at the Christian-Albrechts-Universität (Kiel, Germany), King's College School of Medicine, Guy's Hospital, and St Mark's Hospital (London, UK), AMC (Amsterdam, the Netherlands). Forty six percent of families were of German origin, 6% were from the Netherlands, and 48% of families were recruited in the UK. All study participants gave informed written consent. The recruitment protocols were approved prior to study initiation by the respective institutional review boards. The diagnosis of IBD and classification into Crohn's disease (CD) and ulcerative colitis (UC) were determined by standard diagnostic criteria.26 ,27 Ascertainment criteria were determined prior to the initiation of patient collection. Medical records for all patients were reviewed by one or more of the principal investigators in the families originating from the UK and the Netherlands. For families of German extraction, patients were directly examined by one or more of the principal investigators, where possible. Alternatively, two written records containing a detailed disease history and results of all diagnostic procedures were obtained for each patient and reviewed by the principal investigators. A venous blood sample was obtained from the affected siblings and, if possible, their parents. An overview of the family cohort is given in table 1.
GENOTYPING
Genomic DNA was prepared from whole venous blood using the Puregene system (Gentra Systems, Minneapolis, Minnesota, USA). Blood samples were stored at room temperature for up to one week or frozen and stored at −70°C for up to nine months prior to purification. Forty three polymorphic microsatellite markers covering chromosome 3 were genotyped using polymerase chain reaction (PCR) with fluorescent labelled primers, as described previously.28 The primer sequences were derived from GDB or the literature.21 ,23 In brief, individual DNA samples were arrayed in 96 well microtitre plates and amplified by PCR with the respective primers. The length of the PCR products was determined by electrophoresis on denaturing polyacrylamide gels using ABI 377 automated DNA sequencers. Data were collected using the PE Applied Biosystems (Foster City, California, USA) Genescan software. Allele analyses and individual allele calling were performed as described previously.28 ,29 Genotype errors as a result of non-mendelian segregation in pedigrees were detected and corrected as described by Hall and Nanthakumar.28
Variants in CCR2 were genotyped by PCR-RFLP using the primers in table 2. Amplification generated a 128 bp product, which was digested into 110 and 18 bp fragments with BsaBI when isoleucine was substituted for valine at position 64 in the CCR2 gene. The CCR5 site was amplified with the primers in table 2 and the presence or absence of the 32 bp deletion was directly scored on 4% agarose gel.
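The logic of scoring the CCR2 PCR-RFLP can be sketched as follows. This is an illustrative sketch only, not the scoring procedure used in the study: the function names, the genotype labels ("V/V", "V/I", "I/I"), and the size tolerance are assumptions. It relies only on the fragment sizes given above, namely an uncut 128 bp product for the valine allele and 110 bp plus 18 bp fragments for the isoleucine allele. Because an 18 bp fragment may be hard to resolve on a gel, the sketch (as a further assumption) calls genotypes from the 128 bp and 110 bp bands alone.

```python
from typing import Iterable, Optional

UNCUT_BP = 128  # full-length product (valine allele, not digested)
CUT_BP = 110    # larger digestion fragment (isoleucine allele)


def has_band(observed_bp: Iterable[float], size: int, tolerance: float) -> bool:
    """True if any observed band lies within `tolerance` bp of `size`."""
    return any(abs(band - size) <= tolerance for band in observed_bp)


def call_ccr2_genotype(observed_bp: Iterable[float],
                       tolerance: float = 3.0) -> Optional[str]:
    """Call a genotype from observed band sizes (hypothetical helper).

    Returns "V/V" (uncut band only), "I/I" (cut band only),
    "V/I" (both bands), or None if neither band is present.
    """
    bands = list(observed_bp)
    uncut = has_band(bands, UNCUT_BP, tolerance)
    cut = has_band(bands, CUT_BP, tolerance)
    if uncut and cut:
        return "V/I"
    if cut:
        return "I/I"
    if uncut:
        return "V/V"
    return None  # failed amplification or unreadable lane


if __name__ == "__main__":
    print(call_ccr2_genotype([128]))           # uncut product only
    print(call_ccr2_genotype([127, 111, 18]))  # both alleles present
    print(call_ccr2_genotype([110, 18]))       # fully digested
```

Returning `None` for lanes with neither band keeps failed reactions from being miscalled as a genotype.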
Marker order and distance separating each marker were defined by information derived from our pedigrees using the automated mapping program Multimap v 2.0.30 The maps were confirmed by comparison with published genetic maps (CHLC and Genethon at http://www.ncbi.nlm.nih.gov). Placement of previously unlocalised markers was confirmed by radiation hybrid analysis using the Genebridge 3 radiation hybrid (Research Genetics, Huntsville, Alabama, USA) panel. RH vectors were analysed using the Stanford RH mapping server (at http://www-shgc.stanford.edu/) and by using the Radmap extension of the Multimap program.
STATISTICAL ANALYSIS
Genetic analyses were conducted using the two IBD subphenotypes CD and UC. A third category, “ALL”, contained CD/CD, UC/UC, and CD/UC (mixed) affected sibling pairs. The ALL category therefore represents IBD as a single phenotype for analysis. Allele frequencies for each marker were calculated from the cohort genotype data using all individuals. Data were analysed with both single point and multipoint non-parametric allele sharing tests in affected sibling pairs using Mapmaker/Sibs with the “weighted pairs” option.31 For multipoint analysis, LOD scores were computed at 1 cM intervals along the chromosome. Mean information content across chromosome 3p was 93%.
Association statistics were calculated using the TDT (for single point TDT tests) and TDT2 (for two marker TDT tests) functions of the Genehunter 2.0 program.32 ,33 TDT provides a statistical measure of genetic association and linkage that is robust against population stratification, because it scores transmission events from parents to offspring. Data derived from fewer than 10 observed transmission events were excluded from the results to reduce the possibility of false positive findings. The program performs a classical TDT test using only heterozygote founders.32 This algorithm provides no correction for the testing of multiple alleles.
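The classical single allele TDT compares how often heterozygous parents transmit an allele with how often they do not. A minimal sketch of that calculation follows, using the standard McNemar-type statistic (b − c)²/(b + c) referred to a χ² distribution with one degree of freedom. It is not the Genehunter implementation: the function name `tdt` and the `min_events` argument are hypothetical, with the latter mirroring the exclusion of results based on fewer than 10 transmission events.

```python
import math
from typing import Optional, Tuple


def tdt(transmitted: int, untransmitted: int,
        min_events: int = 10) -> Optional[Tuple[float, float]]:
    """Single-allele TDT from transmission counts of heterozygous parents.

    Returns (chi_square, p_value), or None when fewer than `min_events`
    transmission events were observed.
    """
    events = transmitted + untransmitted
    if events < min_events:
        return None
    chi2 = (transmitted - untransmitted) ** 2 / events
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Counts reported in the Results for allele 4 of D3S1076.
    chi2, p = tdt(51, 28)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

For the 51 transmitted versus 28 non-transmitted alleles reported for allele 4 of D3S1076, this gives χ² of about 6.7 and a p value just below 0.01, in line with the nominal p=0.009 reported. Like the program described above, the sketch applies no correction for testing multiple alleles.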
Results
A total of 43 microsatellite markers covering chromosome 3 were analysed for the study. Twenty of the markers, providing a general framework, were used in the genome wide linkage analysis10 and 23 additional saturation markers on chromosome 3p were added. Based on previously observed linkage of IBD to chromosome 3, linkage of other autoimmune disorders to chromosome 3p, and the presence of several interesting candidate genes in this region, a very high marker density was analysed in this area. The average marker density in the extended susceptibility region from D3S1304 to D3S1289 was 2 cM after estimation of the marker distances from the dataset using Multimap.
Results from single point analysis are shown in table 3. The peak single point LOD score was obtained at D3S3591 in the CD disease category. On multipoint analysis, a peak multipoint LOD score of 1.65 near marker D3S1304 was observed for the combined IBD phenotype (ALL category). In the linkage region previously described by Satsangi et al, a peak multipoint LOD score of 1.4 was obtained, also for the IBD phenotype. The linkage at D3S1304 was almost exclusively derived from the CD phenotype. This was demonstrated by the single point LOD score of 1.3 obtained in the CD category at D3S3591 versus the 0.90 single point LOD obtained in the ALL category. For linkage in the previously implicated region on the proximal p arm of chromosome 3, a more equal contribution of the CD and UC phenotypes, with CD still being predominant, was observed. The multipoint linkage curve is shown in fig 1. Although there are several regions that are suggestive of genetic linkage, none of these regions meets strict criteria for defining a genetic locus34 de novo.
The primary goals of this high density genotyping experiment were to better define the 3p susceptibility region and to attempt identification of disease related linkage disequilibrium. To facilitate the latter goal, a systematic single point TDT analysis using only heterozygous parents was performed for all markers. The nominal TDT results, with the most significant p value from all alleles and subphenotypes, are presented in table 3. The most significant nominal p value (p=0.009) was observed for allele 4 at D3S1076 with 51 transmitted alleles versus 28 non-transmitted alleles. D3S1076 is located near the chemokine receptor 2 (CCR2) and chemokine receptor 5 (CCR5) genes. Based on the TDT result, these genes were selected for direct analysis as potential IBD susceptibility genes. Two known variants, the V64I polymorphism in CCR2 and a 32 bp deletion in CCR5, were used for this analysis. Allele frequencies observed in our cohort were similar to data previously reported at 8.3% for CCR2 and 12% for CCR5.21 These markers were used for both linkage and association analysis. The linkage results are shown in table 3 and fig 1. TDT analysis of the CCR2 variant revealed a p value of 0.26 (46 transmissions versus 36 non-transmissions) in the ALL category and 0.13 (32 versus 21 transmissions) for the CD phenotype. Analysis of the CCR5 deletion yielded values of p=0.15 (64 transmissions versus 49 non-transmissions) for the ALL category and p=0.32 (37 versus 29 transmissions) in the CD category.
To obtain a more sensitive measure of association and increase the possibility of identifying less prevalent but important associated haplotypes, a systematic two locus association analysis was performed across the region. The association results with nominal p values below 0.01 are given in table 3 together with the disease category from which they were obtained. Nominal p values as low as p=0.0009 (in the ALL category) were recorded. Interestingly, two clusters of association results were identified around D3S1076 and, to a lesser degree, around D3S2337. The cluster around D3S1076 contained nominal association values of p=0.004 for the CCR2-CCR5 haplotype in CD, p=0.003 for the CCR5-D3S1086 haplotype, and p=0.0009 for the D3S1298-D3S1300 haplotype in the ALL category.
The interleukin 5 receptor alpha (IL5RA) gene is located on chromosome 3p in an area of suggestive linkage defined by our initial genome wide scan. This gene is involved in immunoregulation and is an interesting candidate for IBD. To investigate this candidate gene we genotyped an intragenic microsatellite marker.23 No evidence of association (corrected p value >0.10) was detected. Hence IL5RA can also be excluded as a major risk determinant of IBD.
Discussion
We have reported here supporting evidence for the existence of an IBD susceptibility gene on chromosome 3p, described the results of genetic investigations for several important candidate genes, and provided a systematic high density association analysis of the chromosome 3p linkage region.
A multipoint LOD score of 1.4 in the chromosome 3 interval previously identified by Satsangi and colleagues5 was observed in the analysis of the ALL phenotype category. The original chromosome 3 linkage findings were driven predominantly by the CD subphenotype. In our dataset, a similar relationship regarding the contribution of CD and UC to the proximal chromosome 3p linkage was observed. We noted however that our family cohort contained an approximately 2:1 ratio of CD to UC patients. A second linkage peak on the long arm of chromosome 3 has recently been reported by Cho et al. We observed a multipoint LOD score of 1.0 in this region driven by analysis of the CD phenotype. The importance of this interval will be determined by further study.
There was a firm initial linkage finding for the proximal p arm with a p value of 0.00021 (corresponding to a LOD score of 3.4).18 The genome wide analyses of Cho et al produced a LOD score of 1.0, slightly distal to the Satsangi et al linkage. Additional studies in Canadian and Italian populations have failed to support linkage16 ,19 to chromosome 3. Our data provide additional support for an IBD susceptibility gene in this genomic region. Taken together, these observations suggest that the chromosome 3p locus confers a smaller relative risk than, for instance, the widely replicated linkage region on chromosome 16, although the latter locus also accounts for only a portion of the relative risk. Alternatively, the influence of the chromosome 3p locus may be significantly affected by interaction with other susceptibility genes, which would limit the power of linkage studies. The final differentiation of these possibilities has to await the identification of the causative molecular variants that confer IBD susceptibility in this and other regions.
The primary objective driving the high density genotyping described here was the identification of disease associated disequilibrium. The methodology best suited to the identification of complex disease genes and the populations that may lead to the identification of the relevant genes are not yet clear.35 ,36 At this point, the use of single point and multipoint TDT testing seems to be a practical and robust way to develop an understanding of disease associated disequilibrium for a particular disease. The single point association test yielded nominal p values as low as 0.009 (p=0.07 after correction for the presence of multiple alleles, as given by TDTLIKE37). Given the number of tests performed, this result should not be over interpreted. Based on the assumption that possible founder haplotypes will be relatively rare in a modern admixed population, the fact that no significant single marker (that is, corrected p value <0.01) was identified is not unexpected. Therefore, two locus TDT analysis was performed for all markers. This approach will potentially identify rarer, but significant, haplotypes which cannot be discerned on the basis of a single polymorphic marker. Interestingly, the association findings having nominal p values <0.01 cluster at two points along the map: in the vicinity of D3S1076, and to a lesser degree, near D3S2337. We have observed association “signals” that point to this genetic region but we are not defining a single associated haplotype. This may be caused by: (i) an incorrect marker order, which may occur for very closely spaced markers, or (ii) the presence of multiple associated haplotypes in the population. The question of multiple testing biasing this result is clearly at issue but these findings may be used to refine the location of the chromosome 3p IBD locus and will facilitate direct candidate gene investigation and additional genetic experiments.
The chemokine receptors 2 and 5 are located in close proximity to the single marker association at D3S1076. Based on the suggestive association evidence and their functional importance in immunoregulation, these genes were chosen for direct investigation. In each of these genes, functional variants associated with resistance to HIV infection22 or disease progression (CCR2)21 have been identified. We investigated the putative role of CCR2 and CCR5 in IBD using these variants as polymorphic markers. Using TDT analysis in the 253 families with IBD, no significant influence of these mutations on the IBD phenotype was detected. Formal exclusion of CCR2 and CCR5 is not possible using the TDT method. The region around D3S1076 contains a number of suggestive association “signals” proximal to the chemokine receptor genes. Figure 2 describes the region in which the CCR2 and CCR5 genes are localised. Both reside on a fully sequenced BAC clone (Genbank record U95626) and a number of highly interesting candidate genes are located in their immediate vicinity. These include the lactotransferrin gene (fig 2), which is suspected to have a role in neutrophil function and bacterial defence, several genes of the ubiquitin complex (USP4, UQCRC1) which may play a critical role in antigen processing,38 the cathelicidin antimicrobial peptide39 and the TRAF interacting protein, which is important for tumour necrosis factor α signal transduction,40 the mitogen activated protein kinase activated protein kinase 3,41 and the interferon α receptor 2.42 This region is extremely gene rich and harbours more than 200 transcripts within 3 cM of the chromosome (GeneMap99; http://www.ncbi.nlm.nih.gov/genemap99/). Given the linkage and association data implicating this region, it is possible that any one of these genes could have a role in IBD. Systematic investigation will be required to define the gene, or genes, from this region which are involved in modifying the IBD phenotype.
In summary, we have presented confirmatory linkage evidence for the existence of an IBD susceptibility gene on the proximal part of chromosome 3p. CCR2, CCR5, and IL5RA are all unlikely to represent the IBD susceptibility gene in this region. Association results obtained in this dense microsatellite mapping experiment may facilitate identification of the risk gene located on chromosome 3p. Using a transcript based mapping approach, genes proximal to the CCR2 and CCR5 genes can be prioritised and investigated directly for involvement in IBD.
Acknowledgments
The authors thank the physicians, IBD patients, and their families for participating in this study. The cooperation of the German Crohn's and Colitis Foundation (DCCV e V), Professor Raedler/Hamburg, Professor Kruis/Köln, Dr Theuer/Heilbronn, Dr Meckler/Gedern, Professor Lochs, Dr Wedel, T. Herrmann/Berlin, Dr Herchenbach/Recklinghausen, Professor Scheurlen/Würzburg, Dr Demharter/Augsburg, Dr Simon/Munich, Dr Purrmann/Moers, Dr Jessen/Kiel, Dr Zehnter/Dortmund, Dr Lübke, Dr Weismüller/Koblenz, Dr Eiche/Denkendorf, Dr Schönfelder/Aachen, Professor Fleig/Halle all in Germany, Dr Wewalka/Linz, Dr Knofloch/Wels, both in Austria and Dr Hodgson, Dr Sanderson, Dr Pounder, Dr Forbes, Dr Forgacs/London, Dr Bird/Maidstone, Dr Hines/Haywards Heath, Dr Cairns, Dr Ireland/Brighton, Dr Barrison/St Albans and Dr Smith-Lang/Sidcup in the UK, are gratefully acknowledged. The authors acknowledge the great contribution of Dr JCW Lee in the collection of patients and the expert technical efforts of Jonalyn Matusalem, Larenia Pedriguez, Hye Jin Yang, Birte Köpke, Brigitte Mauracher, Tam Ho Kim, and Kirstin Schirrmacher and the expert Macintosh support by Carl Manaster. This work was supported by Axys Pharmaceuticals Inc, the National Association for Colitis and Crohn's disease (UK), Crohn's in Childhood Research Association (UK), the Sir Halley Stewart Trust, the Deutsche Forschungsgemeinschaft (Schr 512/5–1, SFB 415), a Training and Mobility of Research (TMR) Network grant of the European Union (ERB-4061-PL-97–0389), by MFG, and a MedNet “Chronisch-entzündliche Darmerkrankungen” of the German Federal Department for Research and Education (BmBF).