  • Letter

Complex SNP-related sequence variation in segmental genome duplications

Abstract

There is uncertainty about the true nature of predicted single-nucleotide polymorphisms (SNPs) in segmental duplications (duplicons) and whether these markers genuinely exist at increased density as indicated in public databases. We explored these issues by genotyping 157 predicted SNPs in duplicons and control regions in normal diploid genomes and fully homozygous complete hydatidiform moles. Our data identified many true SNPs in duplicon regions and few paralogous sequence variants. Twenty-eight percent of the polymorphic duplicon sequences we tested involved multisite variation, a new type of polymorphism representing the sum of the signals from many individual duplicon copies that vary in sequence content due to duplication, deletion or gene conversion. Multisite variations can masquerade as normal SNPs when genotyped. Given that duplicons comprise at least 5% of the genome and many are yet to be annotated in the genome draft, effective strategies to identify multisite variation must be established and deployed.
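
The reasoning behind using fully homozygous complete hydatidiform moles (CHMs) can be illustrated with a toy model. This is a minimal sketch, not the authors' genotyping pipeline: the function names, the per-copy allele lists, and the category labels are assumptions made for illustration. It assumes each CHM sample is described by the allele carried on each duplicon copy, and that an assay reports only the summed signal across copies. Under that assumption, a single-copy SNP always looks homozygous in a CHM, a fixed difference between paralogs looks identically mixed in every sample, and multisite variation shows mixed signals that differ between samples.

```python
from fractions import Fraction


def allele_fraction(copies):
    """Share of the summed assay signal contributed by allele 'A'.

    `copies` lists the allele ('A' or 'G') on each duplicon copy
    of one fully homozygous CHM genome (hypothetical representation).
    """
    return Fraction(copies.count("A"), len(copies))


def classify_site(chm_samples):
    """Label a predicted SNP from summed signals in several CHM samples.

    Toy rule set (an assumption, not the published method):
      - every sample pure, all alike      -> "monomorphic"
      - every sample pure, samples differ -> "SNP"
      - mixed signal, same in all samples -> "PSV"
      - mixed signal, differs by sample   -> "MSV"
    """
    fractions = [allele_fraction(copies) for copies in chm_samples]
    pure = all(f in (0, 1) for f in fractions)
    identical = len(set(fractions)) == 1
    if pure:
        return "monomorphic" if identical else "SNP"
    return "PSV" if identical else "MSV"


if __name__ == "__main__":
    # One copy per genome, alleles differ between samples.
    print(classify_site([["A"], ["G"], ["A"]]))
    # Two paralogs with a fixed difference in every sample.
    print(classify_site([["A", "G"], ["A", "G"]]))
    # Copy number and copy content vary between samples.
    print(classify_site([["A", "G", "G"], ["A", "A", "G"], ["G", "G"]]))
```

The toy rules deliberately ignore assay noise and cannot separate a true SNP from multisite variation in which every sample happens to carry identical copies, which is one way such variation could pass for a normal SNP.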

Figure 1: Genotyping patterns identifying evolutionary sequence states.
Figure 2: Summarized genotyping results.
Figure 3: MLPA data for eight CHMs across three consecutive loci.

Acknowledgements

We thank R.J. Fisher and M. Seckl for CHM DNA samples and R.A. Clark, S. Sawyer and C. Lagerberg for technical assistance. Funding was provided by Pfizer Corporation and Stiftelsen för Kompetens-och Kunskapsutveckling (to D.F. and A.J.B.) and by the US National Institutes of Health (to E.E.E.).

Author information

Corresponding author

Correspondence to Anthony J Brookes.

Ethics declarations

Competing interests

A.J.B. declares share interests in Dynametrix Ltd.

Supplementary information

Supplementary Fig. 1

Average raw MLPA signal strength correlates with target sequence copy number. (PDF 198 kb)

Supplementary Fig. 2

MLPA and DASH correlation. (PDF 208 kb)

About this article

Cite this article

Fredman, D., White, S., Potter, S. et al. Complex SNP-related sequence variation in segmental genome duplications. Nat Genet 36, 861–866 (2004). https://doi.org/10.1038/ng1401
