  • Letter

Targeted capture and massively parallel sequencing of 12 human exomes

Abstract

Genome-wide association studies suggest that common genetic variants explain only a modest fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability [1,2]. Although DNA sequencing costs have fallen markedly [3], they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions (‘exomes’), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of 12 humans. These include eight HapMap individuals representing three populations [4], and four unrelated individuals with a rare dominantly inherited disorder, Freeman–Sheldon syndrome (FSS) [5]. We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for Mendelian disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of non-synonymous variants by predicted functional impact.

Figure 1: Minor allele frequency and coding indel length distributions.
Figure 2: Direct identification of the causal gene for a monogenic disorder by exome sequencing.

References

  1. Cohen, J. C. et al. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305, 869–872 (2004)

  2. Frazer, K. A., Murray, S. S., Schork, N. J. & Topol, E. J. Human genetic variation and its contribution to complex traits. Nature Rev. Genet. 10, 241–251 (2009)

  3. Shendure, J. & Ji, H. Next-generation DNA sequencing. Nature Biotechnol. 26, 1135–1145 (2008)

  4. The International HapMap Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005)

  5. Toydemir, R. M. et al. Mutations in embryonic myosin heavy chain (MYH3) cause Freeman-Sheldon syndrome and Sheldon-Hall syndrome. Nature Genet. 38, 561–565 (2006)

  6. Sjoblom, T. et al. The consensus coding sequences of human breast and colorectal cancers. Science 314, 268–274 (2006)

  7. Olson, M. Enrichment of super-sized resequencing targets from the human genome. Nature Methods 4, 891–892 (2007)

  8. Hodges, E. et al. Genome-wide in situ exon capture for selective resequencing. Nature Genet. 39, 1522–1527 (2007)

  9. National Center for Biotechnology Information. Consensus CDS protein set <http://www.ncbi.nlm.nih.gov/projects/CCDS> (2009)

  10. Ng, P. C. et al. Genetic variation in an individual human exome. PLoS Genet. 4, e1000160 (2008)

  11. Kidd, J. M. et al. Mapping and sequencing of structural variation from eight human genomes. Nature 453, 56–64 (2008)

  12. Bentley, D. R. et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59 (2008)

  13. Li, H., Ruan, J. & Durbin, R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18, 1851–1858 (2008)

  14. Campbell, P. J. et al. Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nature Genet. 40, 722–729 (2008)

  15. Ewing, B. & Green, P. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8, 186–194 (1998)

  16. Turner, E. H., Lee, C., Ng, S. B. & Shendure, J. Massively parallel exon capture and library-free resequencing across 16 individuals. Nature Methods 6, 315–316 (2009)

  17. Kidd, J. M. et al. Haplotype sorting using human fosmid clone end-sequence pairs. Genome Res. 18, 2016–2023 (2008)

  18. Albert, T. J. et al. Direct selection of human genomic loci by microarray hybridization. Nature Methods 4, 903–905 (2007)

  19. Wheeler, D. A. et al. The complete genome of an individual by massively parallel DNA sequencing. Nature 452, 872–876 (2008)

  20. Wang, J. et al. The diploid genome sequence of an Asian individual. Nature 456, 60–65 (2008)

  21. Levy, S. et al. The diploid genome sequence of an individual human. PLoS Biol. 5, e254 (2007)

  22. Ley, T. J. et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456, 66–72 (2008)

  23. Boyko, A. R. et al. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet. 4, e1000083 (2008)

  24. Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001)

  25. Yngvadottir, B. et al. A genome-wide survey of the prevalence and evolutionary forces acting on human nonsense SNPs. Am. J. Hum. Genet. 84, 224–234 (2009)

  26. Olson, M. V. When less is more: gene loss as an engine of evolutionary change. Am. J. Hum. Genet. 64, 18–23 (1999)

  27. Cohen, J. et al. Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Nature Genet. 37, 161–165 (2005)

  28. Jones, S. et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science 324, 217 (2009)

  29. Siva, N. 1000 Genomes project. Nature Biotechnol. 26, 256 (2008)

  30. Kryukov, G. V., Shpunt, A., Stamatoyannopoulos, J. A. & Sunyaev, S. R. Power of deep, all-exon resequencing for discovery of human trait genes. Proc. Natl Acad. Sci. USA 106, 3871–3876 (2009)

Acknowledgements

For discussions or assistance with genotyping data, we thank P. Green, J. Akey, R. Patwardhan, G. Cooper, J. Kidd, D. Gordon, J. Smith, I. Stanaway and M. Rieder. For assistance with project management, computation, data management and submission, we thank E. Torskey, S. Thompson, T. Amburg, B. McNally, S. Hearsey, M. Shumway and L. Hillier. For Human1M-Duo genotype data on HapMap samples, we thank Illumina. Our work was supported in part by grants from the National Institutes of Health/National Heart Lung and Blood Institute, the National Institutes of Health/National Human Genome Research Institute, National Institutes of Health/National Institute of Child Health and Human Development, and the Washington Research Foundation. S.B.N. is supported by the Agency for Science, Technology and Research, Singapore. E.H.T. and A.W.B. are supported by a training fellowship from the National Institutes of Health/National Human Genome Research Institute. E.E.E. is an investigator of the Howard Hughes Medical Institute.

Author Contributions The project was conceived and experiments planned by S.B.N., E.H.T., A.B., E.E.E., M.B., D.A.N. and J.S. Experiments were performed by S.B.N., E.H.T., C.L. and M.W. Algorithm development and data analysis were performed by S.B.N., P.D.R., S.D.F., A.W.B., T.S., M.B., D.A.N. and J.S. The manuscript was written by S.B.N. and J.S. All aspects of the study were supervised by J.S.

Author information

Corresponding authors

Correspondence to Sarah B. Ng or Jay Shendure.

Ethics declarations

Competing interests

A.B. is an employee of Agilent Technologies. Agilent supplies arrays that can be used for exome capture as described.

Additional information

The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature.

Supplementary information

Supplementary Information

This file contains Supplementary Figures 1-6 with Legends and Supplementary Tables 1-5. (PDF 161 kb)

Supplementary Data 1

This file lists intervals within the targeted exome that were excluded from consideration based on poor anticipated mappability with 76 bp single-end reads. (TXT 211 kb)

Supplementary Data 2

This file lists the fraction of targeted coding bases in each gene that were covered in each of 12 individuals (either with >=1x coverage or with sufficient coverage to variant call). (TXT 2828 kb)

About this article

Cite this article

Ng, S., Turner, E., Robertson, P. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272–276 (2009). https://doi.org/10.1038/nature08250

  • DOI: https://doi.org/10.1038/nature08250
