Position statement
CCMG practice guideline: laboratory guidelines for next-generation sequencing
Stacey Hume (1), Tanya N Nelson (2,3,4), Marsha Speevak (5), Elizabeth McCready (6), Ron Agatep (7,8), Harriet Feilotter (9), Jillian Parboosingh (10,11), Dimitri J Stavropoulos (12,13), Sherryl Taylor (1), Tracy L Stockley (13,14), on behalf of Canadian College of Medical Geneticists (CCMG)

  1. Department of Medical Genetics, University of Alberta, Edmonton, Alberta, Canada
  2. Department of Pathology and Laboratory Medicine, BC Children's Hospital, Vancouver, British Columbia, Canada
  3. Department of Pathology and Laboratory Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
  4. Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
  5. Department of Laboratory Medicine and Genetics, Trillium Health Partners, Mississauga, Ontario, Canada
  6. Department of Pathology and Molecular Medicine, McMaster University, Hamilton, Ontario, Canada
  7. Department of Biochemistry and Molecular Genetics, University of Manitoba, Winnipeg, Manitoba, Canada
  8. Genomics Laboratory, Shared Health Diagnostic Services, Winnipeg, Manitoba, Canada
  9. Department of Pathology and Molecular Medicine, Queen’s University, Kingston, Ontario, Canada
  10. Department of Medical Genetics, University of Calgary Cumming School of Medicine, Calgary, Alberta, Canada
  11. Research Institute, Alberta Children’s Hospital, Calgary, Alberta, Canada
  12. Department of Paediatric Laboratory Medicine, Genome Diagnostics, Hospital for Sick Children, Toronto, Ontario, Canada
  13. Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
  14. Department of Clinical Laboratory Genetics, Laboratory Medicine Program, University Health Network, Toronto, Ontario, Canada

Correspondence to Dr Tracy L Stockley, Department of Clinical Laboratory Genetics, Laboratory Medicine Program, University Health Network, Toronto, ON M5G 2C4, Canada; tracy.stockley{at}uhn.ca

Purpose

The purpose of this document is to provide guidance for the use of next-generation sequencing (NGS, also known as massively parallel sequencing or MPS) in Canadian clinical genetic laboratories for detection of genetic variants in genomic DNA and mitochondrial DNA for inherited disorders, as well as somatic variants in tumour DNA for acquired cancers. It is intended for Canadian clinical laboratories engaged in developing, validating and using NGS methods.

Methods of statement development: The document was drafted by the Canadian College of Medical Geneticists (CCMG) Ad Hoc Working Group on NGS Guidelines to make recommendations relevant to NGS. The statement was circulated for comment to the CCMG Laboratory Practice and Clinical Practice committees, and to the CCMG membership. Following incorporation of feedback, the document was approved by the CCMG Board of Directors.

Disclaimer: The CCMG is a Canadian organisation responsible for certifying medical geneticists and clinical laboratory geneticists, and for establishing professional and ethical standards for clinical genetics services in Canada. The current CCMG Practice Guidelines were developed as a resource for clinical laboratories in Canada and should not be considered to be inclusive of all information laboratories should consider in the validation and use of NGS for a clinical laboratory service.

  • diagnostics tests
  • genetics
  • guidelines
  • molecular genetics

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Purpose and scope

These guidelines provide recommendations for the use of next-generation sequencing (NGS; also known as massively parallel sequencing, MPS) in Canadian clinical genetic laboratories. They are intended for Canadian clinical laboratories engaged in developing, validating and using NGS methods. In addition, these guidelines may be useful for Canadian laboratory accreditation bodies or manufacturers developing NGS-related products for clinical laboratories in Canada, or serve as a reference for key issues that should be considered by Canadian practitioners using NGS services originating outside of Canada. The recommendations within this document cover the NGS process from template preparation to clinical reporting, including consideration of validation and quality assurance for laboratory processes. The NGS applications addressed include detection of genetic variants in genomic DNA and mitochondrial DNA for inherited disorders, as well as somatic variants in tumour DNA for acquired cancers.

Both the American College of Medical Genetics and Genomics and Eurogentest/European Society of Human Genetics have published valuable, detailed guidelines for NGS use in clinical testing.1 2 This document is not intended to recapitulate previously published guidelines but rather to highlight issues unique within the Canadian healthcare context.

The sustainability of Canadian healthcare, publicly funded through the Canada Health Act, requires laboratories to consider the most appropriate use of new technologies such as NGS, so that the cost of testing is balanced against the patient-derived benefit. In Canada, while Health Canada “regulates legislation on a variety of topics that aim to protect the health of Canadians”3 through observing the principles of the Canada Health Act, each province also has its own specific healthcare legislation, regulations and laboratory accreditation requirements. While there is common use of certain international laboratory accreditation standards between provinces, particularly the ISO 15189:2012 standard,4 provincial differences in accreditation requirements exist. Defining and recommending Canadian guidelines for NGS will provide a framework of national standards for clinical laboratories which can be integrated into provincial accreditation programmes. It is recommended that until all provincial laboratory accreditation programmes develop requirements for NGS, clinical laboratories in Canada performing NGS should strive to meet the criteria in this document.

Canada has unique considerations with respect to privacy related to genetic information and protection from genetic discrimination. Legislation and regulation relevant to health information exist at both the federal level (Personal Information Protection and Electronic Documents Act; PIPEDA) and in some cases at the provincial level, where many provinces have privacy legislation and regulations that apply specifically to health information including e-health information. Protection from genetic discrimination, both as it relates to private health insurance and to employment, is provided by the Genetic Non-Discrimination Act (Bill S-201). The current guidelines also aim to provide a framework for managing health information and other relevant data related to NGS within the Canadian privacy legislation context. A summary of the recommendations in this guideline is provided in online supplementary appendix 1.

Introduction

Overview of NGS technologies and applications in Canadian clinical laboratories

NGS is a key methodology in clinical molecular diagnostic laboratories in Canada. NGS using short-read platforms5 is most frequently employed in Canadian clinical genetic laboratories for the diagnosis of inherited diseases and for the detection of somatic variants in acquired cancers. Canadian laboratories may also use NGS to detect variants in mitochondrial genomes, or to investigate cell-free nucleic acids in peripheral circulation for non-invasive prenatal testing (NIPT) or assessment of circulating tumour DNA. NGS-based assays may be used to assess whole genomes, exomes, specific gene panels, single genes or recurrent variant ‘hotspot’ regions. The choice of target depends on the specific application and the clinical utility of the test, an important concept within the publicly funded healthcare system in Canada. Thus, variants identified by NGS may be filtered for efficiency of workflows and to avoid unnecessary impacts on the public healthcare system, for example, assessing only those genes with demonstrated clinical evidence for utility for a specific indication, or using the same technical NGS test for multiple clinical indications while minimising incidental variants unrelated to the original clinical question.

Definitions and abbreviations

Bioinformatics—the application of computational and statistical sciences to the collection, organisation and analysis of biological data.

Base calling quality score (alias PHRED or Q score)—a value indicating the probability that a base call generated by the software is accurate. A higher score indicates an increased likelihood that the call is correct at a specific base (Q = −10 × log10(e), where e is the estimated probability that the base call is incorrect).

Bridge amplification—a PCR variation in which forward and reverse primers are embedded on a solid surface. Amplicons form a bridge on the solid surface to allow annealing to complementary forward and reverse primers at each cycle.

CNV (copy number variant or variation)—a region that contains gains or losses of genetic material. This may involve a single exon through to several thousands of kilobases of DNA and may be clinically benign, uncertain or pathogenic.

FFPE—formalin fixed, paraffin embedded.

GWS (genome-wide sequencing)—a generic term for the process used to determine the sequence of most, if not all, clinically significant genes and their associated interpretation, including bioinformatic analysis and clinical genotype–phenotype correlation. In the context of clinical GWS, this approach would be undertaken by an appropriately certified laboratory to address a clinical question.

HGVS—Human Genome Variation Society.

Incidental findings—although this term does not have a consensus definition with respect to NGS, this document intends incidental findings to be genetic variant(s) that are located in genes that have been associated with the primary indication for testing, but the impact of the variant is unrelated to the primary indication for testing.

Indel—insertion/deletion variant.

Library—preparation of the target nucleic acid (DNA or RNA) for sequencing, usually involving enrichment or uniform fragmentation, and modification with the addition of 3′ and 5′ adaptors.

Mapping quality scores—a value indicating the probability that a read is aligned to the correct location of a reference sequence.

Mapping fraction—the proportion of reads from an NGS run that align to the reference sequence.

Massively parallel sequencing (MPS; also known as next-generation sequencing, NGS)—high-throughput technologies used to determine nucleotide sequences and genome dosage at numerous loci using a single test, including targeted variant, single gene, targeted gene panels, whole exomes and/or whole genome sequence determination. Although MPS is the more accurate name for the technology, NGS is more commonly used.

Next-generation sequencing—see massively parallel sequencing

NIPT/NIPS (non-invasive prenatal testing, also known as non-invasive prenatal screening)—a prenatal test performed on cell-free nucleic acids from maternal serum or plasma that provides a probability of risk for a fetal aneuploidy or for a specific genetic variant.

PIPEDA—Personal Information Protection and Electronic Documents Act.

QC—quality control.

Q score—see base calling quality score

Read depth—the number of sequence reads at a particular base; each read preferably represents a unique molecule of genomic DNA, although this is dependent on assay design

Read length—the length of the sequence achieved during an NGS run.

Sample barcode (alias index)—a unique sequence added to an individual sample during library preparation, typically used to allow multiple libraries (samples) to be sequenced at the same time.

Sanger sequencing—conventional sequencing method using selective incorporation of chain terminating dideoxy nucleotides.

Secondary findings—although this term does not have a consensus definition with respect to NGS, this document intends secondary findings to be genetic variant(s) that are located in genes other than those that have been associated with the primary indication for testing and may have an impact on health of the patient and family members

SBS (sequencing by synthesis)—a sequencing method based on the identification of single bases as they are incorporated into a newly synthesised DNA strand. Detection most commonly relies on either cyclical removal of reversible terminator molecules or single-nucleotide amplification.

Sequencing coverage (or depth of coverage)—the number of NGS reads that map to the target.

Semiconductor sequencing—a method of sequencing by synthesis based on the detection of hydrogen ions released from nucleotides that are incorporated into a newly synthesised strand. Nucleotides (dATP, dCTP, dGTP or dTTP) are added one at a time across growing nucleotide strands, so that when released hydrogen ions are detected, the nucleotide incorporated can be inferred.

Short read sequencing—massively parallel sequencing of relatively short read lengths (approximately 100 to 600 bp).

SNP—single-nucleotide polymorphism.

SNV—single nucleotide variant.

Target enrichment—selective enrichment of genomic targets from a DNA source prior to sequencing. Enrichment of a DNA preparation for specific targets may occur by either PCR amplification or probe hybridisation.

Validation—testing performed as part of a quality assurance programme to determine performance metrics and document evidence that the assay fulfils the requirements for its intended purpose prior to implementation into service.4

Verification—testing performed as part of a quality assurance programme to confirm that an assay performs as expected according to the performance metrics defined during previous test validation.4

Recommendations for technical procedures: target regions, template preparation and sequence generation

Isolation and quantification of nucleic acids

Guidelines for validation of acceptable types of specimens and extraction methods used for NGS testing currently exist.1 6 7 Laboratories should identify the relevant sample types required for the purpose of the test (eg, blood, buccal cells, cultured cells or formalin-fixed, paraffin-embedded (FFPE) or fresh tissue) and validation should consider issues relevant to all sample types and nucleic acid isolation methodologies used.

Target enrichment and library construction

Recommendation 1: The NGS assay target region shall be defined; those areas that do not meet assay quality metrics shall be tested using an alternate method or removed from the reported target region.

Recommendation 2: Assay quality metrics for successful target enrichment and library construction shall be defined during validation for the specific method and intended use.

Target enrichment is the process by which parts of the genome are selected for sequencing. Currently, three approaches to target enrichment are used in Canadian clinical laboratories: (1) enrichment for all coding sequences within the genome (exome enrichment); (2) enrichment for coding sequences of individual or select genes (gene panels); (3) genomic regions with known clinically relevant variants (hotspot panels). Enrichment strategies may combine approaches, such as targeted gene panels that also contain hotspot variants. A number of variables need to be considered in target enrichment, including but not limited to on-target read fraction, cost per sample, available instrumentation, data storage, enrichment workflow, available analytical tools/bioinformatics resources and risk of incidental findings.

Both amplification-based and hybridisation-based methods for target enrichment are used in Canadian laboratories. Amplification-based target enrichment methods use highly multiplexed PCR primers to generate PCR products of a size amenable to the sequencing platform (eg, 150–400 bp). All PCR methods are susceptible to allele dropout due to the presence of SNVs at primer binding sites, which may cause regions of low read depth and/or unanticipated homozygosity in NGS results. Hybridisation-based target enrichment uses complementary target-specific DNA or RNA oligonucleotide ‘baits’ to hybridise and capture genomic DNA (fragmented enzymatically or via physical shearing). Regions that are difficult to assess using either method include genes with pseudogenes, repetitive regions and regions with high GC content (eg, exon 1 of many genes). Such problematic regions should be tested using an alternate method to backfill, such as Sanger sequencing, or removed from the reported region. Backfill should be considered for critical regions, such as those containing hotspot variants, or when needed to ensure quality and patient safety. Laboratories should define their criteria for when backfill will be done.2

A unique sequence (barcode or index) is often added to DNA fragments from each sample in the library preparation to allow for multiple samples to be sequenced simultaneously. Any two barcodes must differ at more than one base pair so that a single error during synthesis or sequencing cannot convert one barcode into another and cause sequence mis-assignment. Laboratories shall define the quality metrics and thresholds that indicate an optimal run for target enrichment, including details of barcodes if used. Methods should be implemented to ensure sample identity is maintained throughout the NGS process, such as the use of spike-in synthetic DNA standards or a SNP panel.2 8–10
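The barcode requirement above can be checked mechanically. The following is a minimal sketch, assuming all barcodes in a set have equal length and that substitution errors are the only error type considered; the function names and the example barcodes are hypothetical and are not taken from any kit.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def close_barcode_pairs(barcodes, min_distance=2):
    """Return every pair of barcodes that differ at fewer than min_distance positions."""
    flagged = []
    for a, b in combinations(barcodes, 2):
        distance = hamming(a, b)
        if distance < min_distance:
            flagged.append((a, b, distance))
    return flagged


# Hypothetical barcode set: the first two differ at a single base only.
example = ["ACGTAC", "ACGTAA", "TGCATG"]
print(close_barcode_pairs(example))
```

An empty result means every pair differs at two or more positions, so no single substitution can turn one barcode into another.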

Sequence generation

Recommendation 3: Sequence generation data quality metrics shall be defined for each specific application.

The main approach to sequence generation in Canadian clinical laboratories at this time is short-read NGS using sequencing by synthesis (SBS; for review, see Goodwin et al 5). Each SBS platform has specific issues related to sequence generation that must be considered during test validation and implementation. For example, in semiconductor sequencing, the addition of multiple bases in homopolymer regions can lead to problems in the detection of the correct number of nucleotides at those sites; in bridge amplification, the quality of base calling is known to degrade towards the end of reads.

Clinical laboratories shall have a full understanding of the chemistry behind the sequence generation for each NGS platform used, and validation should include assessment of the known platform-specific issues relevant to the test. Appropriate indicators and acceptable thresholds of data quality during NGS sequence generation should be defined, for example, average and minimum read depth, proportion of bases above a specified quality score, percentage of reads with adequate mapping quality, and other parameters and thresholds that define an acceptable run for each specific test.1 6 7 Minimum acceptable read depth for all nucleotides in the target region should be established prior to validation and will depend on the technology used for sequencing, capture method and desired sensitivity of the assay (eg, sensitivity differences for germline vs somatic variant detection).1 6
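As an illustration of how such indicators might be computed, the sketch below converts between Q scores and error probabilities using the relationship given in the definitions (Q = −10 × log10(e)) and summarises read depth and base quality for a run. The threshold values (minimum depth 100, Q30, 80% of bases) are placeholder assumptions only; each laboratory defines its own during validation.

```python
import math


def phred_from_error(e: float) -> float:
    """Q = -10 * log10(e), where e is the probability that a base call is wrong."""
    return -10 * math.log10(e)


def error_from_phred(q: float) -> float:
    """Inverse of the Q score definition: e = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def run_qc(depths, base_qualities, min_depth=100, q_threshold=30, min_fraction=0.80):
    """Summarise a run against placeholder thresholds (assumed values, not recommendations)."""
    fraction = sum(q >= q_threshold for q in base_qualities) / len(base_qualities)
    return {
        "mean_depth": sum(depths) / len(depths),
        "min_depth": min(depths),
        "fraction_at_or_above_q": fraction,
        "passed": min(depths) >= min_depth and fraction >= min_fraction,
    }


# Hypothetical per-base depths and base quality scores.
print(run_qc([250, 180, 120], [35, 32, 30, 12]))
```

Under this definition, Q20 corresponds to an error probability of 1 in 100 and Q30 to 1 in 1000.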

Recommendations for bioinformatic analysis and variant annotation

Sequence alignment

Recommendation 4: Sequence alignment quality metrics shall be identified and thresholds defined for acceptable alignment.

Recommendation 5: Consideration shall be given to reducing the risk of incorrect variant calls by appropriate investigation of genomic regions of known homology, such as pseudogenes.

Sequence reads generated in the primary phase of the analysis pipeline are mapped or aligned to a reference sequence using various alignment algorithms. Alignment tools vary in their ability to accurately align sequence reads. Key quality metrics for alignment (eg, mapping quality scores or % alignment (mapping fraction)) shall be identified and thresholds established to determine acceptability of a sequence alignment for variant calling.
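The two alignment metrics named above can be sketched as follows. This is an illustrative example only: reads are represented as simple (is_mapped, mapping_quality) pairs rather than being parsed from a real alignment file, and the mapping quality cut-off of 20 is an assumed placeholder.

```python
def alignment_metrics(reads, min_mapq=20):
    """Compute mapping fraction and the share of mapped reads with adequate mapping quality.

    reads: iterable of (is_mapped, mapping_quality) pairs (a simplified stand-in
    for records that would normally be read from a SAM/BAM file).
    """
    reads = list(reads)
    mapped_quals = [mapq for is_mapped, mapq in reads if is_mapped]
    mapping_fraction = len(mapped_quals) / len(reads) if reads else 0.0
    if mapped_quals:
        fraction_mapq_ok = sum(q >= min_mapq for q in mapped_quals) / len(mapped_quals)
    else:
        fraction_mapq_ok = 0.0
    return {"mapping_fraction": mapping_fraction, "fraction_mapq_ok": fraction_mapq_ok}


# Hypothetical reads: three mapped (one with low mapping quality), one unmapped.
print(alignment_metrics([(True, 60), (True, 60), (True, 5), (False, 0)]))
```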

During validation, the laboratory should evaluate targeted genomic regions known or expected to cause errors in bioinformatic approaches. Regions with known homology to other genomic regions (eg, pseudogenes), or segmental duplications, should be investigated to prevent potential false-positive variant calls or false-negative results. During validation, laboratories should review the genomic regions included in the NGS assay to identify potential issues in alignment and assess the mapping quality within these regions.8

Variant calling

Recommendation 6: Bioinformatic tools shall be assessed for the ability to reliably detect clinically relevant variant types.

The detection of different types of clinically relevant variants from NGS data (SNVs, indels, CNVs, etc) may require use of bioinformatic tools specific to the variant types of interest. While SNVs are typically identified by alignment to a reference, insertions and deletions require different approaches. With insertion/deletion (indel) variants, there is risk of error in variant identification due to either misalignment of reads to the reference genome, or the discard of reads containing large insertions or deletions that do not align well to the reference sequence. A key aspect of assessing indel variant calling tools is to experimentally determine the largest size of insertion or deletion that can be identified, and then state this as the upper limit of size detection. In addition, bioinformatic tools’ indel variant nomenclature may not follow standard rules as defined by the Human Genome Variation Society (HGVS), such as use of the most 5′ position as opposed to the HGVS recommended most 3′ position of the reference sequence, and this should be considered during validation and reporting.
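The 5′ versus 3′ positioning issue can be illustrated with a deletion in a repeat, where several positions describe the same resulting sequence. The sketch below shifts a deletion to its most 3′ equivalent position. It assumes 0-based coordinates on a reference string written in the same orientation as the description (for a gene on the reverse strand, the 3′ direction of the transcript is the opposite genomic direction); the function name and example sequence are hypothetical.

```python
def shift_deletion_3prime(ref: str, start: int, length: int) -> int:
    """Return the most 3' start (0-based) of an equivalent deletion of `length` bases.

    Moving the deletion one base to the right leaves the resulting sequence
    unchanged whenever the first deleted base equals the base immediately
    after the deleted segment.
    """
    while start + length < len(ref) and ref[start] == ref[start + length]:
        start += 1
    return start


def apply_deletion(ref: str, start: int, length: int) -> str:
    """Sequence that results from deleting `length` bases beginning at `start`."""
    return ref[:start] + ref[start + length:]


ref = "GCATATATGC"          # hypothetical reference containing an AT repeat
left = 2                     # a caller might report the most 5' position
right = shift_deletion_3prime(ref, left, 2)
print(left, right, apply_deletion(ref, left, 2) == apply_deletion(ref, right, 2))
```

Both positions yield the same sequence, so a validation that compares calls only by position could mistake a correct call for a discordant one.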

To identify exon-level CNVs, a common approach is based on the hypothesis that the NGS read depth in a genomic region is correlated to copy number of that region in a sample. This may require comparison of test sample read depth with a reference sample (or pooled reference samples) read depth. Consideration must be given to the size of CNV that can be detected by bioinformatic approaches. For example, certain types of CNVs may be adequately detected, such as large deletions, whereas other types of CNVs, such as those smaller than 1 kb, those in regions of high GC content or duplications, may not be. Laboratories should determine the size limits (upper and lower) of CNVs that can be reliably detected.
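The read depth comparison described above can be sketched as a per-region log2 ratio between a test sample and a reference. This is a simplified illustration, not a validated CNV caller: it normalises each sample by its median region depth, ignores GC content and other known biases, and uses assumed placeholder thresholds (−0.6 and 0.4) for labelling losses and gains.

```python
import math
from statistics import median


def log2_ratios(test_depths, ref_depths):
    """Per-region log2 ratio of median-normalised test depth to reference depth."""
    t_med = median(test_depths)
    r_med = median(ref_depths)
    ratios = []
    for t, r in zip(test_depths, ref_depths):
        ratio = (t / t_med) / (r / r_med)
        # A region with no test reads has ratio 0; report it as minus infinity.
        ratios.append(math.log2(ratio) if ratio > 0 else float("-inf"))
    return ratios


def call_cnv(test_depths, ref_depths, loss_below=-0.6, gain_above=0.4):
    """Label each region using placeholder thresholds (assumed values)."""
    calls = []
    for value in log2_ratios(test_depths, ref_depths):
        if value <= loss_below:
            calls.append("loss")
        elif value >= gain_above:
            calls.append("gain")
        else:
            calls.append("neutral")
    return calls


# Hypothetical exon depths: third exon at half depth, fifth at 1.5 times depth.
print(call_cnv([100, 100, 50, 100, 150], [200, 200, 200, 200, 200]))
```

In a diploid region, a heterozygous deletion is expected near a log2 ratio of −1 and a single-copy duplication near 0.58. The duplication signal is smaller, which is one reason duplications are harder to detect.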

During validation, the laboratory should determine the reliability of detection for each variant type expected, and any associated limitations. In certain cases, laboratories may need to verify variants using orthogonal methods during ongoing clinical testing11 (also see section entitled ‘Target enrichment and library construction’).

Variant annotation and interpretation

Recommendation 7: Laboratories shall use published guidelines for variant classification and interpretation.

Variant annotation, classification and interpretation are not unique for NGS assays; general concepts can be applied to the interpretation of inherited, somatic and mitochondrial genome variants (see table 1). Guidelines for the classification and interpretation of the clinical relevance of inherited12 and somatic13 14 variants have been published. The guidelines in Richards et al 12 have been endorsed by the Canadian College of Medical Geneticists (CCMG). Guidelines for the interpretation and classification of mitochondrial genome variants are less developed, although comments are included within.12 Although a specific guideline for interpretation of CNVs from NGS data does not currently exist, if a CNV is detected and the approximate CNV breakpoints can be ascertained, the guideline for interpretation of a constitutional CNV identified by chromosomal microarray technologies can be applied to NGS.15 Guidelines for classification of CNVs recommend consideration of all the genes in the maximum CNV interval; depending on the NGS assay design, determining the maximum size of a CNV may not be possible, particularly for targeted panels that may not include neighbouring genes.

Table 1

Concepts relevant to the interpretation of variants for inherited genetic disorders, acquired somatic cancers and mitochondrial genome disorders

Analysis of variant allele frequency

Recommendation 8: For inherited disorders, laboratories shall define the variant allele frequency range corresponding to the heterozygous and homozygous state.

Recommendation 9: For acquired cancer or disorders of the mitochondrial genome, laboratories shall define the lower limit of variant allele frequency detection.

Recommendation 10: For acquired cancer or disorders of the mitochondrial genome, laboratories shall define the precision of the assay across the clinically relevant range of expected variant allele frequencies.

NGS clinical testing must detect variants at clinically relevant variant allele frequencies (or fraction) specific to the disorder being tested. A significant factor affecting variant allele frequency is the number of unique sequence reads at a particular base pair, with higher numbers of unique sequence reads (higher read depth) enabling improved detection of variants at low variant allele frequencies.
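The link between read depth and detection of low-frequency variants can be made concrete with a simple binomial model. The sketch below assumes each unique read samples the variant allele independently with probability equal to the variant allele frequency, ignores sequencing error and allele-specific bias, and uses a hypothetical requirement of at least five supporting reads.

```python
from math import comb


def prob_at_least(k: int, depth: int, vaf: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, vaf): chance of seeing k or more variant reads."""
    below = sum(comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k))
    return 1.0 - below


# Probability of observing at least 5 variant reads for a variant at 1% allele frequency.
for depth in (100, 500, 1000):
    print(depth, round(prob_at_least(5, depth, 0.01), 3))
```

Under these assumptions the probability rises steeply with depth, which is why assays for low-level somatic or heteroplasmic variants need much higher read depth than germline assays.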

Validation of NGS for inherited disorders shall ensure both heterozygous and homozygous variant detection, and the limits of the variant allele frequency for each zygosity. In some cases, lower-than-expected variant allele frequency may be detected, which in the context of inherited disease could be indicative of somatic mosaicism or mapping to pseudogenes or segmental duplications. The laboratory should develop an internal policy regarding their approach to potential somatic mosaicism in inherited disorder testing. When relevant, clinical reports should state whether mosaicism would be detected and the estimated lower limit of variant allele frequency detectable by the assay.
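A laboratory-defined allele frequency range for each zygosity can be applied as a simple classification step. The sketch below is illustrative only: the heterozygous range (0.30 to 0.70) and homozygous minimum (0.90) are assumed placeholder values, and anything outside them is flagged for review rather than interpreted, since it could reflect mosaicism or a mapping artefact.

```python
def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a position that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def classify_zygosity(vaf: float, het_range=(0.30, 0.70), hom_min=0.90) -> str:
    """Assign zygosity from placeholder ranges; out-of-range values need manual review."""
    if vaf >= hom_min:
        return "homozygous"
    if het_range[0] <= vaf <= het_range[1]:
        return "heterozygous"
    return "review"


# Hypothetical read counts at three positions, each with 100 total reads.
for alt in (48, 99, 12):
    print(alt, classify_zygosity(variant_allele_fraction(alt, 100)))
```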

For acquired cancers and disorders of the mitochondrial genome, variants may be detected at variant allele frequencies ranging from 1% to 100%. Thus, these assays require definition of the reliable lower limit of variant allele frequency detection. In some cases, verification of variants detected at low levels by NGS may be required to distinguish them from instrument errors, and laboratories should define the scenarios in which additional verification is necessary.11 In addition, the precision of the NGS assay at detecting variant allele frequencies across the clinically relevant range should be assessed. This can be achieved by replicate testing of samples with various allele frequencies as previously identified by orthogonal methods or by use of a dilution series of DNA from two cell lines.6 16
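One way to summarise precision from replicate testing is the mean, standard deviation and coefficient of variation of the measured variant allele frequency at each dilution level. The sketch below is a minimal illustration with made-up replicate values; it does not prescribe an acceptance criterion, which remains for the laboratory to define.

```python
from statistics import mean, stdev


def vaf_precision(replicates):
    """Mean, sample standard deviation and coefficient of variation (%) of replicate VAFs."""
    m = mean(replicates)
    s = stdev(replicates)  # requires at least two replicates
    return {"mean": m, "sd": s, "cv_percent": 100 * s / m}


# Hypothetical replicate measurements at two expected allele frequencies.
for expected, measured in {0.05: [0.04, 0.06, 0.05], 0.50: [0.49, 0.51, 0.50]}.items():
    print(expected, vaf_precision(measured))
```

Relative variation is typically larger at low allele frequencies, so precision is best reported separately for each level across the clinically relevant range.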

For acquired cancers, if the allelic frequency for a variant suggests heterozygosity (expected near 50%, although a range exists) or homozygosity (expected near 100%) in a gene with known hereditary cancer association, then genetic counselling and risk assessment for an inherited disorder should be recommended. Note that the test methods and starting material may affect the allelic frequency for variants identified in somatic cancer tissue.

Data storage

Recommendation 11: Laboratories shall retain variant call format (VCF) files analogous to other data interpreted to generate the final clinical report. Where no local retention standards exist, the VCF shall be retained for at least 2 years. Strong consideration should be given to retaining some form of the raw data for a defined period of time.

Recommendation 12: Laboratories shall ensure data storage (including cloud storage if used) complies with Canadian federal and provincial privacy legislation.

Canadian clinical laboratory data retention is distinct from other jurisdictions given national and provincial privacy legislation and regulation, and provincial accreditation standards. Thus, it is important to define the specific files from NGS testing that should be retained for potential future re-analysis. Approximate sizes of various NGS files are described in He et al.17 The raw sequence data generated by an NGS instrument (FASTQ file) contains sequence reads and associated per base quality score.18 During secondary analysis, the Sequence Alignment/Map (SAM) file is produced, which stores alignments of reads against the reference sequence.19 A companion file is the Binary Alignment/Map (BAM), which retains the same information as the SAM file in a binary representation. The final significant file format is the VCF file, which is a generic format used to store variant information (position, nucleotide observed) and can contain associated annotations (eg, minor allele frequency in a control population, location relative to a gene, HGVS description) as generated by the bioinformatic pipeline.20
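To show what a retained VCF record contains, the sketch below parses one tab-delimited VCF data line into its eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) and splits the INFO column into key/value annotations. The example record is invented for illustration, and a production pipeline would normally use an established VCF library rather than hand-written parsing.

```python
def parse_vcf_line(line: str) -> dict:
    """Parse the eight fixed columns of a single VCF data line (not a header line)."""
    chrom, pos, vid, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
    annotations = {}
    for item in info.split(";"):
        if item in ("", "."):
            continue
        key, sep, value = item.partition("=")
        # INFO entries without "=" are flags and are stored as True.
        annotations[key] = value if sep else True
    return {
        "chrom": chrom,
        "pos": int(pos),          # VCF positions are 1-based
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),    # several alternate alleles may be listed
        "qual": qual,
        "filter": flt,
        "info": annotations,
    }


# Invented example record; values are illustrative only.
record = parse_vcf_line("chr1\t12345\t.\tA\tG\t50\tPASS\tDP=100;AF=0.5;DB")
print(record["chrom"], record["pos"], record["ref"], record["alt"], record["info"])
```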

Decisions regarding retention of files should consider the patient context and legal obligations. For example, testing of minors may require longer retention times than for adults, and testing for inherited disorders with familial implications may require a longer retention than acquired disease testing. Although VCF files range in size, they are small in relationship to the FASTQ and BAM files (see He et al 17), and are amenable to long-term storage. Laboratories shall retain the VCF file, as a record of variants at the time of reporting, analogous to other final laboratory data interpreted when generating the clinical report, and in accordance with local practices and provincial guidelines; at minimum, the VCF shall be retained for 2 years. Consideration should be given to retaining some form of raw data (eg, BAM files or compressed versions thereof) for a defined period, for example, one cycle of proficiency testing.

Cost–benefit analysis should be performed to determine the most appropriate retention plan for FASTQ/SAM/BAM files. While retention of FASTQ files potentially allows re-analysis using current or improved pipelines, the large size can be prohibitive for storage. The smaller-sized BAM files (or a compressed version such as CRAM21) may be useful to retain in lieu of FASTQ files for a period post-reporting, if desired, in order to be able to review the individual reads used at time of reporting. However, if a laboratory intends to provide re-analysis of data at a later date (eg, re-analysis of exome data), it may be necessary to retain the FASTQ files. Alternatively, given the high cost of long-term storage of FASTQ/SAM/BAM files, it may be more cost-effective to re-test the patient in the future, when or if necessary.

Use of cloud servers is an option for data analysis and storage.22 For any storage of files with genetic data linked to personally identifying information, laboratories shall ensure storage (including cloud storage) complies with Canadian federal (PIPEDA) and provincial privacy legislation.

Recommendations for test validation or verification

General issues for validation or verification

Recommendation 13: Validation or verification of NGS assays shall encompass the complete end-to-end process, including the wet-laboratory steps and data analysis pipeline.

Recommendation 14: Validation or verification is required when modifying a previously validated NGS assay, and should be appropriate to the extent of the modification.

Validation of an NGS assay, which confirms that the requirements for a specific intended use have been fulfilled, shall encompass the end-to-end process, including the wet-laboratory steps and the data analysis pipeline. In contrast, verification is typically limited to assessing the in-house performance of a validated (eg, Health Canada or FDA approved) in vitro diagnostic medical device as supplied by a manufacturer, or to confirming that an assay meets defined quality metrics after a minor test alteration.

The selection of validation samples will depend on the nature of the genetic test and should include those of the same tissue or tumour type that will be tested by the clinical assay. Other sources of validation samples can include well-characterised cell lines to provide additional variants within the target region.23 In somatic variant detection, control samples with a range of variant allele frequencies for target regions should be used.6 24 Specific aspects of NGS relevant to validation are also provided in sections ‘Recommendations for technical procedures’ and ‘Recommendations for bioinformatic analysis and variant annotation’.

Laboratories should give consideration to the extent of validation or verification needed when changing a previously validated NGS assay, such as addition of a gene to a previously validated panel or changing a version of a single bioinformatic pipeline tool. The validation or verification design should determine the test aspects at risk due to the impending change, and appropriately assess those modified aspects throughout the NGS end-to-end process.

Estimating analytical sensitivity and specificity

Recommendation 15: Validation of large panel or genome-wide NGS assays shall include at least 60 variants, including at least 10 variants of each specific variant type to be detected by the clinical assay.

Recommendation 16: Validation of genome-wide NGS assays should include the use of well-characterised samples for which consensus variants are known.

Recommendation 17: The minimum read depth required for a desired sensitivity should be established for each assay.

Two factors must be considered when calculating analytical sensitivity: the variant types that can be detected by the instrument and the minimum read depth required to ensure a variant would be detected. The number of variants included in the validation will influence the statistical confidence of the calculated analytical sensitivity. A guideline for estimating the power of a DNA sequencing validation study is provided in Mattocks et al.25 A sample size of 60 variants is estimated to demonstrate a sensitivity of at least 95% (at a 95% confidence level) when all 60 variants are detected by the new technology, whereas a sample size of 300 variants raises the demonstrable sensitivity to at least 99% (at a 95% confidence level). A similar calculation specific to oncology applications is provided in Jennings et al,6 which calculates that to be 95% confident with at least 95% reliability, the minimum number of variants for validation is 59. Thus, it is recommended that validation of a methodology cover each variant type (eg, SNVs, indels), with at least 60 variants in total. Given the difficulty in obtaining certain variant types (such as indel variants), it is recommended that, of the overall 60 variants, a minimum of 10 variants of each clinically relevant type are assayed. A guideline for calculating CIs for analytical sensitivity is provided by Rehm et al.1 If the targeted number of variants cannot be reached during validation, for example, due to the rarity of certain variant types, then verification of additional variants by an orthogonal method should continue during clinical implementation until this target is reached.
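
The sample sizes quoted above can be reproduced with a short calculation. The following is a minimal sketch, assuming that every validation variant is detected and that detections are independent with a common per-variant detection probability; the function names are illustrative and are not taken from Mattocks et al or Jennings et al. If the true sensitivity were only p, the chance of detecting all n variants would be p^n, so the smallest n with p^n no greater than 1 minus the confidence level is the minimum sample size.

```python
import math


def min_variants_for_validation(reliability: float, confidence: float) -> int:
    """Smallest n such that detecting n of n variants supports a sensitivity
    of at least `reliability` at the given one-sided confidence level.

    Assumption: independent detections with equal detection probability.
    """
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(reliability))


def lower_bound_all_detected(n: int, confidence: float) -> float:
    """One-sided lower confidence bound on sensitivity when n of n are detected."""
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n)


if __name__ == "__main__":
    print(min_variants_for_validation(0.95, 0.95))  # 59
    print(min_variants_for_validation(0.99, 0.95))  # 299
    print(round(lower_bound_all_detected(60, 0.95), 4))
```

Under these assumptions the calculation returns 59 variants for 95% reliability and 299 for 99%, in line with the figures of 59, 60 and 300 cited in the text. Any missed variant during validation invalidates the all-detected assumption, in which case a full binomial CI should be calculated instead.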

As the variants detected during the validation will likely not be in those regions of lower read depth, the sensitivity of the assay at decreasing read depths must be ascertained. A common method for decreasing the read depth involves downsampling (sometimes referred to as subsampling) the sequence of well-characterised samples to fractions of the original depth.23 The rate of detection for the well-established variants can then be assessed at different read depths.26 Once a given threshold is determined, the laboratory can confirm the number of samples that can be pooled while still achieving the minimum read depth.
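
One way to summarise a downsampling experiment is sketched below. It assumes the laboratory has already recorded, for each known variant in each downsampled dataset, the read depth bin and whether the variant was called; the data layout, function names and the 99% target in the usage lines are hypothetical and would need to be replaced with the laboratory's own definitions.

```python
from collections import defaultdict


def detection_rate_by_depth(observations):
    """Return {depth_bin: fraction of known variants detected}.

    `observations` is an iterable of (depth_bin, detected) pairs, one per
    known variant per downsampled dataset.
    """
    counts = defaultdict(lambda: [0, 0])  # depth_bin -> [detected, total]
    for depth_bin, detected in observations:
        counts[depth_bin][1] += 1
        if detected:
            counts[depth_bin][0] += 1
    return {d: hits / total for d, (hits, total) in sorted(counts.items())}


def minimum_depth(rates, target):
    """Lowest depth bin such that it and every higher bin meet `target`.

    Returns None when even the highest depth bin falls short.
    """
    chosen = None
    for depth_bin in sorted(rates, reverse=True):
        if rates[depth_bin] >= target:
            chosen = depth_bin
        else:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical results: (depth bin, variant detected?)
    obs = [(10, True), (10, True), (10, True), (10, False),
           (20, True), (20, True), (20, True), (20, True),
           (30, True), (30, True), (30, True), (30, True)]
    rates = detection_rate_by_depth(obs)
    print(rates)
    print(minimum_depth(rates, target=0.99))
```

Requiring every higher depth bin to meet the target as well guards against selecting a low threshold on the strength of a single favourable bin. With few variants per bin the observed rates carry wide CIs, so the threshold chosen this way should be treated as provisional.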

In estimating specificity, it is necessary to calculate the proportion of variants identified by the NGS assay that are not present in the validation samples (ie, false-positive calls). There are a number of issues that can overestimate the number of false-positive calls including (1) variants arising during culturing of cell lines, (2) differences in sensitivity for detecting mosaic variants between NGS and other methods such as Sanger sequencing, and (3) allele dropout during amplification-based methods. The acceptable test specificity will vary between clinical applications and depend on whether variants are validated by another method prior to clinical reporting.
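
The comparison of assay calls against a validation truth set can be sketched as follows. This is an illustration only: variants are represented as hypothetical (chromosome, position, reference allele, alternate allele) tuples that are assumed to be normalised identically in both sets, and the function name is not drawn from any cited tool. The sketch reports the proportion of calls that are false positives, as described above; a specificity figure in the strict sense would additionally require the number of assessed positions at which no variant is present.

```python
def compare_calls(truth, calls):
    """Compare assay calls with a truth set of known variants.

    Both arguments are sets of (chrom, pos, ref, alt) tuples.
    """
    true_pos = truth & calls
    false_neg = truth - calls
    false_pos = calls - truth
    sensitivity = len(true_pos) / len(truth) if truth else None
    fp_proportion = len(false_pos) / len(calls) if calls else None
    return {
        "true_positives": len(true_pos),
        "false_negatives": len(false_neg),
        "false_positives": len(false_pos),
        "sensitivity": sensitivity,
        "false_positive_proportion": fp_proportion,
    }


if __name__ == "__main__":
    # Hypothetical variants for illustration
    truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 300, "G", "A"), ("chr3", 400, "T", "C")}
    calls = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 300, "G", "A"), ("chr4", 500, "A", "T")}
    print(compare_calls(truth, calls))
```

Apparent false positives identified this way should be reviewed against the three sources of overestimation listed above before they are counted against the assay.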

The experimental design for validation studies will depend on the variant types that are known to be clinically relevant. For example, it is recommended that recurrent variants known to contribute to a significant proportion of diagnoses for a specific disorder should be included in samples tested during validation. Alternatively, if this is not feasible, genomic regions harbouring these variants should meet minimum quality parameters to ensure variant detection. Common variants not associated with disease can be included in the validation metrics since the performance characteristics for detection of a specific variant type (eg, SNV) are anticipated to be equivalent whether the variant is pathogenic or benign.

Development and validation of clinical exome or whole-genome sequencing tests should follow the same general principles as gene panel testing. Additionally, validation samples should include well-characterised cell lines.23 In calculating the sensitivity and specificity of the exome or whole-genome sequencing, the laboratory should have parameters in place to flag and exclude variants from genomic regions with repetitive sequences (eg, segmental duplications) and targets with significant homology in other regions of the genome (eg, pseudogenes) that can result in false-positive and false-negative results (also see the section entitled ‘Recommendations for technical procedures’).
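
Flagging variants that fall in problematic regions before calculating performance metrics might look like the sketch below. It assumes the laboratory maintains its own list of regions (eg, segmental duplications or pseudogene-homologous targets) as BED-style intervals with a 0-based start and exclusive end, and that variant positions are 1-based as in VCF; the function names and coordinates are hypothetical.

```python
def in_flagged_region(chrom, pos, regions):
    """True if a 1-based position lies inside any BED-style interval.

    `regions` maps chromosome name to a list of (start, end) pairs with a
    0-based start and exclusive end, so position p is inside when
    start < p <= end.
    """
    return any(start < pos <= end for start, end in regions.get(chrom, []))


def partition_variants(variants, regions):
    """Split variants into (kept, flagged) lists by region membership.

    Each variant is a tuple whose first two items are (chrom, pos).
    """
    kept, flagged = [], []
    for variant in variants:
        if in_flagged_region(variant[0], variant[1], regions):
            flagged.append(variant)
        else:
            kept.append(variant)
    return kept, flagged


if __name__ == "__main__":
    # Hypothetical region list and variants
    regions = {"chr1": [(100, 200)]}
    variants = [("chr1", 100), ("chr1", 101), ("chr1", 200),
                ("chr1", 201), ("chr2", 150)]
    kept, flagged = partition_variants(variants, regions)
    print(kept)
    print(flagged)
```

Keeping the flagged variants in a separate list, rather than discarding them, leaves a record of what was excluded from the sensitivity and specificity calculations.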

Recommendations for ongoing quality assurance

General quality assurance

Once an NGS assay has been clinically implemented, the reagents, equipment and software used in the NGS assay are subject to ongoing quality assurance (QA) to ensure test performance and data integrity, as with all clinical assays. Many of the QA requirements will be the same regardless of the specific clinical application or methodological approach. Quality assurance practices, such as proficiency testing and appropriate controls, can be applied to the method or technique rather than for each specific gene within the assay. Note that specific quality assurance metrics and thresholds are described in sections ‘Recommendations for technical procedures’ and ‘Recommendations for bioinformatic analysis and variant annotation’.

Assay controls

Recommendation 18: All NGS assays should use appropriate measures to assess for potential contamination.

Recommendation 19: Sensitivity controls shall be included to ensure the lower limit of detection is maintained, as applicable.

Measures should be taken to prevent and detect contamination. For example, this may be done by using a no-template control in the steps of the protocol where the risk of contamination is greatest (eg, in library preparation up to the point of sequencing). Another example is the use of a bioinformatics pipeline step to monitor low-frequency variants in inherited disease testing. Other approaches may be valid. Preventative measures may also be useful, for example, the use of alternating barcodes between runs.
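
A pipeline step that monitors low-frequency variants in germline testing could take the form sketched below. The premise is that, in an uncontaminated germline sample, variant allele fractions cluster near 0.5 (heterozygous) and 1.0 (homozygous), so an excess of calls outside those bands may indicate contamination. The band limits and the 10% alert threshold are placeholder assumptions rather than values from these guidelines, and each laboratory would need to establish its own during validation.

```python
def off_band_fraction(allele_fractions, het_band=(0.3, 0.7), hom_min=0.9):
    """Fraction of called variants whose allele fraction falls outside the
    expected heterozygous band and below the homozygous minimum.

    The default band limits are illustrative assumptions only.
    """
    fractions = list(allele_fractions)
    if not fractions:
        return 0.0
    off_band = [
        af for af in fractions
        if not (het_band[0] <= af <= het_band[1] or af >= hom_min)
    ]
    return len(off_band) / len(fractions)


def flag_possible_contamination(allele_fractions, max_off_band=0.1):
    """True when the off-band fraction exceeds the (hypothetical) threshold."""
    return off_band_fraction(allele_fractions) > max_off_band


if __name__ == "__main__":
    clean = [0.5, 0.48, 1.0, 0.52, 0.99]
    suspect = [0.5, 0.12, 0.15, 1.0, 0.85, 0.1]
    print(flag_possible_contamination(clean))
    print(flag_possible_contamination(suspect))
```

A check of this kind is not appropriate for somatic or mitochondrial testing, where low allele fractions are expected, and mosaicism or copy number changes can also produce off-band calls in germline samples.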

In some scenarios, such as somatic or mitochondrial genome testing, the laboratory should establish a schedule to test a sensitivity control to ensure the validated lower limit of detection is maintained.6 It is recommended that these controls be selected to represent specific variants or types of variants to regularly verify assay performance.

Bioinformatics ongoing quality assurance

Recommendation 20: Laboratories shall establish a procedure to implement and track software versions and monitor for updates.

Recommendation 21: Reference sequences and databases used should periodically be reviewed to ensure appropriate versions are in use.

The laboratory must establish a procedure to monitor the availability of software updates and, when an applicable update is identified, establish criteria for implementation. This procedure may include a review to assess the nature of the version change (critical, useful or unnecessary), associated risks, impact, and need for validation or verification. The decision to perform either validation or verification should be based on the extent of the changes and on the potential risk of errors they introduce (also see section entitled ‘Recommendations for test validation or verification’). If validation or verification is necessary, any changes to default settings shall be documented, using version control. It may also be necessary to validate or verify other components, or the entire pipeline, depending on the context of the overall bioinformatics process. As necessary, validation or verification of software updates can be performed using established synthetic electronic datasets, archived data or through analysis of biological samples.8

In addition to updates to bioinformatic software, reference sequences and databases used by software are also subject to updates. It is recommended that the reference sequences or databases used be periodically reviewed to ensure use of the appropriate version, and that any changes are documented through version control.
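
Tracking of software, reference sequence and database versions can be supported by comparing the versions in use against the manifest recorded at validation. The sketch below is one possible approach; the component names and version strings are hypothetical, and in practice the manifest itself would be held under version control.

```python
def find_version_drift(validated, current):
    """Return {component: (validated_version, current_version)} for every
    component whose version differs, was added or was removed.

    Both arguments map a component name to a version string.
    """
    drift = {}
    for name in sorted(set(validated) | set(current)):
        old = validated.get(name)
        new = current.get(name)
        if old != new:
            drift[name] = (old, new)
    return drift


if __name__ == "__main__":
    # Hypothetical manifests for illustration
    validated = {"aligner": "1.0", "variant_caller": "2.3",
                 "reference_genome": "GRCh37"}
    current = {"aligner": "1.0", "variant_caller": "2.4",
               "reference_genome": "GRCh37", "annotator": "5.1"}
    for component, (old, new) in find_version_drift(validated, current).items():
        print(f"{component}: validated={old} current={new}")
```

Each reported difference would then feed into the laboratory's review of whether validation or verification is needed before the change is accepted into clinical use.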

Ongoing evaluation and updating of NGS assays

Recommendation 23: Laboratories shall define procedures for periodic evaluation of the clinical utility of each targeted NGS assay.

Recommendation 24: If changes are made to the genes analysed in an NGS assay, laboratories shall communicate these gene changes to clinical stakeholders.

Recommendation 25: Laboratories shall only review the classification of a previously reported variant at the request of a healthcare provider acting on behalf of the patient.

To evaluate the ongoing clinical utility of targeted NGS assays, laboratories may undertake literature reviews or consult with clinical colleagues or other experts in order to determine the continued effectiveness of assays. As with all laboratory tests, assay changes should be communicated to clinical stakeholders. The addition of new genes to panels, or a change in filters, will not automatically require the laboratory to reanalyse or issue a new report for individuals analysed prior to the change. The laboratory should require a specific request from a referring healthcare provider to reanalyse data, retest a sample or reissue reports, and details of this process should be contained within a laboratory policy.

Reassignment of variant pathogenicity is not unique to NGS, but does occur more frequently with NGS due to the breadth of the analysis and existing gaps in knowledge regarding variant impact on protein structure and function. Reassignment of variant classification may raise concerns regarding recontact of impacted patients/families. Laboratories should develop and disseminate policies regarding re-analysis of variant classification, in consultation with healthcare providers, hospital authorities, and provincial ministries of health and/or regulatory bodies. A useful overview is provided in Deignan et al.27

Tests and clinical issues

Appropriate test usage

When weighing the impact (costs and benefits) of a clinical assay on the healthcare system, the total cost of analysis should be considered, which may include reflex testing or cascade family testing including prenatal diagnosis. Although NGS has led to improved diagnostic success in comparison with conventional methods, an NGS assay may include newly emerging genes which have little related clinical knowledge or evidence, the reporting of which may result in a high number of variants of unknown clinical significance (VUS). The laboratory should use an evidence-based approach to identify and report variants only from those genes with sufficient evidence for involvement in the investigated disorder. As well, consideration should be given to the types of variants most likely to be detected given the clinical question and the appropriate methodology to detect these variants.

When assessing the cost of a genome-wide test, the assessment shall include consideration of the time involved in assessing/interpreting variants identified, including incidental findings, and the cost of both pre-test and post-test genetic counselling. In general, at this time, when there is a high clinical suspicion for a specific disorder with a well-defined phenotype or known set of relevant genes, analysis of a targeted set of genes is likely to provide a more cost-effective approach. In some cases, an in silico gene panel using a subset of data generated from sequencing of a larger genomic region may be an option. In this case, cost analyses should include the need for orthogonal testing to ensure adequate coverage of key genes in the case of a negative test.

Clinical interpretation and reporting NGS results

Interpretation and reporting of NGS results are not fundamentally different from other genetic results, and reporting should match best practice standards already in existence for other large-scale clinical tests such as microarray. Detailed information on the overall analysis should be available on inquiry, but test reports should aim to be concise. Variant reporting should conform to HGVS nomenclature and ensure that the reference transcripts and genome builds used are clearly documented.

The following information should be provided on the clinical report or available in other formats (eg, versioned information available online) for ordering healthcare providers, as applicable to the specific assay performed:

  • Targeted genomic regions meeting QC metrics;

  • Targeted genomic regions not meeting QC metrics;

  • Orthogonal methods used to augment regions in which QC metrics were not met;

  • Minimum read depth at which data were accepted for reporting;

  • Sensitivity of the assay, as determined by the minimum read depth;

  • Types of variants detectable (eg, SNVs, indels, CNVs);

  • Maximum size of insertion and deletion variants detectable;

  • Lowest variant allele frequency detectable by the assay (lower limit of detection);

  • Limitations, including those for detection of variants in homologous, repetitive or GC-rich regions.

Data sharing within the genetics community is encouraged, as this is critical for the continued understanding of novel variants found through diagnostic testing. When contributing variants to databases, laboratories should have an understanding of the curation, and comply with privacy legislation and regulations.

Professionals with final interpretation and reporting responsibilities for NGS shall have appropriate credentials and privileges for oversight of NGS testing and interpretation of NGS data. As NGS is an emerging technology impacting all areas of laboratory medicine, Canadian training programmes should consider incorporating competencies and training requirements for specialty-specific NGS applications into their curricula.

Incidental and secondary findings

Recommendation 26: Laboratories shall define and disseminate policies regarding identification and reporting of incidental or secondary findings.

NGS assays may identify incidental or secondary findings. The CCMG provides clear guidelines with respect to incidental findings in the context of inherited disorders.28

For acquired cancers, testing of tumour tissue for somatic variants important for drug therapies may detect germline cancer predisposition variants relevant to the patient’s cancer.29 In the absence of germline testing, definitive information about a potential germline variant cannot be stated from tumour-only analysis. Laboratories should define the results on tumour NGS tests that may indicate a germline variant is suspected and establish a policy on the follow-up approach for these cases. Follow-up may include contacting the referring physician to suggest a referral to clinical genetics, or a statement on the report that genetic counselling is recommended for the specific gene or variant.

Conclusions

In these guidelines, we present recommendations for the use of NGS in Canadian clinical genetic laboratories. The guidelines encompass technical aspects, reporting issues and management of NGS data within the Canadian public healthcare system, with consideration of Canadian privacy legislation. The aim of defining and recommending Canadian guidelines for NGS is to provide national standards for clinical laboratories that are endorsed by the CCMG. We envision that these recommendations will serve as a reference for key issues that should be considered by Canadian practitioners using NGS services originating outside of Canada and provide a resource to Canadian laboratory accreditation bodies developing NGS standards.

Acknowledgments

The authors would like to thank the CCMG Board of Directors and CCMG members who reviewed the document and provided useful comments.

Footnotes

  • SH and TNN contributed equally.

  • Contributors: TLS conceived the project, assembled the Ad Hoc Working Group and coordinated the group activities. SH, TNN, MS, EM, RA, HF, JP, DJS, ST and TLS contributed to document planning, participated in discussions, and wrote and reviewed document content. SH, TNN and TLS also reviewed comments from the CCMG membership, made revisions based on comments and performed an overall edit of the final document. All authors provided approval of the final version of the document.

  • Funding: The authors would like to thank the Canadian College of Medical Geneticists for providing administrative support for document circulation among the membership.

  • Competing interests: None declared.

  • Patient consent for publication: Not required.

  • Ethics approval: Approved by the CCMG Board of Directors: 16 January 2019.

  • Provenance and peer review: Not commissioned; externally peer reviewed.