The vocabulary currently used to describe genetic variants and their consequences reflects many years of studying and discovering monogenic disease with high penetrance. With the recent rapid expansion of genetic testing brought about by wide availability of high-throughput massively parallel sequencing platforms, accurate variant interpretation has become a major issue. The vocabulary used to describe single genetic variants in silico, in vitro, in vivo and as a contributor to human disease uses terms in common, but the meaning is not necessarily shared across all these contexts. In the setting of cancer genetic tests, the added dimension of using data from genetic sequencing of tumour DNA to direct treatment is an additional source of confusion to those who are not experienced in cancer genetics. The language used to describe variants identified in cancer susceptibility genetic testing typically still reflects an outdated paradigm of Mendelian inheritance with dichotomous outcomes. Cancer is a common disease with complex genetic architecture; an improved lexicon is required so that scientists, clinicians and patients can better communicate the risks and implications of the genetic variants detected. This review arises from a recognition of, and discussion about, inconsistencies in vocabulary usage by members of the ENIGMA international multidisciplinary consortium focused on variant classification in breast-ovarian cancer susceptibility genes. It sets out the vocabulary commonly used in genetic variant interpretation and reporting, and suggests a framework for a common vocabulary that may facilitate understanding and clarity in clinical reporting of germline genetic tests for cancer susceptibility.
- genetic variant
- cancer susceptibility
The Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium is an international effort focused on determining the clinical significance of variants in breast-ovarian cancer genes. In addition, ENIGMA provides expert opinion to global classification and database initiatives, notably ClinGen (Clinical Genome Resource; https://www.clinicalgenome.org/), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) and the BRCA-Exchange (http://brcaexchange.org/). ENIGMA also explores optimal avenues of communication of such information at the provider and patient level. Importantly, most members (65%) conduct research and clinical activities in a language other than English (see online supplementary text).
ENIGMA research initially focused on improvement of methods to classify BRCA1 (MIM113705) and BRCA2 (MIM600185) variants associated with typical ‘high’ risk of cancer,1 with subsequent investigations identifying BRCA1/2 variants associated with demonstrably lower cancer risks.2 3 The inclusion of multicancer syndrome and novel breast-ovarian cancer susceptibility genes on research and commercial cancer gene panels has expanded the scope of ENIGMA investigations. Four consecutive ENIGMA consortium meetings have included dedicated time to discuss appropriate terminology for describing genetic variants, and their relationship to risk of different cancer types, and implications for clinical management. In particular, as genetic test ordering has moved outside the traditional hereditary cancer clinic setting into mainstream oncology, concern has been raised regarding misinterpretation of variant pathogenicity descriptions—even for well-characterised genes like BRCA1/2.4
ENIGMA members spanning all ENIGMA working groups have developed a document that provides an overview of different terms used in scientific and clinical reports, and by relevant international bodies, to describe various aspects of sequence variation in cancer predisposition genes. This exercise revealed alternative usage for many terms, interchangeable use of terms, and the potential for misinterpretation of the actionability of variants. We sought feedback from the general ENIGMA membership, by circulation of a draft discussion document and presentation at three consecutive consortium meetings, regarding their views on which terms may be most appropriate for promotion as preferred terminology in ENIGMA documentation, research projects and manuscripts. Discussions highlighted in particular the complexities of describing variant association with cancer risk in the context of multigene panel tests. Such tests may include genes for which ‘pathogenic’ variants are associated with varying levels of risk for different cancer types. In addition, even for specific genes with well-established hereditary cancer risk profiles, some variants may be associated with altered cancer penetrance compared with the ‘average pathogenic’ variant for that gene. Different terms in use were considered by ENIGMA members attending the June 2018 Consortium Meeting, to reach consensus about the least ambiguous terms for clinical reporting. We provide some general recommendations for terminology to describe cancer susceptibility gene variation and its relationship to risk. We also propose a multitier structure for reporting cancer susceptibility variants, to improve the understanding of level of cancer risk associated with an identified variant and appropriate clinical actionability given patient presentation.
The need for standardised terminology and definitions for describing sequence variation, focused on inherited variants
Online supplementary table 1 summarises terms used to describe sequence variants, and their association with or relevance to disease, and to patient clinical management. The information was derived from a combination of knowledge from the literature, usage in verbal and written project reporting across ENIGMA, in clinical reports generated or viewed by ENIGMA members and documentation/terms described by the Human Variome Project (HVP; http://www.humanvariomeproject.org), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) and International Society for Gastrointestinal Hereditary Tumours (InSiGHT; https://www.insight-group.org/). The content was presented to ENIGMA members at several consecutive consortium meetings, and also circulated in document form, to invite feedback and additions. While not claiming to be an exhaustive list of terms and their meanings, it is clear that a single term/phrase can be used to describe different aspects relating to a variant (different intent), and that multiple terms can describe just one aspect (same usage). In some instances, differences in terminology appeared to depend on the field of research, and the context in which a variant is identified. 
Notably, the term ‘pathogenic variant’ is used to describe a germline disease-causing variant in a Mendelian disease gene classified according to criteria from the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP)5 or International Agency for Research on Cancer (IARC).6 It has also been described as a ‘sequence variant that contributes mechanistically to disease but is not necessarily fully penetrant, that is, may not be sufficient in isolation to cause disease’ in the context of assessing support of disease causality of variants identified by high-throughput sequencing.7 Moreover, a germline ‘pathogenic variant’ considered causal for disease risk is commonly termed a ‘mutation’ in the historical and even current literature, and in the medical management (National Comprehensive Cancer Network, www.nccn.org; National Institute for Health and Care Excellence, https://www.nice.org.uk; EviQ, https://www.eviq.org.au) and research setting.8 However, ‘mutation’ can refer to any permanent change in DNA sequence (irrespective of frequency or disease-causing potential), and ‘mutation’ is used almost exclusively to describe somatic variation in the context of tumourigenesis. Indeed, the interface of the Clinical Interpretation of Variants in Cancer (CIViC) knowledgebase9 describes variants for a specific gene using the term ‘mutation’, with additional qualifications, for example, for TP53 (MIM191170), the qualifiers include: deleterious, DNA binding domain, truncating. To add to the complexity, the Leiden Open Variation Database (LOVD) freeware database software,10 promoted widely for sharing and curation of (germline) disease gene variants, describes the equivalent of variant pathogenicity as ‘variant effect’.
The most current version, LOVD3, prescribes the term ‘affects function’ instead of ‘pathogenic’, and the following terms for the four other pathogenicity classes: ‘probably affects function’, ‘unknown (or effect on function not known)’, ‘probably does not affect function (or probably no functional effect)’ and ‘does not affect function (or no functional effect)’.
Furthermore, feedback from ENIGMA consortium members indicated there was varied perception of the level of risk association and clinical actionability for variants described as ‘benign’ or ‘not pathogenic’, terms put forward by the ACMG/AMP5 and IARC6 classification schemes, respectively, to indicate that a variant is not clinically actionable for patient management. Also, the distinction between a variant described as uncertain (ACMG/AMP and IARC—reviewed and insufficient or conflicting evidence regarding pathogenicity) versus unclassified (not yet assessed)11 was poorly recognised.
In addition, we separately documented terms used to describe output from some more commonly used bioinformatic prediction tools (table 1), since results from bioinformatic analysis are almost always included in clinical test reports. Such bioinformatic predictions are generally defined without reference to clinical information, are often binary and are intended to be included as one of several points of information used to arrive at a final variant classification. Nevertheless, we identified several possibilities for misinterpretation of bioinformatic output terms as a ‘final’ variant classification. The PolyPhen2 tool12 uses the term ‘benign’ to describe variants with no/little predicted effect on protein function, which is the same as the ACMG/AMP term for a variant that is not considered important for diagnosis/risk/patient management. Of greater concern, the term ‘deleterious’ is an output from multiple tools (CONDEL, LRT, Mutation Taster, Provean); this term is also used by the European Medicines Agency (http://www.ema.europa.eu/) and the US Food and Drug Administration (https://www.fda.gov/) to denote eligibility of patients with specified cancer types/presentation for poly ADP ribose polymerase inhibitor therapy, namely patients with a ‘deleterious or suspected deleterious germline (or somatic) BRCA mutation’. Furthermore, the combined term ‘deleterious mutation’ is used (in addition to the term ‘pathogenic mutation’) by the NCCN 2018 guidelines (www.nccn.org) to describe genetic variation used to denote specific management recommendations for patients with familial breast-ovarian cancer. Without clarity about the use of these terms in context, there is significant risk of overinterpretation of bioinformatic data. Cancer genetic germline tests are increasingly being ordered by clinicians relatively unskilled in genetic terminology.
A clear reporting language, with clear definitions of final variant interpretation summarising all the component information used for classification, is thus paramount to avoid variant misinterpretation and inappropriate patient management.
Proposed vocabulary to describe genetic variation in cancer predisposition genes
The terms discussed below primarily focus on describing germline variation in cancer genes, detected by genetic testing for diagnosis of hereditary cancer or estimating future cancer risk. However, the vocabulary inevitably overlaps terms used to describe somatic variation in tumours in the context of drug therapy selection for patients with cancer, or distinguishing true germline variants from variants arising from somatic clonal drift in ‘disease free’ tissue used for DNA extraction.13 14 These suggestions take into consideration terms put forward by the IARC unclassified sequence variants working group,6 ACMG/AMP5 and HVP,15 and a comprehensive review article assessing clinical implications of gene panel test results for breast cancer risk prediction.16 We have not addressed variant annotation in relation to predicting response to drug treatment. We refer readers to the Clinical Pharmacogenetics Implementation Consortium for consensus terms for reporting clinical pharmacogenetic results,17 and note that ClinVar currently supports the following terms describing variant effect relating to therapy: drug response, confers sensitivity. For any given variant, the term wild-type may be used to denote the nucleotide/s or amino acids in the selected reference DNA/protein sequence. However, this term can also be used to describe ‘normal’ phenotype, typically protein function/characteristics measured by in vitro assays.
The term variant should be used to define a DNA change that differs from a defined reference sequence, consistent with recommendations from the ACMG/AMP5 and HVP.15 Various descriptors of a variant depend on the context, as denoted below.
Cellular origin of variant
It is important to specify the tissue from which tested DNA has been derived, irrespective of the use of the descriptors below.
Constitutional or germline (used interchangeably): a sequence variant identified in DNA from a tissue type assumed to represent the DNA content of the fused germ cells (eg, blood), and therefore to be transmittable to offspring. This includes a sequence variant that arises de novo in a gamete and in this setting will be present in all cells of an individual but not inherited from one or other parent.
Mosaic: sequence variant that has arisen during embryogenesis and therefore not present in all the cells/tissues of an individual.
Somatically acquired (not inherited): sequence variant present only in a specific tissue. In the context of tumour DNA (tumour biopsy or circulating tumour DNA derived from blood), the variant will be present in tumour DNA and absent from DNA derived from other tissue/s of the same individual.
Somatically detected: sequence variant detected in a specific tissue type and for which somatic or germline origin has not yet been established by investigating DNA from other tissues. May be used for variation detected by tumour sequencing (tumour-detected), or in the context of suspected mosaicism. Somatically detected variants identified in DNA from blood/saliva with allele proportion <0.3, and/or in individuals with incompatible clinical presentation, are more likely to represent variation due to aberrant clonal expansion in hematopoietic cells (particularly TP53 13 14), or from circulating tumour DNA.
Nucleotide-level evolutionary conservation
Nucleotide sequence changes in coding regions are primarily assessed using protein-level conservation analysis that assesses their effect on protein sequence (see below). However, nucleotide-level conservation analysis may be considered useful for investigating effect of sequence changes on the fitness of splicing regulatory motifs, or mRNA secondary structure and stability, translation efficiency18–20 or to infer functional importance of non-coding sequences (introns, untranslated regions and other extragenic sequence). Indeed, it is a factor denoted for review of synonymous variants (code BP7) in the ACMG/AMP guidelines.5
Nucleotide substitutions analysed by evolutionary/phylogenetic methods involve alignment of at least three nucleic acid sequences, termed multiple (multispecies) sequence alignment (MSA). We suggest that such analysis specify the method/programme used, the number of ortholog sequences included and their phylogenetic relationship to humans. To our knowledge, there are no firm standards proposed for use of nucleotide-level evolutionary conservation in predicting whether a variant may affect fitness of different sequence motifs (splicing, transcription factor binding, etc).
We thus suggest that nucleotide positions in the alignment may be described simply as:
Evolutionarily invariant: at the position of the variant, the MSA is identical across all species considered in the alignment.
Evolutionarily variant: at the position of the variant, the MSA is not identical across all species considered.
Scores provided by specific tools, eg, PhyloP,21 may be helpful to assess whether a specific position is evolutionarily constrained.22 Furthermore, position weight matrices23 developed for functionally important sequence motifs, eg, splice junctions,24 may be useful to gauge the effect of a genetic variant on the fitness of that sequence motif.
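As an illustrative sketch only, the two nucleotide-level descriptors above can be assigned by inspecting a single alignment column. The function names, the toy alignment and the decision to treat an alignment gap as a distinct character are assumptions made for this example and are not part of any published standard.

```python
def alignment_column(alignment, position):
    """Return the aligned characters at a 0-based position, one per species."""
    return [sequence[position] for sequence in alignment.values()]


def describe_nucleotide_position(column):
    """Label an MSA column as evolutionarily invariant or evolutionarily variant."""
    if len(column) < 3:
        # The text describes an MSA as an alignment of at least three sequences.
        raise ValueError("an MSA column needs at least three aligned sequences")
    observed = {base.upper() for base in column}
    # Assumption: a gap character ('-') counts as a difference between species.
    if len(observed) == 1:
        return "evolutionarily invariant"
    return "evolutionarily variant"


# Hypothetical toy alignment (not real sequence data).
toy_alignment = {
    "human": "ACGTAG",
    "mouse": "ACGTAG",
    "chicken": "ACGCAG",
}

for position in (0, 3):
    label = describe_nucleotide_position(alignment_column(toy_alignment, position))
    print(position, label)
```

In keeping with the suggestion above, a real report would also state the alignment method, the number of ortholog sequences and their phylogenetic relationship to humans alongside the descriptor.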
Protein-level evolutionary conservation and bioinformatically predicted physicochemical characteristics of a missense alteration
As noted above (table 1), bioinformatic tools use a range of terms to describe results from analysis of a given predicted missense alteration. Protein-level conservation analysis is required to adequately capture redundancy in codon usage, and additional features considered include relative physicochemical properties of amino acids, and predicted effects on protein secondary, tertiary and quaternary structure. Without prescribing or recommending use of any particular tool/s for variant evaluation, we do recommend use of the following terms to describe output for analysis of missense substitutions (or small in-frame insertions/deletions) using evolutionary/phylogenetic methods. Depth of the analysis for a protein sequence alignment should be specified, including number of ortholog sequences in the protein multiple sequence alignment (PMSA), phylogenetic relationship of the species most evolutionarily distant to humans and the average number of substitutions per position.25
Variants should be described in relation to the level of evolutionary conservation for that amino acid position (residue) in the protein multiple sequence alignment. Insofar as possible, the non-human sequences included in the alignment should be wild-type (the form that occurs most frequently) and of a splice form matching the human reference sequence.
Generic descriptors for an amino acid position (residue) in an alignment
Evolutionarily invariant: amino acid at that position in the PMSA is identical across all species considered.
Evolutionarily conserved: amino acids at that position in the PMSA have similar* physicochemical properties across all species considered.
Not evolutionarily conserved: amino acids at that position in the PMSA show marked differences* in physicochemical properties across the species considered.
*There are alternative methods to assess similarity and differences for substitutions at a given position in an MSA. The method should be defined for the specific analysis conducted. Examples include: Grantham variation (GV) is <60 (conserved) or ≥60 (not conserved); residue harbours an alternate amino acid with Grantham difference (GD) score <60 (conserved) or residue variation exceeds this limit (not conserved).26
Descriptors for an amino acid change relative to the sequence alignment
Outside the range of variation (observed evolutionarily): altered amino acid has markedly different physicochemical properties (defined by size, charge, etc) to the range of variation of those properties observed at its position in the PMSA. Note: this is relatively more likely to happen if the position is invariant or conserved.
Similar to the range of variation (observed evolutionarily): altered amino acid has similar physicochemical properties to the extremes observed for the range of variation of physicochemical properties at that position in the PMSA, for example, GV>0 and GD relatively small, say <30.
Inside the range of variation (observed evolutionarily): altered amino acid has physicochemical properties that clearly fall within the range of variation of those physicochemical properties observed at that position in the PMSA, for example, GV>0 and GD=0.
If the position of an amino acid variant in the PMSA is invariant or conserved, and the change is outside the range of variation, then it is considered evolutionarily unlikely. Conversely, an amino acid substitution that is within or similar to the range of variation observed evolutionarily may be termed evolutionarily tolerated (if the alternative amino acid is already present in the alignment) or otherwise evolutionarily tolerable (if the alternative amino acid is not observed in the alignment, but is similar to the range of variation observed).
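The descriptors above can be combined in a short sketch, assuming that Grantham variation (GV) and Grantham difference (GD) scores have already been computed by an external tool. The GV cut-off of 60 and the GD cut-off of 30 are the example values quoted in the text; the function names, the use of GD of 30 or more as the boundary for ‘outside the range of variation’, and the handling of combinations that the text does not name are assumptions made for illustration.

```python
def describe_position(gv):
    """Generic descriptor for a PMSA position from its Grantham variation (GV)."""
    if gv < 0:
        raise ValueError("GV cannot be negative")
    if gv == 0:
        return "evolutionarily invariant"
    # Example threshold quoted in the text: GV < 60 conserved, GV >= 60 not conserved.
    if gv < 60:
        return "evolutionarily conserved"
    return "not evolutionarily conserved"


def describe_change(gd, similar_cutoff=30.0):
    """Descriptor for an amino acid change from its Grantham difference (GD)."""
    if gd < 0:
        raise ValueError("GD cannot be negative")
    if gd == 0:
        return "inside the range of variation"
    if gd < similar_cutoff:
        return "similar to the range of variation"
    # Assumption: the text gives no numeric definition of 'outside'.
    return "outside the range of variation"


def evolutionary_term(gv, gd, alt_observed_in_alignment):
    """Combine position and change descriptors into a summary term."""
    position = describe_position(gv)
    change = describe_change(gd)
    if change == "outside the range of variation":
        if position != "not evolutionarily conserved":
            return "evolutionarily unlikely"
        return None  # the text defines no term for this combination
    if alt_observed_in_alignment:
        return "evolutionarily tolerated"
    return "evolutionarily tolerable"


print(evolutionary_term(gv=0, gd=100, alt_observed_in_alignment=False))
print(evolutionary_term(gv=40, gd=0, alt_observed_in_alignment=True))
print(evolutionary_term(gv=40, gd=20, alt_observed_in_alignment=False))
```

As the footnote on similarity measures requires, any real analysis should state the method and thresholds used, since these are examples rather than fixed standards.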
As noted above, bioinformatic prediction of variant effect on function should not be used alone to infer association with measurable disease risk. However, variant effect/bioinformatic prediction scores, together with information on variant location in the gene relative to splicing motifs/functional domains, may be calibrated against clinical measures of variant pathogenicity (termed clinical calibration) to provide probability estimates useful to re-assign a variant as likely not pathogenic27–30 (see online supplementary text for more details).
Impact on mRNA transcript profile or protein function
We recommend ‘naturally occurring mRNA transcript’ be used to describe mature mRNA transcript/s seen in controls. Using mRNA transcription in control samples as reference, a variant may exhibit an altered mRNA transcript profile by: (i) impacting overall level of transcript/s (overall expression); (ii) resulting in novel mature mRNA transcript/s and/or (iii) altering the relative contribution of individual transcripts to the overall expression. Control mRNA should be from the same tissue type and analysed using the same methodology.
Variants assessed for effect on transcription via gene regulation may be described as not impacting transcription levels, or impacting transcription levels. Impact on transcription can be further described as partial, or total (also termed transcriptional silencing). Epigenetic silencing specifically refers to impact on transcription via altered methylation profile.
Variants assessed for effect on mRNA transcript profiles via impact on mRNA splicing, including loss, gain or enhanced use of cryptic splicing motifs, may be described as follows:
Non-spliceogenic: the variant does not alter mRNA transcript profile.
Spliceogenic (predicted) LOF: the variant results in an altered mRNA transcript profile that is predicted to cause gene loss-of-function, that is, any combination of mRNA transcripts predicted non-coding, predicted protein truncating nonsense mediated decay (NMD) and/or predicted to encode proteins lacking critical structural/functional motifs.
Spliceogenic (predicted) functional: the variant results in an altered mRNA transcript profile that is predicted to preserve gene functionality, that is, any combination of mRNA transcripts which together will encode protein/s that is/are predicted to preserve functional capacity.
Spliceogenic uncertain function: the variant results in an altered mRNA transcript profile for which the coding/functional consequences are uncertain, that is, combinations of transcripts predicted to cause gene loss-of-function, retain gene function or to encode proteins with uncertain functional potential, for which the combined functional capacity is unclear.
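The four splicing descriptors above amount to a small decision rule, sketched below. This is a deliberate simplification: it assumes each transcript in the altered profile has already been labelled as predicted loss-of-function, predicted functional or uncertain, and it treats any mixture of labels as ‘uncertain function’, whereas the definitions above allow a mixed profile to be judged on its combined functional capacity. The labels and function name are hypothetical.

```python
VALID_PREDICTIONS = {"lof", "functional", "uncertain"}


def describe_splicing(profile_altered, transcript_predictions=()):
    """Map an observed mRNA transcript profile to a splicing descriptor.

    profile_altered: True if the variant changes the transcript profile
        relative to controls from the same tissue and methodology.
    transcript_predictions: one label per altered transcript, each of
        'lof', 'functional' or 'uncertain' (hypothetical labels).
    """
    if not profile_altered:
        return "non-spliceogenic"
    kinds = set(transcript_predictions)
    unknown = kinds - VALID_PREDICTIONS
    if unknown:
        raise ValueError(f"unrecognised prediction labels: {sorted(unknown)}")
    if kinds == {"lof"}:
        return "spliceogenic (predicted) LOF"
    if kinds == {"functional"}:
        return "spliceogenic (predicted) functional"
    # Assumption: mixed, uncertain or missing predictions are reported as uncertain.
    return "spliceogenic uncertain function"


print(describe_splicing(False))
print(describe_splicing(True, ["lof", "lof"]))
print(describe_splicing(True, ["lof", "functional"]))
```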
Variants that have been analysed in functional (biochemical, biophysical, molecular biological) assays that assess variant effect on protein conformation/activity/function should compare effect (always specifying effect measured) to wild-type and other controls as follows:
No functional impact: variant displays features (specified) similar to wild-type.
Functional impact: variant alters features (specified) compared with wild-type. Impact may be described as:
Complete loss of function: variants with loss of function (feature to be specified) below a detection threshold or to the degree of the average pathogenic variant for that gene/protein.
Partial loss of function: variants with partial loss of function (feature to be specified), that is, intermediate between that of the wild-type protein sequence and the average pathogenic variant for that gene/protein. May alternatively be described as intermediate functional effect or hypomorphic.
Gain-of-function: term encompasses increase in a known function for that protein relative to wild-type, or gain of additional novel functions, for example, for p53,31 RET.32 May alternatively be described as neomorphic.
Dominant-negative: variant that encodes an altered protein that interferes with the function of the protein encoded by the wild-type allele. A common example is a variant encoding a protein that retains the ability to form protein-protein complexes, but disrupts the functionality of such complexes.
Note: a variant with measurable effect in vitro on mRNA transcript profile or protein function (specifying feature measured), relative to appropriate controls, should not a priori be assumed to be associated with disease risk. To include functional and mRNA data in gene-specific variant classification protocols, it is necessary that the association between magnitude of effect on mRNA profile/protein function and disease risk is first calibrated against clinical measures of variant pathogenicity, such that the range of variation in effect is established for variants previously classified as pathogenic, and for those considered not pathogenic. See de la Hoya et al and Colombo et al 33 34 for examples of calibration of BRCA1 and BRCA2 transcript levels.
Genetic variation and description of associated disease risk
Cancer risks associated with a genetic variant may be presented in a variety of different ways. Risk associated with a proven cancer-predisposing gene variant (type) can only be correctly interpreted if the time period and population to which the risk applies are defined.35 Most cancer predisposition genes exhibit organ-specific disease expressivity, so it is important to specify disease (phenotype), and mode of inheritance. A given variant may confer different disease risks for heterozygote versus compound heterozygote or homozygote carriers.
Absolute or cumulative risk is the likelihood that a person with a cancer-predisposing variant will develop a given cancer within a period of time, for example, within the next 5 or 10 years, or by a specific age. It is expressed as a percentage.
Relative risk compares the cancer risk for genetic variant carriers relative to the risk for non-carriers or the general population, and can be estimated through several study designs, for example, case-control studies estimate odds ratios, cohort studies estimate rate ratios.
Disease penetrance is typically used to describe the overall probability that carriers of cancer-predisposing variants in a given gene (sometimes specifying a specific variant type) will develop specified cancer type/s by a specified age or over a lifetime. For a fully penetrant genetic variant (or variant type), disease will develop in all individuals with the variant (type). Reduced penetrance may be used to describe a variant that displays lower penetrance compared with risk-associated variants typically identified for that disease gene. The estimated level and type of disease risk/s associated with a reduced penetrant variant determine whether carrier status may be used to inform clinical management.
We suggest that it is helpful to present variant-associated risks to patients as both an absolute measure (eg, 50 in every 100 people with this variant (type) are expected to develop breast or ovarian cancer by age 70 years) and a relative measure (eg, a variant carrier is 10 times more likely to develop breast cancer in their lifetime compared with women in the general population), and report these with appropriate confidence intervals. Based on descriptors applied previously for breast cancer,16 for this discussion document we have categorised cancer risk levels associated with a given variant, relative to the general population risk, as follows: high increased risk, more than fourfold; moderate increased risk, twofold to fourfold; low increased risk, greater than unity and less than twofold. Relative risks are not clinically useful without knowing the absolute risk of a disease: a relative risk of four for a rare disease is still a small risk. A high relative risk is not necessarily a high absolute risk because the latter depends on the baseline population risk. Thus, for cancer types that are uncommon in the population, the absolute risk, and also the availability of interventions, have to be considered when determining the clinical actionability of a variant. Note: the term ‘intermediate’ requires reference values to define its level (for relative or absolute risk), and is thus considered non-specific for the purpose of variant reporting.
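The relative risk categories and the dependence of absolute risk on baseline risk can be illustrated with a minimal sketch. The category boundaries are those given above; the function names, the label returned for a relative risk of one or less, the baseline risks in the example and the crude approximation of carrier absolute risk as baseline risk multiplied by relative risk (which ignores age structure and competing risks) are assumptions made for illustration.

```python
def risk_category(relative_risk):
    """Category of increased risk relative to the general population."""
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    if relative_risk > 4:
        return "high increased risk"
    if relative_risk >= 2:
        return "moderate increased risk"
    if relative_risk > 1:
        return "low increased risk"
    # Assumption: the text defines no category at or below unity.
    return "no increased risk"


def approximate_absolute_risk(baseline_risk, relative_risk):
    """Crude carrier absolute risk over the same period as the baseline risk."""
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline risk must be a probability")
    # Assumption: simple multiplication, capped at 1; real estimates need
    # age-specific incidence and competing-risk adjustment.
    return min(1.0, baseline_risk * relative_risk)


# Hypothetical baseline risks: a rare cancer (0.1%) and a common cancer (10%).
for baseline in (0.001, 0.10):
    absolute = approximate_absolute_risk(baseline, relative_risk=4)
    print(f"baseline {baseline:.1%}, relative risk 4 -> absolute risk {absolute:.1%}")
```

With the same fourfold relative risk, the hypothetical rare cancer yields a carrier risk of well under 1%, whereas the common cancer yields a risk of roughly 40%, which is why both measures should be reported together.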
The term risk allele may be used as an alternative to describe a variant identified as cancer-associated, generally using case-control analysis such as genome-wide association studies, where there is not necessarily a mechanistic relationship between a ‘lead’ variant in a linkage disequilibrium block and disease predisposition.
Proposed vocabulary to describe clinical relevance of genetic variation in known or suspected cancer predisposition genes using a five-tier system
The IARC five-tier variant classification system was developed to promote use of probability-based methods for variant classification of highly penetrant cancer susceptibility genes that could then be specifically linked to recommended clinical management protocols.6 This system has been adopted by the InSiGHT group for mismatch repair (MMR) gene variant classification,36 and by ENIGMA for BRCA1/2 variant classification (https://enigmaconsortium.org). It is used for ClinGen-approved expert panel curation of variants in these genes, displayed in ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) and selected public locus-specific databases (https://www.insight-group.org/variants/; http://brcaexchange.org/). The IARC tier terminology and management recommendations as published in 2008 are broadly consistent with those recommended by ACMG/AMP (table 2). However, assigning terms for the variant tiers across different public portals has highlighted differences in the wording used to describe the IARC class 2 and class 1 tiers, and potential for misinterpreting the clinical relevance of individual variants based on current IARC or ACMG/AMP terms. Indeed, misinterpretation of the class 1 tier has been raised in relation to the BRCA2 c.9976A>T p.Lys3326Ter variant associated with <1.5-fold increased risk of breast or ovarian cancer,37 both publicly,38 39 and by direct query to the BRCA-Exchange website (http://brcaexchange.org/). The latter led to a change in representation of this tier from ‘benign’ to ‘benign-little clinical significance’ on the BRCA-Exchange website.
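For readers unfamiliar with probability-based tiers, the sketch below maps a posterior probability of pathogenicity to an IARC class. The probability boundaries and class wording are those of the 2008 IARC recommendations6 as commonly cited; they are not restated in this text and should be checked against table 2. The function name and the treatment of values falling exactly on a boundary are assumptions.

```python
def iarc_class(probability_pathogenic):
    """Return (class number, descriptor) for a posterior probability of pathogenicity."""
    p = probability_pathogenic
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if p > 0.99:
        return 5, "definitely pathogenic"
    if p >= 0.95:
        return 4, "likely pathogenic"
    if p >= 0.05:
        return 3, "uncertain"
    if p >= 0.001:
        return 2, "likely not pathogenic or of little clinical significance"
    return 1, "not pathogenic or of no clinical significance"


for p in (0.995, 0.97, 0.50, 0.01, 0.0001):
    number, descriptor = iarc_class(p)
    print(f"{p}: class {number} ({descriptor})")
```

The sketch shows only how a probability maps to a tier; it says nothing about the level of cancer risk conferred, which is the source of the misinterpretation discussed above.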
Furthermore, during development of the ENIGMA BRCA1/2 variant classification criteria (https://enigmaconsortium.org), research results emphasised the need for clear statements about appropriate class assignment for variants with proven association with so-called ‘intermediate’ or ‘moderate’ increased risk of cancer. Specifically, discovery that the BRCA1 c.5096G>A p.Arg1699Gln variant demonstrates reduced disease penetrance relative to ‘high-risk’ truncating BRCA1 variants raised the issue of how to denote such reduced penetrance variants in the five-tier system, in particular if the disease penetrance was sufficient to trigger altered management, although not as extensive as the ‘standard pathogenic’ variant for that gene.3 The advent of multigene panel testing that encompasses so-called ‘moderate-risk genes’ has further highlighted the complexities of trying to develop and implement simple terms to describe the disease risk and clinical relevance of variants where risk by variant type can differ between and within genes. Indeed, circulation and discussion of the ENIGMA terminology highlighted ‘pathogenic’ as the term for which the definition was most contentious.
In an attempt to address all the above issues, we considered usability of terms in research publications, inconsistencies in wording for IARC classes 1 and 2,6 and alignment with terminology recommended by the ACMG/AMP guidelines.5 We also considered relevant definitions from several English dictionaries, and the derivation of the word (see online supplementary text), this being an important component of translating the meaning of terms by collaborators for whom English is not a first language.
During the ENIGMA meetings held in January 2017, September 2017 and June 2018, the ENIGMA membership was presented with various options for describing or rewording terms, with more detailed descriptions of each of the five tiers intended to capture the complexity of reporting in the multigene panel testing era. Discussions arising from these presentations, and additional commentary on documentation circulated to members, have resulted in the recommendations and summary descriptions shown in table 3. We anticipate that this more detailed description of the clinical implications of, and management recommendations associated with, germline variants placed in each of the classification tiers will provide a short-term solution to improve understanding of these terms in the context of clinical reporting of cancer predisposition variants using a five-tier classification system. Adaptation for other Mendelian or co-dominant disease genes is possible, subject to clear definition of the level of disease risk associated with clinical actionability, and other factors to be considered when establishing absolute risk at the individual level.
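As an illustration of the probability-based tiering discussed in this section, the sketch below maps a posterior probability of pathogenicity to one of the five IARC classes. The numeric cut-offs and tier descriptions are not stated in this section; they are taken, as an assumption, from the 2008 IARC publication cited above and should be checked against that source. The names `Tier` and `classify` are hypothetical.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    iarc_class: int
    description: str


def classify(posterior: float) -> Tier:
    """Map a posterior probability of pathogenicity to an IARC class.

    Cut-offs are assumed from the 2008 IARC five-tier publication
    (class 5: >0.99; class 4: 0.95-0.99; class 3: 0.05-0.949;
    class 2: 0.001-0.049; class 1: <0.001).
    """
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must lie between 0 and 1")
    if posterior > 0.99:
        return Tier(5, "Definitely pathogenic")
    if posterior >= 0.95:
        return Tier(4, "Likely pathogenic")
    if posterior >= 0.05:
        return Tier(3, "Uncertain")
    if posterior >= 0.001:
        return Tier(2, "Likely not pathogenic or of little clinical significance")
    return Tier(1, "Not pathogenic or of no clinical significance")


if __name__ == "__main__":
    # Illustrative probabilities only, not values for any real variant.
    for p in (0.995, 0.97, 0.50, 0.01, 0.0005):
        tier = classify(p)
        print(f"{p:.4f} -> class {tier.iarc_class}: {tier.description}")
```

A mapping of this kind assigns a tier only; it does not convey the magnitude of risk, which is why the expanded descriptions in table 3 were considered necessary.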
Proposal for development of a multitier system for variant annotation in clinical test reporting of multigene panel results
Despite the expansion of descriptions for the five-tier variant classification system shown in table 3, it was clear from comments received that assignment of variant pathogenicity using the current five-tier system is inadequate to deal with the complexities of reporting multigene panel testing outcomes, and to portray differences in variant-specific risks for a given gene. The term ‘pathogenic’ remained contentious, with comments raised by ENIGMA members including: the need to capture the relevance of genetic findings to patient disease diagnosis (phenotype) versus relevance of a secondary finding (ie, outside of the patient diagnosis); reporting variant effect for recessive as well as dominant disease; and whether a variant could be termed ‘pathogenic’ on the background of a polygenic risk score that reduced individual risk to the population level. These observations indicate a need for a more consistent approach to variant reporting for clinical use, to minimise ambiguity of clinical management considerations. We thus developed a template to emphasise the value of a multitier reporting system (outlined in table 4), and provide several worked examples (online supplementary table 2) to indicate its potential to capture the complexity of clinical actionability for variants identified by multigene cancer panel testing. The intention is that clinical inferences should be added to specific variant interpretation/classification, requiring the report to capture the level of (un)certainty around risk estimates and the contribution of an individual reported variant to a composite risk score. This could then be linked to clinical discussions about potential interventions, with particular value for multigene panel reporting.
Conclusions and future directions
Our international consortium experience has highlighted that many terms used to describe genetic variants have multiple meanings, so that terms may be used interchangeably with the potential for false inferences in different contexts. Variant descriptor output from bioinformatics tools has potential to lead to patient mismanagement if directly transferred into clinical reports without clear explanation. Furthermore, there is considerable debate regarding use of terms to describe risk association and relevance to clinical management, with particular contention around the term ‘pathogenic’ and its relationship with patient medical management. We summarise the key points and provide recommendations on variant annotation and terminology in box 1. We also propose a framework for describing variants using a vocabulary that may be incorporated into clinical laboratory reporting. If adopted, this approach should lead to more consistent variant interpretation at the laboratory level (ACMG/IARC), and importantly, allow clinical reports to clearly capture the relevance of a variant (or combination of variants) for the intended healthcare application. We recognise that practical implementation of such a system would require routine input from appropriately trained clinicians before a test report is issued for discussion with the patient. By no means intended as a final product, we present this for discussion and further development with the broader clinical community worldwide.
Key recommendations regarding variant descriptors and their use in variant classification and clinical reporting*
The term variant should be used to define a DNA change that differs from a defined reference sequence.
It is important to always specify the tissue from which tested DNA has been derived.
Bioinformatic prediction of variant effect on function should not be used alone to infer association with measurable disease risk.
Bioinformatic prediction scores, together with information on variant location in the gene relative to splicing motifs/functional domains, may be calibrated against clinical measures of variant pathogenicity (termed clinical calibration) to provide probability estimates useful to re-assign a variant as likely not pathogenic.
The term spliceogenic is used generically to describe a variant that results in altered mRNA transcript profile (relative to a reference), without consideration of transcript/s susceptibility to NMD, ability to encode functional protein or association with disease risk.
Variants analysed in functional assays (biochemical, biophysical, molecular biological, cellular) that assess variant effect on protein conformation/stability/activity/function should describe effect compared with wild-type and other controls, and always specify the protein effect measured.
A variant with measurable effect in vitro on mRNA transcript profile or protein function (specifying feature measured), relative to appropriate controls, should not a priori be assumed to be associated with quantifiable disease risk.
It is critical to specify disease/phenotype and mode of inheritance when providing a pathogenicity assertion for a genetic variant.
Present variant-associated risks as both an absolute measure and a relative measure, and report these with appropriate CIs.
*Variant annotation is a broad term used in the context of next-generation sequencing bioinformatic pipelines to describe the process of assigning a variety of descriptors to a given sequence variant, but these annotations are largely distinct from the clinically focused variant terminology denoted above (see online supplementary text for an overview of variant annotation).
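The recommendation in box 1 to present both absolute and relative risks with CIs can be illustrated with a minimal sketch. It assumes simple cohort-style counts (affected carriers out of all carriers, affected non-carriers out of all non-carriers), a Wilson score interval for the absolute risk and a log-scale (Katz) interval for the relative risk. These method choices, the function names and the counts are assumptions made for illustration only; penetrance estimates for cancer susceptibility genes are usually derived from more elaborate study designs and analyses.

```python
import math


def absolute_risk_ci(events: int, total: int, z: float = 1.96):
    """Proportion affected with a Wilson score interval (95% when z=1.96)."""
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half


def relative_risk_ci(a: int, n1: int, c: int, n2: int, z: float = 1.96):
    """Relative risk of carriers (a/n1) versus non-carriers (c/n2).

    The interval is computed on the log scale (Katz method);
    a and c must be greater than zero.
    """
    rr = (a / n1) / (c / n2)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not data from any study.
    risk, r_lo, r_hi = absolute_risk_ci(30, 100)
    rr, rr_lo, rr_hi = relative_risk_ci(30, 100, 10, 100)
    print(f"Absolute risk in carriers: {risk:.2f} (95% CI {r_lo:.2f} to {r_hi:.2f})")
    print(f"Relative risk: {rr:.2f} (95% CI {rr_lo:.2f} to {rr_hi:.2f})")
```

Reporting the two measures side by side makes clear that a given relative risk can correspond to very different absolute risks depending on the baseline risk in the comparison group.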
The authors would like to thank ENIGMA collaborators for helpful feedback provided at ENIGMA general meetings, and other verbal discussion.
Contributors ABS, DME conceived and implemented the study as presented in this final form. ABS, SG-H, DME provided initial versions of tables and worked examples for review by remaining authors. ABS, SG-H, ACA, MB, LB, MdlH, SD, TD, HVF, ANM, ARM, MTP, PR, MR, MT, ET, CT, MV, LCW, ST, DME all provided content relevant to their expertise, and the text was circulated over multiple iterations to reach consensus. ABS and DME collated text and tables to form the first draft of the manuscript, and all authors approved the final manuscript.
Funding ABS is supported by an Australian National Health and Medical Research Council (NHMRC) Senior Research Fellowship (ID1061779). SG-H is supported by a research fellowship from the Health Education England Genomics Education Programme (HEE GEP). ACA is supported by Cancer Research UK (C12292/A20861). MB and LB have been supported by the Australian NHMRC (ID1104808) and the Cancer Council Queensland (ID1044008, ID1026095). SD is supported by the Breast Cancer Research Foundation and Komen. MdlH is supported by funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 634935 (BRIDGES), and by Spanish Instituto de Salud Carlos III (ISCIII) funding (grant PI15/00059), an initiative of the Spanish Ministry of Economy and Innovation partially supported by European Regional Development FEDER Funds. ANM was supported by NIH/NCI grant CA116167. MTP was supported in part by NHMRC (ID1101400). MR was funded in part through an NIH/NCI Cancer Center Support Grant P30 CA008748. PR was partially supported by the Italian Association for Cancer Research (AIRC; IG 15547). ET was supported by the Australian NHMRC (ID1104808). MPGV is supported by the Dutch Cancer Society KWF (UL2012-5649) and KWF-Pink Ribbon Research Project 11704. LCW is supported by a Rutherford Discovery Fellowship (Royal Society of New Zealand).
Competing interests MR discloses the following support: Honoraria (Advisory) from AstraZeneca, Pfizer; Consulting or Advisory from McKesson, AstraZeneca; Research Funding from AstraZeneca (Institution), Myriad (Institution, in-kind), Invitae (Institution, in-kind), Pfizer (institution), AbbVie (institution), Tesaro (institution), Medivation (Institution); Travel, Accommodation, Expenses from AstraZeneca. SD discloses the following support: Honoraria (Advisory) from AstraZeneca, Clovis and Bristol-Myers Squibb.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement This is not an original research article.
Patient consent for publication Not required.