Background PALB2 monoallelic loss-of-function germ-line variants confer a breast cancer risk comparable to the average BRCA2 pathogenic variant. Recommendations for risk reduction strategies in carriers are similar. Elaborating robust criteria to identify loss-of-function variants in PALB2—without incurring overprediction—is thus of paramount clinical relevance. Towards this aim, we have performed a comprehensive characterisation of alternative splicing in PALB2, analysing its relevance for the classification of truncating and splice site variants according to the 2015 American College of Medical Genetics and Genomics-Association for Molecular Pathology guidelines.
Methods Alternative splicing was characterised in RNAs extracted from blood, breast and fimbriae/ovary-related human specimens (n=112). RNAseq, RT-PCR/CE and CloneSeq experiments were performed by five contributing laboratories. Centralised revision/curation was performed to assure high-quality annotations. Additional splicing analyses were performed in PALB2 c.212-1G>A, c.1684+1G>A, c.2748+2T>G, c.3113+5G>A, c.3350+1G>A, c.3350+4A>C and c.3350+5G>A carriers. The impact of the findings on PVS1 status was evaluated for truncating and splice site variants.
Results We identified 88 naturally occurring alternative splicing events (81 newly described), including 4 in-frame events predicted relevant to evaluate PVS1 status of splice site variants. We did not identify tissue-specific alternate gene transcripts in breast or ovarian-related samples, supporting the clinical relevance of blood-based splicing studies.
Conclusions PVS1 is not necessarily warranted for splice site variants targeting four PALB2 acceptor sites (exons 2, 5, 7 and 10). As a result, rare variants at these splice sites cannot be assumed pathogenic/likely pathogenic without further evidence. Our study raises a caution flag for up to five PALB2 genetic variants that are currently reported as pathogenic/likely pathogenic in ClinVar.
- variant classification
- acmg-amp guidelines
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Monoallelic loss-of-function (LoF) germ-line variants in PALB2 predispose to breast cancer, with estimated absolute risks by age 70 ranging from 33% to 58%, depending on the family history.1 2 Excess risk for other cancers, such as pancreas, prostate, ovarian and male breast cancer, is still under investigation. Currently, gene panel testing for breast cancer predisposition includes PALB2,2 and LoF germ-line variants in this gene are considered actionable findings in many settings, with proposed actions ranging from increased surveillance to prophylactic surgery.3–5 Accordingly, classifying PALB2 LoF variants is of paramount clinical relevance. Yet, the task is not trivial, as shown by the large number of variants of uncertain significance still existing in genes that have been extensively studied, such as BRCA1 or BRCA2.6
In the research setting, truncating (nonsense or frameshift) variants predicted to induce nonsense-mediated decay (PTC-NMD variants) and canonical ±1,2 splice site variants (hereafter named splice site variants) at cancer predisposition genes are often assumed to be pathogenic/likely pathogenic LoF variants.1 2 However, in the clinical setting a more conservative approach is recommended. According to the American College of Medical Genetics and Genomics-Association for Molecular Pathology (ACMG-AMP) interpretation guidelines,7 a PTC-NMD or splice site variant constitutes very strong evidence of pathogenicity (PVS1), but this is not sufficient to classify the variant as pathogenic/likely pathogenic. Additional combinations of strong (PS), moderate (PM) and/or supporting (PP) evidence of pathogenicity are required. Furthermore, PVS1 is not warranted for every PTC-NMD/splice site variant. Indeed, the ACMG-AMP-2015 guidelines specify several caveats, including the possibility of: (i) rescue transcripts (alternate gene transcripts that skip the truncating variant, encoding functional or partially functional proteins and resulting in reduced or no haploinsufficiency), (ii) splice site variants producing transcripts with in-frame deletions/insertions retaining some or all functional capacity and (iii) tissue-specific alternate gene transcripts.7 Therefore, the accurate interpretation of PALB2 PTC-NMD and splice site variants according to the ACMG-AMP-2015 guidelines requires reliable information on both protein structure/function and alternative splicing.
To be more precise, PALB2 PTC-NMD/splice site variants without direct risk estimates and/or functional data (a common scenario in genetic testing) should be classified as likely pathogenic only if PVS1 is warranted. For PTC-NMD variants, PVS1 is warranted if no rescue transcripts are predicted. For splice site variants the analysis is more complex. In addition to rescue transcripts, the possibility of the variant allele producing transcripts with in-frame alterations retaining coding potential should be considered, although predicting the precise nature of the transcripts produced by a splice site variant is challenging.
In recent years, the Evidence-based Network for the Interpretation of Germ-line Mutant Alleles (ENIGMA consortium) has conducted a comprehensive characterisation of naturally occurring alternate gene transcripts in BRCA1 and BRCA2,8 9 exploring the impact of the findings for the clinical classification of genetic variants at the two loci. Major achievements were the identification of a subset of splice site variants for which PVS1 was not necessarily warranted, the subsequent demonstration that at least one allele containing a splice site variant, BRCA1 c.[594-2A>C; 641A>G], does not increase breast cancer risk and the observation that splicing assays may lead to erroneous clinical conclusions if alternate gene transcripts are not properly addressed.8–11 Recommendations based on these studies are documented in the ENIGMA BRCA1/2 Gene Variant Classification Criteria (https://enigmaconsortium.org) that support BRCA1 and BRCA2 expert panel review interpretation at ClinVar.
A recent study has identified alternate gene transcripts at the PALB2 locus, but no inferences in relation to the clinical interpretation of genetic variants were made.12 Here, we undertake a comprehensive characterisation of PALB2 alternative splicing, exploring the possible relevance of the findings for the clinical classification of PTC-NMD and splice site variants according to the ACMG-AMP-2015 guidelines.
Identification of alternative splicing events
To characterise alternative splicing at the PALB2 locus, we analysed RNAs isolated from 112 specimens, including lymphoblastoid cell lines not treated with the NMD-inhibitor puromycin (LCLs, n=68), matched replicates treated with puromycin (LCLs+Puro, n=1), stimulated leucocyte cultures not treated with puromycin (sLEU, n=6), matched replicates treated with puromycin (sLEU+Puro, n=3), RNA-stabilised peripheral blood samples (PAXgene, QIAGEN, n=7; Tempus, Thermo Fisher, n=10), non-malignant breast tissue samples from unrelated women (Breast, n=12; 10 from women with a diagnosis of breast cancer, of whom 9 are included in SCAN-B, ClinicalTrials.gov identifier: NCT02306096; 2 from women without a diagnosis of breast cancer included in the CASOHAR trial, NCT02560818), a human mammary epithelial cell sample (HMEC, n=1, 2 technical replicates included in the analysis), commercially available RNA from non-malignant breast tissue (Clontech 636576, n=1), normal ovarian fimbriae tissue samples from prophylactic oophorectomies performed in postmenopausal women without cancer (Fimbriae, n=2) and one pool of 3 non-malignant ovarian tissues (Clontech 636555, n=1).
Experiments were performed independently in five ENIGMA laboratories (figure 1). Most samples were analysed by targeted RNAseq (n=72) in laboratory 1 (online supplementary tables 1 and 2). Other samples were analysed by whole transcriptome RNAseq (n=13) in laboratories 2 and 3 (online supplementary tables 1 and 2), by capillary electrophoresis of RT-PCR products (RT-PCR/CE, n=22) in laboratory 4 (online supplementary tables 1, 2 and 3 and figures 1A, B), and by whole-gene CloneSeq splicing analysis (n=5) in laboratory 5 (online supplementary figure 1B). We later performed a centralised revision/curation of the data, including the search for putative tissue-specific alternate gene transcripts. To this end, we pooled together all data produced in LCLs±Puro, sLEU±Puro, PAXgene and Tempus samples (hereafter referred to collectively as BLOOD), all data produced in non-malignant breast tissues, HMEC and Clontech 636576 (hereafter referred to as BREAST) and all data produced in non-malignant ovarian fimbriae and Clontech 636555 (hereafter referred to as OVARY). The overall workflow is summarised in figure 1 (see online supplementary section 1 for further details).
Annotation of alternative splicing events
We described all alternative splicing events according to HGVS guidelines, using as a reference the Ensembl transcript ENST00000261584.8 (NCBI RefSeq NM_024675.3). For the sake of simplicity, we also identified most events with a code that combines the following symbols: ∆ (skipping of reference exonic sequences), ▼ (inclusion of reference intronic sequences), E (exon), I (intron), p (acceptor shift), q (donor shift), AFE (alternative first exon) and IVS± (located at intervening sequence). When necessary, the exact number of nucleotides skipped (or retained) is indicated. Events were also annotated according to the confidence of the finding (high-confidence vs lower-confidence), predictions on coding potential (LoF vs uncertain) and relative quantification (expression level relative to the corresponding reference transcript) (see online supplementary material section 2 and figures 2-5 for further details).
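The simplified event codes can be read mechanically. The following is a minimal sketch, not the annotation pipeline used in the study: it assumes only the subset of codes of the form ∆(E7p10), ▼(E7p42), ∆(E12) or ∆(E11_E12), and the names `SplicingEvent` and `parse_event` are hypothetical. Codes outside that subset (for example IVS- or AFE-based codes) are returned as `None` rather than guessed at.

```python
import re
from typing import NamedTuple, Optional


class SplicingEvent(NamedTuple):
    kind: str                   # "skip" for the delta symbol, "inclusion" for the triangle
    first_exon: int
    last_exon: int              # same as first_exon unless the code is a range (E11_E12)
    shift: Optional[str]        # "acceptor" (p), "donor" (q) or None for whole exons
    nucleotides: Optional[int]  # number of nucleotides skipped or retained, if given


# Accept both the increment sign (U+2206) and Greek capital delta (U+0394),
# since both renderings occur in the text, plus the triangle (U+25BC).
_PATTERN = re.compile(
    r"^(?P<sym>[\u2206\u0394\u25BC])\("
    r"E(?P<first>\d+)"
    r"(?:_E(?P<last>\d+)|(?P<shift>[pq])(?P<nt>\d+))?"
    r"\)$"
)


def parse_event(code: str) -> Optional[SplicingEvent]:
    """Parse a simplified exon-based event code; return None if unsupported."""
    match = _PATTERN.match(code.strip())
    if match is None:
        return None
    kind = "inclusion" if match.group("sym") == "\u25BC" else "skip"
    first = int(match.group("first"))
    last = int(match.group("last")) if match.group("last") else first
    shift = {"p": "acceptor", "q": "donor", None: None}[match.group("shift")]
    nt = int(match.group("nt")) if match.group("nt") else None
    return SplicingEvent(kind, first, last, shift, nt)
```

For instance, `parse_event("∆(E7p10)")` describes a 10-nucleotide skip through an acceptor shift in exon 7, while `parse_event("IVS1-463▼(134)")` returns `None` because intronic codes are deliberately left out of this sketch.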
Analysis of PVS1 status (warranted vs not warranted) for every possible PTC-NMD and splice site variant at the PALB2 locus
To decide if PVS1 is warranted we used predictions based on: (i) the identification of alternate gene transcripts in control samples, (ii) RNA splicing assays performed previously in carriers of PALB2 splice site variants (online supplementary table 4) and (iii) novel RNA splicing assays (online supplementary table 4, figures 6A, B and C). In brief, we consider PVS1 warranted for PTC-NMD variants only if no plausible rescue transcripts have been detected. Similarly, we consider PVS1 warranted for splice site variants only if all predicted RNA products are bona fide LoF transcripts. To predict possible RNA products, we used splicing assays performed in carriers of splice site variants (assuming that other PALB2 splice site variants targeting the same splicing site will produce similar transcripts). If no splicing assay was available for a particular splice site, we based predictions on alternate gene transcripts, as previously done for BRCA1 and BRCA2.9 10 Further details are shown in online supplementary material section 3 and table 4.
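The two decision rules stated above can be written down as a short sketch. This is an illustration under stated assumptions, not the curation procedure itself: the function names and the "LoF"/"uncertain" labels as dictionary values are hypothetical, and the choice to raise an error when no RNA product has been predicted is our own, since the text does not cover that case. The example data are the exon 7 acceptor site predictions given in the Discussion.

```python
from typing import Iterable, Mapping


def pvs1_warranted_ptc_nmd(plausible_rescue_transcripts: Iterable[str]) -> bool:
    """PVS1 is warranted for a PTC-NMD variant only if no plausible
    rescue transcript has been detected."""
    return len(list(plausible_rescue_transcripts)) == 0


def pvs1_warranted_splice_site(predicted_products: Mapping[str, str]) -> bool:
    """PVS1 is warranted for a splice site variant only if every predicted
    RNA product is a bona fide LoF transcript.

    predicted_products maps an event code to "LoF" or "uncertain".
    """
    if not predicted_products:
        # Assumption of this sketch: no prediction means no decision.
        raise ValueError("no predicted RNA products for this splice site")
    return all(label == "LoF" for label in predicted_products.values())


# Predictions for the exon 7 acceptor site as listed in the Discussion:
# one product of uncertain coding potential and five LoF products.
EXON7_ACCEPTOR_PRODUCTS = {
    "▼(E7p42)": "uncertain",
    "▼(E7p20)": "LoF",
    "Δ(E7p2)": "LoF",
    "Δ(E7p10)": "LoF",
    "∆(E7p25)": "LoF",
    "Δ(E7)": "LoF",
}

print(pvs1_warranted_splice_site(EXON7_ACCEPTOR_PRODUCTS))
```

A single product of uncertain coding potential is enough to withhold PVS1, which is why the exon 7 acceptor site evaluates to not warranted even though five of the six predicted products are LoF.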
We used RNA extracted from different human biological samples (blood-derived, breast and ovary; see 'Methods' section) to characterise naturally occurring alternative splicing at the PALB2 locus. This study combined targeted RNAseq, whole transcriptome RNAseq, RT-PCR/CE and whole-gene CloneSeq splicing analysis data that were independently produced at five contributing centres (figure 1). The analysis identified 44 naturally occurring alternative splicing events with high confidence (online supplementary table 1) and provided evidence for the existence of up to 44 additional lower-confidence events (online supplementary table 2 and supplementary material section 2.2). Most events (37 out of 44 high-confidence and all lower-confidence events) have not, to our knowledge, been described previously in GENCODE (https://www.gencodegenes.org/) or the scientific literature.
Up to 15 high-confidence events preserved a bona fide open reading frame (ie, an ORF spanning from the reference start codon to the reference termination codon, table 1, protein column). Of these, nine were predicted to code for non-functional proteins, and the remaining six for proteins of uncertain functionality (table 1, coding potential column). Twenty-nine high-confidence events did not preserve a bona fide ORF. All of them were predicted to code for non-functional proteins (table 2).
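As a rough illustration of the frame criterion, an event that removes or adds a number of nucleotides that is not a multiple of three cannot keep the reference reading frame. The sketch below checks only that necessary condition; it is not the ORF analysis used for tables 1 and 2, the function name is hypothetical, and an in-frame event can still lose the ORF if, for example, retained intronic sequence contains a stop codon.

```python
def keeps_reading_frame(net_nucleotide_change: int) -> bool:
    """Return True if the net length change is a multiple of three.

    Use a negative number for skipped nucleotides and a positive number
    for retained (included) nucleotides. This is a necessary condition
    for preserving the reference ORF, not a sufficient one.
    """
    return net_nucleotide_change % 3 == 0


# Event sizes taken from the event codes quoted in the text.
print(keeps_reading_frame(-6))   # ∆(E2p6): 6 nucleotides skipped
print(keeps_reading_frame(+42))  # ▼(E7p42): 42 nucleotides retained
print(keeps_reading_frame(-10))  # ∆(E7p10): 10 nucleotides skipped
```

The first two events are in-frame and the third is not, which agrees with ∆(E7p10) being listed among the bona fide LoF transcripts in the Discussion.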
Targeted RNAseq data (online supplemental table 1, laboratory 1) indicated that most high-confidence events make on average (n=72 samples) a minor contribution to the expression level (ie, reads supporting the splicing event representing ≤1% of the reads supporting the corresponding reference transcript). The only exceptions were ∆(E1q17), IVS1-463▼(134), ∆(E7p10), ∆(E11), ∆(E11_E12) and ∆(E12), with contributions of ≈2%, ≈5%, ≈1.4%, ≈2%, ≈2% and ≈13%, respectively. In silico analysis suggests that events contributing >1% might be related to the presence of suboptimal splice sites at the PALB2 gene (online supplemental figure 7), with ∆(E12) contribution (≈13%) probably explained by the intrinsically weak exon 12 GC donor site.13 The relatively elevated level of alternative splicing resulting in skipping of exons 11 and/or 12 is supported by targeted and whole transcriptome RNAseq (online supplemental table 1), semi-quantitative RT-PCR/CE analysis (online supplemental figure 1A), whole-gene CloneSeq splicing analysis (online supplemental figure 1B) and quantitative dPCR (online supplemental figure 5B). According to the latter, ≈8%–34% of the PALB2 transcripts (depending on the sample analysed) may skip exon 11, exon 12 or both.
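The relative quantification used here is a simple ratio of supporting reads. A minimal sketch follows, assuming one read count for the event and one for the corresponding reference junction; the function names and the read counts in the example are hypothetical, and the 1% cut-off is the one quoted in the text.

```python
def relative_contribution(event_reads: int, reference_reads: int) -> float:
    """Reads supporting the event as a percentage of the reads supporting
    the corresponding reference transcript."""
    if reference_reads <= 0:
        raise ValueError("reference_reads must be positive")
    return 100.0 * event_reads / reference_reads


def is_minor_event(event_reads: int, reference_reads: int,
                   threshold_pct: float = 1.0) -> bool:
    """True if the event contributes at most threshold_pct of the
    reference signal (1% by default, as in the text)."""
    return relative_contribution(event_reads, reference_reads) <= threshold_pct


# Hypothetical counts chosen to mimic the ≈13% reported for ∆(E12).
print(relative_contribution(130, 1000))
print(is_minor_event(130, 1000))
```

Note that the denominator is the reference signal rather than the total signal, so percentages for several events at the same junction need not sum to 100.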
Overall coverage in whole transcriptome RNAseq was substantially lower than in targeted RNAseq experiments (figure 1). As a result, several events representing ≤1% of the targeted RNAseq reads were not detected by this approach. Only one major discrepancy was observed, related to PALB2 Δ(E4_E5), which represented ≤1% of the corresponding reference signal in targeted RNAseq and whole-gene CloneSeq experiments, but >5% in RNAseq data generated by laboratory 3. However, subsequent digital PCR quantification in BLOOD, BREAST and OVARY confirmed that Δ(E4_E5) represents, on average, ≤1% of the corresponding reference signal (online supplemental figure 5).
Despite the lower coverage, whole transcriptome RNAseq and/or RT-PCR/CE experiments allowed us to detect 50 splicing events in BREAST and 29 in OVARY. Of these, 24 splicing events, among them ∆(E1q17), IVS1-463▼(134), ∆(E7p10), ∆(E11), ∆(E11_E12) and ∆(E12), were detected in both tissues (table 1 and online supplemental table 1). Equally relevant, we did not identify tissue-specific PALB2 alternate gene transcripts in either BREAST or OVARY, suggesting that if they exist, they are expressed at very low levels. This finding supports the clinical relevance of BLOOD-based PALB2 splicing studies.
Finally, we used data on alternate gene transcripts to analyse if PVS1 is warranted for all possible PTC-NMD/splice site variants at the PALB2 gene. In brief, we conclude that PVS1 is warranted for every possible PTC-NMD variant, regardless of its location, because we have not identified any plausible rescue transcript (see 'Discussion' section). By contrast, we conclude that PVS1 is not necessarily warranted for every possible splice site variant. To be more precise, we propose that PVS1 may not be warranted for splice site variants located at the acceptor sites of exons 2, 5, 7 and 10. For this subset of splice site variants, the production of RNA transcripts retaining some or all functional capacity is plausible (see table 3 for further details). If splicing assays and/or clinical data supporting pathogenicity are lacking, we recommend caution when classifying splice site variants at these specific sites, that is, such variants should not be assumed pathogenic/likely pathogenic.
Alternative splicing probably occurs in all metazoan organisms, and increasing prevalence has been linked to phenotypic complexity.14 Virtually all human multiexon loci produce alternate gene transcripts.15 Apart from a presumed role in expanding protein diversity16 that is currently under dispute,17 18 some authors have suggested that alternative splicing may buffer mutational consequences.19 The latter possibility has obvious implications for the clinical interpretation of genetic testing results. The ACMG-AMP-2015 guidelines acknowledge this by recommending caution about overinterpreting the impact of PTC-NMD and splice site variants if multiple transcripts are present.7 Here, we have addressed this relevant aspect of alternative splicing for the particular case of classifying genetic variants at the breast cancer predisposition gene PALB2.
Alternative splicing analysis might be influenced by many factors, including collection of RNA samples, experimental design and detection sensitivity. For instance, one study characterising alternative splicing at breast cancer susceptibility genes by RNAseq noticed the poor performance of PAXgene compared with LCL samples,12 and a previous ENIGMA collaborative study comparing RT-PCR splicing protocols across different laboratories concluded that primer design and detection sensitivity (rather than RNA extraction and/or cDNA synthesis protocols) had an impact on the analytical outcome.20 A strength of our study design was the application of different assay designs, RNA samples and subsequent levels of sensitivity and/or filtering, by five independent laboratories to identify PALB2 alternative splicing events (see online supplementary material section 1 for further details). We elected to define high-confidence splicing events as those found in at least two different data sets (the rationale being that events detected by a minimum of two laboratories, two sample types and two methodologies are very unlikely to represent technical artefacts and/or biological outliers), but acknowledge that such a definition may lead to exclusion of real events found by a single laboratory. A higher stringency, requiring high-confidence splicing events to be found by more than two laboratories, was not used due to differences in the level of sensitivity between assays.
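The confidence rule described above amounts to counting the distinct data sets in which each event is seen. The sketch below is a simplified illustration and not the centralised curation itself: it assumes each observation is an (event, data set) pair, the data set labels in the example are hypothetical, and the exact definition of a data set is the one given in the supplementary material.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def classify_confidence(observations: Iterable[Tuple[str, str]],
                        min_datasets: int = 2) -> Dict[str, str]:
    """Label each event as high-confidence if it was observed in at least
    min_datasets distinct data sets, otherwise as lower-confidence."""
    datasets_per_event: Dict[str, Set[str]] = defaultdict(set)
    for event, dataset in observations:
        datasets_per_event[event].add(dataset)
    return {
        event: "high-confidence" if len(datasets) >= min_datasets
        else "lower-confidence"
        for event, datasets in datasets_per_event.items()
    }


# Hypothetical observations; the data set labels are illustrative only.
observations = [
    ("∆(E12)", "lab1_targeted_RNAseq"),
    ("∆(E12)", "lab4_RT-PCR/CE"),
    ("∆(E12)", "lab4_RT-PCR/CE"),       # repeat in the same data set
    ("hypothetical_event", "lab1_targeted_RNAseq"),
]
print(classify_confidence(observations))
```

Repeated detection within one data set does not raise the count, which mirrors the point that an event seen by a single laboratory stays lower-confidence however often it recurs there.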
Overall, we identified 44 high-confidence alternative splicing events at the PALB2 locus, and we provide evidence for 44 additional events (although we cannot discard the possibility that some of the latter represent technical artefacts and/or biological outliers). Interestingly, all PALB2 reference exons are affected by one or more high-confidence alternative splicing events, suggesting that no PALB2 exon should be annotated as constitutive. Despite the considerable number of alternative splicing events identified, our data suggest that their contribution to the overall PALB2 expression is low in all three tissues investigated. Splice site and PTC-NMD variants in cancer susceptibility genes can be overinterpreted (misinterpreted as pathogenic), if alternate gene transcripts are not properly considered.7 10 11 21–23 In the past, this has led to errors in the clinical management of families carrying the BRCA1 allele c.[594-2A>C; 641A>G].23 The low level of alternative splicing observed for PALB2 in BLOOD, BREAST and OVARY suggests that overinterpreting genetic variants at this locus is less likely to occur. However, some of the alternative splicing events we report can be relevant for the clinical interpretation of PALB2 PTC-NMD and splice site variants, in particular to decide if PVS1 is warranted.
PTC-NMD variants: the existence of rescue transcripts reducing or eliminating the functional and clinical impact of certain PTC-NMD variants in cancer susceptibility genes has been confirmed for APC 22 and BRCA1.11 More specifically, the alternate gene transcript APC Δ(E9p303) explains the association of PTC-NMD variants located at codons 312–412 with mild disease,22 and the alternate gene transcript BRCA1 Δ(E9_E10) explains the low breast cancer risk observed in carriers of the splice site variant BRCA1 c.594-2A>C.11 However, we have not identified plausible rescue transcripts for PALB2. Alternate gene transcripts Δ(E2p6), Δ(E6), Δ(E5p24) and Δ(E10p3) might code for functional or partially functional proteins, but their respective contribution to the overall PALB2 expression (<1%) is too low for them to be plausible rescue transcripts. By contrast, the combined expression of Δ(E11_E12) and Δ(E12) might represent 8%–34% of the overall gene expression (depending on samples and methodologies), but the predicted proteins encoded by these two transcripts (table 1) are unlikely to be functional, as they lack part of the C-terminal WD40 β-propeller domain (online supplementary material section 2.3) that mediates PALB2 interaction with several key homologous recombination proteins, including BRCA2 and RAD51.24 For that reason, we do not consider Δ(E11_E12) and Δ(E12) plausible rescue transcripts, although we cannot rule out the possibility of truncating variants in exons 11 and/or 12 conferring lower cancer risk than truncating variants in other PALB2 exons.
Canonical ±1,2 splice site variants: we propose that naturally occurring alternate gene transcripts provide predictive information identifying four PALB2 canonical splice sites for which, in the absence of splicing assays, PVS1 is not warranted (variants targeting exons 2, 5, 7 and 10 acceptor sites). For the exon 2 acceptor site, the proposal is based on experimental data obtained in a PALB2 c.49-1G>A (IVS1-1G>A) carrier indicating upregulation of ∆(E2p6) (Dr Georgios Tsaousis, Genekor Medical, personal communication, June 2018). The possibility that ∆(E2p6) codes for a functional/partially functional protein cannot be discarded (see online supplementary material section 2.3), supporting our conservative stance. For the remaining splice sites, we hypothesise that naturally occurring alternate gene transcripts (even if lowly expressed in control samples) may become upregulated if splice site variants impair the expression of reference transcripts. The hypothesis is supported by several observations made in carriers of PALB2 (among them, the upregulation of ∆(E2p6) in c.49-1G>A carriers), BRCA1 and BRCA2 splice site variants (see online supplementary table 4). Note that we propose that PVS1 is not warranted for splice site variants if at least one RNA product with uncertain coding potential is predicted, regardless of other predictions. For instance, we propose that PVS1 is not warranted for variants targeting the PALB2 exon 7 acceptor site because one RNA product of uncertain coding potential, ▼(E7p42), is predicted (table 3), despite the fact that up to five bona fide LoF transcripts are also predicted (▼(E7p20), Δ(E7p2), Δ(E7p10), ∆(E7p25) and Δ(E7)). When classifying splice site variants in high-risk breast cancer genes as pathogenic/likely pathogenic without functional or genetic data, we favour a very conservative approach.
We have identified 43 different PALB2 splice site variants in ClinVar (last accessed 13 April 2018), all of them reported as pathogenic/likely pathogenic. For five of these variants, we think that the pathogenic/likely pathogenic classification may not be justified without considering additional clinical and/or splicing data (table 4).
In short, we highlight the fact that, where alternate gene transcripts exist, assertions of pathogenicity are warranted only with the support of additional quantitative splicing assays, and preferably clinical evidence.
The authors would like to thank A Leconte at Cancer Center F Baclesse and Dr C Baudouin at Polyclinique du Parc, Caen (France), for their participation in the CASOHAR clinical trial to obtain breast tissue from healthy volunteers. The authors would like to thank Heather Thorne, Eveline Niedermayr, all the kConFab research nurses and staff, the heads and staff of the Family Cancer Clinics and the Clinical Follow-Up Study (which has received funding from the NHMRC, the National Breast Cancer Foundation, Cancer Australia and the National Institutes of Health [USA]) for their contributions to this resource, and the many families who contribute to kConFab. The authors would like to thank the SCAN-B collaborators at participating hospitals for support of SCAN-B. The authors would also like to thank Ingrid Hedenfalk and staff at the collaborating gynaecology and pathology clinics in Lund for providing the fimbriae samples.
SK and MdlH are joint senior authors.
IL-P and RL contributed equally.
Contributors IL-P, RL, RB, VL, JFP, LC, AM, DV, NG, GD, PG, VG-B, PLl, PP-S, ED-R, TC, KSH, VH, SW, TP, RK, JV-C, AV-P and MS contributed to data acquisition, revised the manuscript for important intellectual content and approved the final version. kConFab provided research resources used in this study. ABS coordinated the ENIGMA consortium. AB, EAV, MPV, PD, AK, ABS, LW and SK contributed to the conception and design of the study, helped obtain all necessary approvals and clearances to conduct the research, contributed to data acquisition, contributed to data analysis, contributed to grant funding, revised the manuscript for important intellectual content and approved the final version. MdlH helped obtain all necessary approvals and clearances to conduct the research, contributed to the conception and design of the study, contributed to data acquisition, contributed to data analysis, contributed to grant funding, wrote the manuscript and approved the final version.
Funding PD, MPGW, EAV, AB, AK and MH have received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 634935. RL is supported by a Normandy-University, Federation-Hospitalo-Universitaire (FHU) grant. VL is supported by a Mackenzie Family Cancer Postdoctoral Fellowship. RB is supported by funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 634935. AM is supported by a French Cancéropôle Nord-Ouest (CNO) grant. AB is supported by Mrs Berta Kamprad Foundation. EAV and MH are supported by Spanish Instituto de Salud Carlos III (ISCIII) funding (grants PI17/00227 to EAV and PI15/00059 to MH), an initiative of the Spanish Ministry of Economy and Innovation partially supported by European Regional Development FEDER Funds. ABS is supported by an NHMRC Senior Research Fellowship (ID1061779). LCW is supported by the Rutherford Discovery Fellowship. SK is supported by Ligue Contre le Cancer, Normandie. kConFab is supported by a grant from the National Breast Cancer Foundation, and previously by the National Health and Medical Research Council (NHMRC), the Queensland Cancer Fund, the Cancer Councils of New South Wales, Victoria, Tasmania and South Australia and the Cancer Foundation of Western Australia.
Competing interests VH, SW, TP and RK were employees of Ambry Genetics when they were engaged with this project. KSH was an employee of GeneDx when she was engaged with this project. EDR has consulting or advisory roles in Amgen, Bayer, Genómica, Servier and Merck. EDR has received research funding from Roche, Merck-Serono, Amgen, AstraZeneca and Sysmex.
Ethics approval Academic Hospital San Carlos ethics committee (reference numbers 15/139 E and 16/505 E). The SCAN-B study has been approved by the Lund Regional Ethical Review Board, Sweden (approval number 2009/658). The fimbriae tissue samples were obtained and analysed with approval by the Lund Regional Ethical Review Board, Sweden (approval number 2014/717). French Biomedicine Agency. CASOHAR trial ethics committee (NCT02560818). Ambry Genetics' patient information has been de-identified, and this study has been approved and carried out in accordance with the recommendations of the Western Institutional Review Board (WIRB; IRB Tracking Number: 20171324). The whole-transcriptome RNAseq study was approved by the New Zealand Southern Health and Disability Ethics Committee (12/STH/44).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Targeted RNAseq data contributed by laboratory 1 is available from Dr Sophie Krieger on reasonable request. Whole-transcriptome RNAseq data generated by laboratory 3 is available from Dr Logan Walker on reasonable request. Targeted RNAseq data contributed by laboratory 2 is available from SCAN-B and Ingrid Hedenfalk, respectively, but restrictions apply to the availability of these data, which were used under licence for the current study, and so are not publicly available. Data are however available from the authors on reasonable request and with permission of SCAN-B or Ingrid Hedenfalk.
Correction notice This article has been corrected since it was published Online First. The affiliations have been corrected.
Patient consent for publication Not required.