Germline mismatch repair (MMR) gene analyses from English NHS regional molecular genomics laboratories 1996–2020: development of a national resource of patient-level genomics laboratory records

Objective To describe national patterns of National Health Service (NHS) analysis of mismatch repair (MMR) genes in England using individual-level data submitted to the National Disease Registration Service (NDRS) by the NHS regional molecular genetics laboratories. Design Laboratories submitted individual-level patient data to NDRS against a prescribed data model, including (1) patient identifiers, (2) test episode data, (3) per-gene results and (4) detected sequence variants. Individualised per-laboratory algorithms were designed and applied in NDRS to extract and map the data to the common data model. Laboratory-level MMR activity audit data from the Clinical Molecular Genetics Society/Association of Clinical Genomic Science were used to assess early years’ missing data. Results Individual-level data from patients undergoing NHS MMR germline genetic testing were submitted from all 13 English laboratories performing MMR analyses, comprising in total 16 722 patients (9649 full-gene, 7073 targeted), with the earliest submission from 2000. The NDRS dataset is estimated to comprise >60% of NHS MMR analyses performed since inception of NHS MMR analysis, with complete national data for full-gene analyses for 2016 onwards. Out of 9649 full-gene tests, 2724 had an abnormal result, approximately 70% of which were (likely) pathogenic. Data linkage to the National Cancer Registry demonstrated colorectal cancer was the most frequent cancer type in which full-gene analysis was performed. Conclusion The NDRS MMR dataset is a unique national pan-laboratory amalgamation of individual-level clinical and genomic patient data with pseudonymised identifiers enabling linkage to other national datasets. This growing resource will enable longitudinal research and can form the basis of a live national genomic disease registry.


Data were retrieved from the Clinical Molecular Genetics Society/Association of Clinical Genomic Science (CMGS/ACGS) annual per-laboratory audit of MMR analyses, which covered financial years 1998-2016. These counts included all English NHS MMR analyses (full-gene and targeted) performed by each laboratory in a financial year, but for some laboratories were inflated by inclusion of tests for other patients (devolved nations, overseas, private, research) and MSI analyses.
The following adjustments were made (Supplementary Table 2):
(i) Activity 1996-7 and 1997-8. The first NHS MMR analyses were reported in 1996, but the CMGS/ACGS audit was only initiated in 1998. Activity for each of these two years was estimated by ascribing to it the laboratory-specific activity registered for 1998-9.
(ii) The proportion of MMR analyses in the CMGS/ACGS data which comprised germline MMR analyses specific to English NHS patients (versus devolved nations, overseas, private and research patients and MSI analyses). This was estimated by comparing analyses counts in the CMGS/ACGS audit to counts in the NDRS dataset, for years where both were available, to generate a laboratory-specific adjustment factor (the sum of CMGS/ACGS audit analyses over the sum of NDRS total analyses for the overlapping time period). This adjustment was then applied to 'raw' CMGS/ACGS analyses counts for the years pre-dating NDRS data submissions, to generate 'down-adjusted' CMGS/ACGS MMR analyses counts approximating germline MMR analyses specific to English NHS patients.
(iii) Targeted MMR analyses counts for the 4/13 laboratories not submitting all targeted analyses to NDRS. A year-specific ratio of full-gene analyses to targeted analyses was generated using counts of full-gene and targeted analyses submitted to NDRS by the other laboratories.
(iv) Breakdown of full-gene versus targeted MMR analyses for years pre-dating NDRS submission (for which only CMGS/ACGS audit data were available). This was estimated by applying the year-specific full-gene to targeted analyses ratio calculated above to the 'down-adjusted' CMGS/ACGS MMR analyses counts. For years where a year-specific ratio was incalculable, the average ratio for the calculable years was applied (1.93).
(v) Estimated counts of total, full-gene and targeted NHS MMR analyses for the entire period between April 1996 and March 2020. These were derived by integrating counts of NHS MMR analyses in the NDRS germline MMR dataset where these were available and complete for a financial year. For years pre-dating NDRS data submission, the 'down-adjusted' counts derived from CMGS/ACGS audit data were used.
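The adjustments in (ii) and (iv) can be illustrated with a minimal sketch. It assumes that "applying" the adjustment factor means dividing the raw audit count by it (the factor is defined as audit counts over NDRS counts), and that the full-gene to targeted ratio r splits a total into r/(1+r) and 1/(1+r). The function names and the example counts are hypothetical and are not taken from the study data.

```python
from typing import Dict, Optional, Tuple


def adjustment_factor(cmgs: Dict[str, int], ndrs: Dict[str, int]) -> float:
    """Laboratory-specific factor: sum of CMGS/ACGS audit counts over the
    sum of NDRS counts, restricted to financial years present in both."""
    overlap = sorted(set(cmgs) & set(ndrs))
    if not overlap:
        raise ValueError("no overlapping financial years")
    return sum(cmgs[y] for y in overlap) / sum(ndrs[y] for y in overlap)


def down_adjust(raw_count: float, factor: float) -> float:
    """Assumed interpretation: divide the raw audit count by the factor to
    approximate germline analyses for English NHS patients only."""
    return raw_count / factor


def split_full_targeted(
    total: float, ratio: Optional[float] = None, fallback: float = 1.93
) -> Tuple[float, float]:
    """Split a total into (full-gene, targeted) using a full:targeted ratio.
    When no year-specific ratio is available, use the average (1.93)."""
    r = fallback if ratio is None else ratio
    full = total * r / (1 + r)
    return full, total - full


# Hypothetical counts for one laboratory (illustration only).
cmgs_counts = {"1998-99": 150, "2014-15": 300, "2015-16": 300}
ndrs_counts = {"2014-15": 200, "2015-16": 200}

factor = adjustment_factor(cmgs_counts, ndrs_counts)
adjusted_1998 = down_adjust(cmgs_counts["1998-99"], factor)
full_1998, targeted_1998 = split_full_targeted(adjusted_1998)
```

In this toy case the audit counts are 1.5 times the NDRS counts over the overlapping years, so a raw count of 150 for 1998-99 is down-adjusted to 100 before being split by the average ratio.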
Both NDRS and CMGS/ACGS counts include a small number of repeat MMR gene analyses for returning patients who received subsequent MMR gene analyses from clinical genetics after ≥ 1 year (see above, defining a test episode). In the NDRS germline MMR dataset, 439 patients had >1 test episode.

Cancer registrations (Figure 3 and Supplementary Table 3)
Linkage to the National Cancer Registration and Analysis Service (NCRAS) national cancer registry was performed using pseudo-ID1 and pseudo-ID2 separately. Where linkage was successful, 3-character ICD10 site codes were retrieved from the AV2019 tumour

NDRS Validation of Linkage to the Cancer Registry:
Purpose: To evaluate integrity of the pseudonym matching process between the NDRS germline MMR dataset and other NDRS datasets including the NCRAS national cancer registry.

Method 1: External registries
Two datasets of patients with confirmed Lynch syndrome containing patient identifiers and cancer diagnoses were sourced for use as validation datasets (Newcastle University CAPP3 clinical trial and St. Mark's Hospital Polyposis registry).
Pseudo-ID1 and Pseudo-ID2 were generated for the two datasets using the same algorithms and application programming interface used for laboratory submission of data extracts to NDRS. The two datasets were then linked to the national cancer registry using the Pseudo-IDs. Where a patient flagged in the validation dataset as having cancer did not link to a cancer registration, a manual check of the cancer registry was undertaken using unencrypted patient identifiers.
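The linkage and follow-up logic described above can be sketched as below. This is only an illustration of matching on two pseudonymised identifiers separately and flagging unlinked patients for manual checking; it does not reproduce the NDRS pseudonymisation algorithms or API. The record keys (`pseudo_id1`, `pseudo_id2`, `reported_cancer`) and function names are hypothetical.

```python
from typing import Dict, List, Optional

Registrations = Dict[str, List[str]]  # pseudo-ID -> list of ICD-10 site codes


def link_patient(
    patient: Dict[str, Optional[object]],
    by_id1: Registrations,
    by_id2: Registrations,
) -> List[str]:
    """Return registrations found via pseudo-ID1 or pseudo-ID2.
    Each identifier is looked up separately; duplicates are dropped."""
    found: List[str] = []
    for key, index in (("pseudo_id1", by_id1), ("pseudo_id2", by_id2)):
        pid = patient.get(key)
        if pid is None:
            continue  # no valid pseudo-ID could be generated
        for reg in index.get(pid, []):
            if reg not in found:
                found.append(reg)
    return found


def needs_manual_check(
    patients: List[Dict[str, Optional[object]]],
    by_id1: Registrations,
    by_id2: Registrations,
) -> List[Dict[str, Optional[object]]]:
    """Patients flagged as having cancer who did not link to any registration."""
    return [
        p
        for p in patients
        if p.get("reported_cancer") and not link_patient(p, by_id1, by_id2)
    ]
```

Looking up each identifier independently means a patient with a missing or invalid Pseudo-ID1 can still link through Pseudo-ID2, which mirrors the separate use of the two identifiers in the text.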

Method 2: Laboratory test indication audit
As it was anticipated that most full gene germline analyses would be conducted in probands with cancer, each regional molecular genetics laboratory was asked to conduct an audit of up to 20 cases supplied by NDRS, each of which (1) was a full gene analysis of an MMR/CRC gene panel, (2) had a valid Pseudo-ID1 (generated from NHS number) and (3) nevertheless did not link to a cancer registration.
For each case, laboratories were asked to provide the clinical test indication, including cancer status. Where this indication included a personal history of cancer, unencrypted patient identifiers were requested so that they could be checked against the cancer registry.

Results 1: External registries
The two external registry datasets provided 812 individuals with confirmed Lynch syndrome, of whom 413 were reported to have a cancer diagnosis. Of these, 346/413 (84%) matched a cancer registration in the national cancer registry. Of the 67/413 (16%) patients reported in the external registry datasets to have cancer but not linking to a cancer registration, 37 either (i) had missing or incorrect NHS numbers/dates of birth/postcodes such that valid Pseudo-IDs had not been generated for them, or (ii) were resident outside of England or Wales. The remaining 30 patients were reported as having cancer in the external registries, were resident in England/Wales and had appropriate Pseudo-IDs, but lacked cancer registrations on the national cancer registry.

Results 2: Laboratory test indication audit
For >70% of patients in whom CRC/MMR full gene analysis was performed, there was a successful link to a cancer registration pre-dating the CRC/MMR analysis in the national cancer registry. In this audit, we collected data on clinical test indication for a subset of the 30% of patients receiving full gene analyses who did not link to a registered cancer. 10/12 laboratories asked to participate in the audit responded, encompassing 189 cases. Results are shown in the table below. 77% of the cases audited did not have cancer but had been offered full gene analysis on the basis of benign tumours, family history or syndromic features. For 24/189 (12.7%), the laboratory reported a cancer documented on its LIMS at the time of MMR testing, but the cancer could not be identified on the cancer registry.

Further investigations:
The 30 cases in the external registries and the 24 cases from the laboratory audits (54 in total) that were reported to have cancer but did not link to a registered cancer in the national cancer registry were further investigated using remote access to hospital record systems available to Cancer Registration Officers within NCRAS. Owing to COVID-19-related restrictions on access to hospital data, only 30/54 records could be checked. The outcomes for those 30 cases are shown in the table below. 10/30 were benign tumours, 10/30 were non-English residents or private patients whose cancers would not be registered in the English national cancer registry, and for 6/30 no evidence of cancer was found on the trust system. Of the remaining four cases, one was a true invasive cancer diagnosed in 1972, one was a very recent diagnosis not yet captured on CAS, and for two there was indirect mention of the cancer in the clinical notes but no formal coding of invasive cancer for the patient.
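As a quick arithmetic check, the counts reported in the two results sections and in the further investigations can be reconciled as follows. All figures are taken from the text above; the variable names are illustrative only.

```python
def pct(numerator: int, denominator: int, ndigits: int = 0) -> float:
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, ndigits)


# Results 1: external registries
reported_cancer = 413
linked = 346
unlinked = reported_cancer - linked   # patients not linking to a registration
no_valid_id_or_non_resident = 37
unexplained_external = unlinked - no_valid_id_or_non_resident

# Results 2: laboratory audit
audited_cases = 189
unexplained_audit = 24

# Further investigations: outcomes for the records that could be checked
outcomes = {
    "benign tumour": 10,
    "non-English resident or private patient": 10,
    "no evidence of cancer on trust system": 6,
    "other (see text)": 4,
}

total_unexplained = unexplained_external + unexplained_audit
checked = sum(outcomes.values())

print(pct(linked, reported_cancer))              # share linking to the registry
print(pct(unexplained_audit, audited_cases, 1))  # share of audited cases
print(total_unexplained, checked)
```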

Conclusion:
In summary, the sequential audit processes provide robust assurances regarding NDRS creation of linkage IDs (pseudo-ID1 and pseudo-ID2), the process of linkage to the national cancer registry, and registration of cancers. The majority of full gene analyses for which there is no linkage to the cancer registry are explained by (i) incomplete NHS numbers/dates of birth/postcodes, which preclude the generation of linkage pseudo-IDs, (ii) patients receiving full-gene analyses for reasons other than a personal history of invasive cancer, including benign tumours, and (iii) cancer diagnosis outside of England or in the private sector.