Objectives: Genetic testing for the breast and ovarian cancer susceptibility genes BRCA1 and BRCA2 has important implications for the clinical management of people found to carry a mutation. However, genetic testing is expensive and may be associated with adverse psychosocial effects. To provide a cost-efficient and clinically appropriate genetic counselling service, genetic testing should be targeted at those individuals most likely to carry pathogenic mutations. Several algorithms that predict the likelihood of carrying a BRCA1 or a BRCA2 mutation are currently used in clinical practice to identify such individuals.
Design: We evaluated the performance of the carrier prediction algorithms BOADICEA, BRCAPRO, IBIS, the Manchester scoring system and Myriad tables, using 1934 families seen in cancer genetics clinics in the UK in whom an index patient had been screened for BRCA1 and/or BRCA2 mutations. The models were evaluated for calibration, discrimination and accuracy of the predictions.
Results: Of the five algorithms, only BOADICEA predicted the overall observed number of mutations detected accurately (ie, was well calibrated). BOADICEA also provided the best discrimination, being significantly better (p<0.05) than all models except BRCAPRO (area under the receiver operating characteristic curve statistics: BOADICEA = 0.77, BRCAPRO = 0.76, IBIS = 0.74, Manchester = 0.75, Myriad = 0.72). All models underpredicted the number of BRCA1 and BRCA2 mutations in the low estimated risk category.
Conclusions: Carrier prediction algorithms provide a rational basis for counselling individuals likely to carry BRCA1 or BRCA2 mutations. Their widespread use would improve equity of access and the cost-effectiveness of genetic testing.
Genetic testing for BRCA1 and BRCA2 is generally offered to women with a strong family history of breast and ovarian cancer, but it is not possible to offer routine testing to all women presenting to family cancer clinics because it is both expensive and associated with adverse psychosocial effects.1 2 Therefore, to provide a cost-efficient and clinically appropriate genetic counselling service, it is important that genetic testing is targeted at those individuals most likely to have mutations. The National Institute for Health and Clinical Excellence (NICE) guideline for the management of women at risk of familial breast cancer suggests that genetic testing should be offered when there is at least a 20% chance that a deleterious mutation is present in the family.3 However, the guideline does not specify how such a probability should be estimated. Several models that use family history information to predict BRCA1 and BRCA2 mutation carrier probabilities have been published (reviewed by Antoniou and Easton 4). Most UK clinics have been using criterion-based rules or more sophisticated tools such as BRCAPRO5 or the Manchester scoring system.6 7 More recently developed models, such as BOADICEA8 and IBIS,9 are also entering clinical practice. However, not all these models have been extensively evaluated for their ability to predict BRCA1 or BRCA2 mutation status in typical families seen and tested in family cancer clinics in the UK.
The aim of this study was to evaluate the five widely used carrier prediction algorithms: Myriad,10 BRCAPRO,5 the Manchester scoring system,6 BOADICEA8 and IBIS,9 using a large cohort of families seen in cancer genetics clinics in the UK and in which an index patient had been screened for BRCA1 and BRCA2 mutations.
Six clinical genetics centres in the UK contributed data. Eligible families were those where family mutation status was unknown when genetic testing was initiated, at least one family member (index case) was screened for BRCA1 and/or BRCA2 mutations using a primary mutation search, and information on the mutation-testing methods used was available. Families identified through a research study were also eligible, provided that all the families in the study were submitted. Families indicated to be of Ashkenazi Jewish origin were excluded, because in general, individuals from these families were only tested for the three Ashkenazi Jewish founder mutations. As families tested only for specific mutations were not eligible for inclusion in the study, it was not possible to carry out valid comparisons within this group using the limited number of families of Ashkenazi Jewish origin that had been submitted by the clinical genetics centres.
Pedigree data were provided in standard linkage format with year of birth, age at last follow-up or age at death, and details of any cancer diagnosis (cancer site, age at diagnosis) for each subject. Such data were already available in electronic format for some clinics, and for others the data had to be manually entered from the hand-drawn pedigree in the medical notes.
The completeness of the data capture varied both within and between clinics. Inconsistencies were checked and corrected centrally. When ages were missing but a date of birth was given, the age was calculated as years from the date of birth to the date the pedigree was drawn for live individuals, and to the date of death for deceased individuals. When age was recorded as a range the age was assumed to be the middle of the range (eg, if recorded as “50s”, an age of 55 years was assigned). Cancers were only included when there was no ambiguity or speculation evident in the recorded data and age information was available. For example, an individual with a record of “?breast cancer ?age” would not be included as a breast cancer case in the analyses. Date of birth and/or age data were completely missing for approximately 57% of all the individuals submitted. The proportion with missing data was lower for women (47%). We used two methods of analysis to account for this. In the first analysis, we made no assumptions about these individuals and censored them at the age of 0 years. In a second analysis, we inferred the year of birth, based on the information available for other family members, and assumed they were unaffected and censored at the date of pedigree drawing or the age of 70 years, whichever occurred first. We found that all the genetic models performed poorly when we used the inferred data, suggesting that our assumptions about the ages at last follow-up or years of birth for the inferred individuals were not valid. The results presented here were therefore based on the first method, using the data actually available to the geneticist.
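The age-assignment and cancer-inclusion rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the data-cleaning code used in the study: the function names and the text format assumed for decade ranges (for example "50s") are hypothetical.

```python
import re
from typing import Optional


def age_from_range(text: str) -> Optional[int]:
    """Return the midpoint age for a decade label such as '50s', else None.

    Assumption: ranges are recorded as '<decade>s'; other formats are not handled.
    """
    match = re.fullmatch(r"(\d{1,2})0s", text.strip())
    if match is None:
        return None
    return int(match.group(1)) * 10 + 5


def age_from_dates(birth_year: int, end_year: int) -> int:
    """Age in years from birth to the pedigree date (living) or to death (deceased)."""
    if end_year < birth_year:
        raise ValueError("end year precedes birth year")
    return end_year - birth_year


def is_usable_cancer_record(site: str, age: str) -> bool:
    """Keep a cancer only when both fields are present and carry no '?' marker."""
    return all(field.strip() and "?" not in field for field in (site, age))


print(age_from_range("50s"))                               # midpoint of the decade
print(is_usable_cancer_record("?breast cancer", "?age"))   # ambiguous record is dropped
```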
Families were excluded from the analyses if the age information on the index tested individual was not available and had to be inferred, because this was considered to be an indicator of data reliability. Index individuals were assumed to be BRCA1 and BRCA2 mutation carriers if they carried a pathogenic mutation according to internationally recognised criteria (http://research.nhgri.nih.gov/projects/bic/).
We used the genetic risk models BOADICEA, BRCAPRO and IBIS, and the empirical Manchester scoring system and Myriad prevalence tables to compute a probability or a score for the likelihood of carrying a BRCA1 or a BRCA2 mutation. The development of these models has been reviewed in detail elsewhere.4 In brief, BOADICEA, BRCAPRO and IBIS estimate the probability of carrying a BRCA1 or a BRCA2 mutation and the probability of developing breast cancer. BOADICEA and BRCAPRO can also be used to estimate ovarian cancer risks.
BOADICEA allows for families of any structure or disease pattern. The latest version incorporates the effects of breast, ovarian, male breast, pancreatic and prostate cancer. This implementation is currently available for online use via an internet interface (http://www.srl.cam.ac.uk/genepi/boadicea/boadicea_home.html). BRCAPRO incorporates information on only first-degree and second-degree relatives of the index case and does not incorporate prostate or pancreatic cancer. We used the R version (BayesMendel; http://astor.som.jhmi.edu/BayesMendel/) in order to automate the calculations, but this is essentially the same as other, more user-friendly implementations such as the CancerGene interface (CancerGene; http://www.swed.edu/home_pages/cancergene) or the pedigree management software Cyrillic. Certain parameters of this model can be customised, for example the BRCA1 and BRCA2 mutation frequencies.
We used version 6a of IBIS,9 which incorporates information on all first-degree and second-degree relatives and affected cousins and half-sisters. In this version, the breast/ovarian cancer status of the proband is also taken into account, a feature not available in the initial release. Under IBIS the index case can only be female, so we excluded families with a male index for the IBIS analysis. IBIS can also incorporate the effects of other risk factors such as weight, height and hormone replacement therapy, but this information was not available. Probands were assumed to be parous if information on offspring was available in the pedigrees. IBIS does not include the effects of other BRCA1-associated and BRCA2-associated cancers such as prostate or pancreatic cancer.
The Manchester scoring system6 7 computes a score for the likelihood of identifying a BRCA1 or BRCA2 mutation, but not a carrier probability. It involves working through paternal and maternal lineages and allocating a score for each affected individual based on the type of cancer and age at diagnosis. The scoring system considers male and female breast cancer, ovarian, prostate and pancreatic cancer.
The Myriad prevalence tables10 provide the probability of detecting a BRCA1 or BRCA2 mutation and are based on the observations of deleterious mutations by Myriad Genetics Laboratories through its clinical testing service. Only information on breast and ovarian cancer occurrence among first-degree and second-degree relatives is used in the calculations. These tables are periodically updated on the web (http://www.myriadtests.com/provider/brca-mutation-prevalence.htm). We used the “spring 2006” version of the tables, which were based on 49 149 tests.
As BOADICEA, BRCAPRO and IBIS compute the probability of carrying a mutation, rather than the probability of detecting a mutation, we needed to allow for the sensitivity of the mutation-testing process. BOADICEA incorporates a parameter for the mutation-testing sensitivity for BRCA1 and BRCA2. This was assumed to be equal to the gene-specific mutation-screening sensitivities derived as described below. For BRCAPRO and IBIS, we multiplied the predicted carrier probability by the sensitivity parameter for each gene separately.
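The adjustment applied to BRCAPRO and IBIS amounts to one multiplication per gene. The sketch below uses the sample-wide mean sensitivities reported in this paper (0.60 for BRCA1 and 0.61 for BRCA2) purely as example inputs; the carrier probabilities and the function name are hypothetical.

```python
def detection_probability(carrier_prob: float, sensitivity: float) -> float:
    """Probability that a mutation is present AND found by the mutation search."""
    if not (0.0 <= carrier_prob <= 1.0 and 0.0 <= sensitivity <= 1.0):
        raise ValueError("both arguments must lie in [0, 1]")
    return carrier_prob * sensitivity


# Hypothetical model output for one index case.
brca1_carrier_prob = 0.30
brca2_carrier_prob = 0.10

# Each gene is adjusted separately with its own sensitivity.
brca1_detect = detection_probability(brca1_carrier_prob, 0.60)
brca2_detect = detection_probability(brca2_carrier_prob, 0.61)
print(round(brca1_detect, 3), round(brca2_detect, 3))
```

In the study the sensitivity was derived per screened individual rather than taken as a single sample-wide value, so in practice the second argument would differ from one index case to the next.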
Different laboratories used different methods for mutation testing. The extent of each mutation search carried out was categorised for each index by indicating which exons of each gene had been tested and the method used for that exon. Screening of each gene for large genomic rearrangements was also indicated. The overall sensitivity for each mutation search was then estimated by summing the estimated sensitivity for each exon tested (method-specific) weighted by exon length. Large rearrangements were assumed to account for 10% of all BRCA1 mutations and 5% of all BRCA2 mutations (the assumed method specific sensitivities for the various methods used are given in supplementary table 1 online). Note that some mutations may not be detected by any of the methods, particularly variants of uncertain significance that are disease associated. These were ignored, because their frequency is uncertain but probably small compared with that of known pathogenic variants, and because they were not considered in the development of any of the models.
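One way to read the sensitivity calculation is sketched below. The text does not give the exact formula for combining the exon-level sum with the rearrangement fraction, so the combination used here is an assumption: point mutations are taken to be spread across the coding sequence in proportion to exon length, and large rearrangements are detected only if a rearrangement screen was done. The toy exon lengths and method sensitivities are invented for illustration.

```python
from typing import Iterable, Tuple


def search_sensitivity(
    tested_exons: Iterable[Tuple[int, float]],
    total_coding_length: int,
    rearrangement_fraction: float,
    rearrangement_sensitivity: float = 0.0,
) -> float:
    """Overall sensitivity of one mutation search for one gene.

    tested_exons: (exon length, sensitivity of the method used) per exon tested.
    total_coding_length: summed length of all exons, tested or not.
    rearrangement_fraction: share of mutations that are large rearrangements
        (0.10 for BRCA1 and 0.05 for BRCA2 in the text).
    rearrangement_sensitivity: 0.0 if no rearrangement screen was carried out.
    """
    # Length-weighted sensitivity for small mutations; untested exons add nothing.
    point = sum(length * sens for length, sens in tested_exons) / total_coding_length
    return (
        (1.0 - rearrangement_fraction) * point
        + rearrangement_fraction * rearrangement_sensitivity
    )


# Toy gene of 1000 bp: 600 bp screened at 0.9, 200 bp at 0.7, 200 bp untested.
toy_exons = [(600, 0.9), (200, 0.7)]
without_screen = search_sensitivity(toy_exons, 1000, rearrangement_fraction=0.10)
with_screen = search_sensitivity(toy_exons, 1000, 0.10, rearrangement_sensitivity=1.0)
print(round(without_screen, 3), round(with_screen, 3))
```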
We evaluated three properties of the risk prediction models: calibration, discrimination and accuracy. A model is said to be well-calibrated if it accurately predicts the total number of BRCA1 and BRCA2 mutation carriers. Calibration provides an indication of the overall fit of the model to the data and is useful for planning research studies, clinical trials or intervention strategies. To compute the number of mutations predicted under BOADICEA, BRCAPRO, IBIS and the Myriad tables we summed the probabilities of detecting a BRCA1 and a BRCA2 mutation across all families (the expected number). These were compared against the actual number of mutations detected (the observed number) using the Pearson χ2 goodness-of-fit test. Categories were grouped together when the expected numbers were small. Unless otherwise specified, these tests had two degrees of freedom (the sum of non-carriers, BRCA1 carriers and BRCA2 carriers is constrained within each category).
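As an illustration of the calibration test, the following sketch computes a Pearson χ2 statistic over the three categories (non-carriers, BRCA1 carriers, BRCA2 carriers) and its p value on two degrees of freedom, for which the upper tail has the closed form exp(−x/2). The counts are the rounded overall BOADICEA totals reported in the Results; because they are rounded, the output need not reproduce the published p value exactly. Function names are hypothetical.

```python
import math
from typing import Sequence


def expected_count(detection_probs: Sequence[float]) -> float:
    """Expected number of mutations: the sum of per-family detection probabilities."""
    return sum(detection_probs)


def pearson_chi2(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Pearson goodness-of-fit statistic, sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_p_value_2df(statistic: float) -> float:
    """Upper-tail p value of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


n_families = 1934
observed_brca1, observed_brca2 = 211, 154
expected_brca1, expected_brca2 = 201, 158  # rounded totals from the Results

observed = [n_families - observed_brca1 - observed_brca2, observed_brca1, observed_brca2]
expected = [n_families - expected_brca1 - expected_brca2, expected_brca1, expected_brca2]

statistic = pearson_chi2(observed, expected)
p_value = chi2_p_value_2df(statistic)
print(round(statistic, 3), round(p_value, 2))
```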
Discrimination is the ability of the model to distinguish between a mutation carrier and a non-carrier at the individual level. This was evaluated using the area under the receiver operating characteristic (ROC) curve. This statistic gives the probability that the carrier probability/score given by the model will be higher for a randomly chosen carrier than for a randomly chosen non-carrier. A value of 0.5 indicates that the model is no better than chance in discriminating between carriers and non-carriers whereas a value of 1 indicates perfect discriminatory power. We also compared the sensitivity, specificity, positive and negative predictive values of the models at carrier probability thresholds commonly used in selecting families for clinical testing.
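The probabilistic definition of the area under the ROC curve given above translates directly into a pairwise comparison, and the threshold-based measures follow from a two-by-two table. The sketch below is a plain-Python illustration on invented scores; it is not the software used for the analyses, and the function names are hypothetical.

```python
from typing import Dict, Sequence


def auc(carrier_scores: Sequence[float], noncarrier_scores: Sequence[float]) -> float:
    """Probability that a random carrier scores higher than a random non-carrier.

    Ties count as one half, which matches the usual ROC area.
    """
    wins = 0.0
    for c in carrier_scores:
        for n in noncarrier_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(carrier_scores) * len(noncarrier_scores))


def threshold_metrics(
    scores: Sequence[float], is_carrier: Sequence[bool], threshold: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV when families at or above threshold are tested."""
    tp = sum(1 for s, y in zip(scores, is_carrier) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, is_carrier) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, is_carrier) if s < threshold and y)
    tn = sum(1 for s, y in zip(scores, is_carrier) if s < threshold and not y)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Invented example: three carriers and four non-carriers.
carriers = [0.60, 0.30, 0.15]
noncarriers = [0.40, 0.10, 0.05, 0.02]
scores = carriers + noncarriers
labels = [True] * len(carriers) + [False] * len(noncarriers)

area = auc(carriers, noncarriers)
at_20 = threshold_metrics(scores, labels, 0.20)
print(round(area, 3), at_20)
```

The pairwise loop is quadratic in the number of families, which is acceptable for a sketch; a rank-based calculation gives the same value more efficiently.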
To evaluate whether the predicted probabilities fit the observed mutation status at the individual level (accuracy) we also carried out tests that assess the probabilities given by the models for systematic overprediction or underprediction (underdispersion or overdispersion) as described previously.11 12 The degree of overprediction or underprediction was estimated using logistic regression by treating the observed mutation status as the dependent variable and the log-odds of the probability of detecting a mutation given by the model as the independent variable. To test for dispersion, we tested the null hypothesis that the estimated coefficient was equal to 1 in a logistic regression model without a constant term.
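A minimal sketch of the dispersion test follows: a logistic regression of mutation status on the log-odds of the predicted detection probability, with no constant term, fitted by Newton-Raphson with step halving. The text does not state which test statistic was used for the hypothesis that the coefficient equals 1, so the Wald test here is an assumption, as are the function names and the invented data. A coefficient below 1 indicates predictions that are too extreme.

```python
import math
from typing import Sequence, Tuple


def _sigmoid(t: float) -> float:
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def _log_sigmoid(t: float) -> float:
    """Numerically stable log of the logistic function."""
    return -math.log1p(math.exp(-t)) if t >= 0 else t - math.log1p(math.exp(t))


def _log_likelihood(beta: float, xs: Sequence[float], ys: Sequence[int]) -> float:
    return sum(_log_sigmoid(beta * x) if y else _log_sigmoid(-beta * x)
               for x, y in zip(xs, ys))


def fit_slope_through_origin(
    probs: Sequence[float], status: Sequence[int],
    max_iter: int = 100, tol: float = 1e-10,
) -> Tuple[float, float, float, float]:
    """Return (slope, standard error, Wald z against 1, two-sided p value).

    probs must lie strictly between 0 and 1; status is 1 for a detected mutation.
    """
    xs = [math.log(p / (1.0 - p)) for p in probs]
    beta = 1.0  # start at the null value
    for _ in range(max_iter):
        pis = [_sigmoid(beta * x) for x in xs]
        score = sum(x * (y - pi) for x, y, pi in zip(xs, status, pis))
        info = sum(x * x * pi * (1.0 - pi) for x, pi in zip(xs, pis))
        step = score / info
        current = _log_likelihood(beta, xs, status)
        # Halve the step until the log-likelihood no longer decreases.
        while _log_likelihood(beta + step, xs, status) < current and abs(step) > tol:
            step /= 2.0
        beta += step
        if abs(step) < tol:
            break
    pis = [_sigmoid(beta * x) for x in xs]
    info = sum(x * x * pi * (1.0 - pi) for x, pi in zip(xs, pis))
    se = 1.0 / math.sqrt(info)
    z = (beta - 1.0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# Invented data: more mutations than predicted among low-probability families.
probs = [0.05, 0.05, 0.05, 0.05, 0.1, 0.1, 0.1, 0.2, 0.2, 0.3, 0.5, 0.7, 0.9]
status = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1]
beta, se, z, p_value = fit_slope_through_origin(probs, status)
print(round(beta, 3), round(se, 3), round(p_value, 3))
```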
Data for 2140 families were available. In total, 84 families were excluded because the age information on the index tested individual was not available. An additional 122 families of Ashkenazi Jewish origin were also excluded. Tables 1 and 2 summarise the characteristics of the remaining 1934 eligible families used in the analyses. Of these, 211 (11%) were found to carry a deleterious mutation in BRCA1 and 154 (8%) carried a deleterious BRCA2 mutation. In total, 45 families had a male index case, and these were therefore excluded from the evaluation of the IBIS model. The mean sensitivity of the mutation detection techniques varied somewhat between clinics (0.46–0.77 for BRCA1 and 0.46–0.76 for BRCA2) and was estimated to be 0.60 for BRCA1 and 0.61 for BRCA2 in the entire sample.
The observed and expected number of mutations in each gene as predicted by BOADICEA, BRCAPRO and IBIS are shown in table 3. The calibration of BOADICEA was better than that of BRCAPRO and IBIS. Overall, BOADICEA predicted 201 BRCA1 and 158 BRCA2 mutation carriers among index individuals, compared with 211 and 154 observed respectively (p = 0.71). The numbers predicted by BRCAPRO (324 for BRCA1 and 76 for BRCA2) were different from those observed (p<0.0001). The standard implementation of BRCAPRO is based on population frequencies for deleterious BRCA1 and BRCA2 variants of 0.0006 and 0.00022 respectively. The BRCA2 mutation frequency is probably too low based on currently available UK population data,13 14 and BRCAPRO would be expected to underpredict the number of BRCA2 carriers. We therefore repeated the predictions using a BRCA2 mutation population frequency of 0.001 (the frequency assumed in BOADICEA) (table 3). The expected numbers of mutations (249 for BRCA1 and 220 for BRCA2) were now much closer to the observed numbers, but the difference was still significant (p<0.0001). Of the 1889 eligible families for IBIS evaluation, 210 index patients carried BRCA1 mutations and 147 BRCA2 mutations. IBIS underpredicted the total number of mutations for both BRCA1 and BRCA2 (184 and 117 respectively, p = 0.001). The Myriad tables also underpredicted the number of BRCA1 and BRCA2 mutations (312 predicted vs 365 observed, p = 0.001; table 4).
Tables 3 and 4 also show the observed and expected number of BRCA1 and BRCA2 carriers according to the predicted carrier probability in eight categories ranging from <5% to ⩾50%. BOADICEA and IBIS predict the number of mutations well for carrier probabilities of ⩾15%, but tend to underestimate the number of mutations for carrier probabilities of <15%. A similar pattern can be seen for both implementations of BRCAPRO, where the numbers of mutations in low carrier probability categories are underpredicted. However, BRCAPRO also tends to overpredict the number of mutations for carrier probabilities of ⩾50%. The Myriad tables also significantly underpredicted in the 5–10% mutation prevalence category.
The Manchester scoring system does not predict individual carrier probabilities. However, it is possible to compare the observed mutation frequency within each threshold category with the proportions expected from published data7 (table 5). Among families with a combined BRCA1 and BRCA2 score of ⩽14, 5% were found to carry BRCA1 and BRCA2 mutations compared with 2% expected. At least 10% of families with a combined score of ⩾15 were mutation-positive. However, in these categories the observed mutation frequencies are somewhat lower than those previously reported.7
The area under the ROC curve was 0.77 for BOADICEA, 0.76 for BRCAPRO, 0.74 for IBIS, 0.75 for the Manchester scoring system and 0.72 for the Myriad prevalence tables for BRCA1 and BRCA2 mutations combined (table 6). All were significantly different from 0.5, but the BOADICEA C-statistic was higher than those of the IBIS, Manchester and Myriad models (p = 0.009, 0.01 and 0.0003 respectively), BRCAPRO was better than IBIS and Myriad (p = 0.03 and 0.0005 respectively) and Manchester was better than Myriad (p = 0.02). For the BRCA1-specific predictions, the area under the ROC curve was 0.82, 0.79, 0.79 and 0.77 for BOADICEA, BRCAPRO, IBIS and Manchester respectively. Again BOADICEA was better than all the other models (p = 0.002, p = 0.04 and p = 3×10⁻⁶ for BRCAPRO, IBIS and Manchester respectively). The area under the ROC curve for discriminating BRCA2 mutation carriers was the same (0.68) for BOADICEA, BRCAPRO and Manchester and lower (0.62) for IBIS.
Table 7 shows the performance of the different models at carrier probability/prevalence thresholds of 10% and 20% for BRCA1 and BRCA2 combined for BOADICEA, BRCAPRO, IBIS and Myriad, and the equivalent threshold scores of 15 and 17 for the Manchester scoring system. The highest sensitivity at these cut-offs was given by the Manchester scoring system, but its specificity was much lower than that of BOADICEA and BRCAPRO. The BOADICEA and BRCAPRO models performed similarly. IBIS and Myriad had much lower sensitivity than the other three models at both cut-off points. Table 7 also shows the performance of the models for BRCA1 and BRCA2 separately, at a threshold of 10%. BRCAPRO had the highest sensitivity (89.6%) for BRCA1, compared with 82.9% and 83.4% for BOADICEA and Manchester respectively, but a lower specificity compared with BOADICEA (48.2 vs 63.8%). IBIS had the lowest sensitivity for BRCA1 (72.4%). For BRCA2, BRCAPRO (38.3%) and IBIS (44.9%) had much lower sensitivities than BOADICEA (67.5%) and Manchester (75.3%), although conversely BRCAPRO and IBIS had higher specificities.
The estimated dispersion parameters for BOADICEA, BRCAPRO, IBIS and Myriad tables are shown in table 8. All parameters were significantly different from 1, suggesting that all models are overdispersed. The least overdispersed model was Myriad, followed by BOADICEA, and the most overdispersed (least accurate) was IBIS. The difference between Myriad and BOADICEA was not significant, but Myriad was less dispersed than BRCAPRO and IBIS (p = 0.0006 and p = 4×10⁻⁷ respectively). BOADICEA was also less dispersed than BRCAPRO and IBIS (p = 0.01 and p = 0.0002 respectively). The overdispersion in the three genetic models was mainly driven by differences between the observed mutation status and that predicted for carrier probabilities of <15% (results not shown), in line with the results in tables 3 and 4, where there is underprediction of the number of mutation carriers.
We evaluated the performance of the genetic models BOADICEA, BRCAPRO and IBIS, and the empirical Manchester scoring system and Myriad tables in individuals screened for BRCA1 and/or BRCA2 mutations identified through clinical genetics centres in the UK. This is the largest validation study to date to evaluate all the recently published models. The overall mutation prevalence of 18.9% (365 of 1934 families) is lower than would be expected under the current NICE criteria, under which families are eligible for testing if the predicted carrier probability is at least 20%, reflecting the fact that, before the guidelines, genetic testing was offered to individuals with lower carrier probabilities.
BOADICEA outperformed BRCAPRO, IBIS and Myriad in predicting the number of mutation carriers (calibration), but all models tend to underpredict the number of mutations in families with low predicted carrier probability, as previously reported for BRCAPRO.17 The observed prevalence of mutations in the low-score category under the Manchester system was also higher than that expected.7 This underprediction was particularly unexpected for BOADICEA, given that the majority of the data used to develop the model came from families of breast cancer cases unselected for family history.
The “low predicted risk” families represent a substantial proportion of the data (30–50% of families depending on model). Missing information on cancer diagnoses may account for part of the discrepancy, as cancers with ambiguity in diagnosis or age could not be included in the analyses. Another explanation might be preferential inclusion of mutation carrier families in the dataset. This may have occurred because of preferential ascertainment of mutation carrier families when identifying families that had undergone BRCA1 and BRCA2 mutation testing using laboratory records, or preferential tracing and entering of pedigree data. Another possibility is that the genetic counsellor may have had prior information that a mutation was segregating in the family. We were able to exclude families when the index individual was tested for a specific known mutation, but there may have been other instances when another individual had been previously tested and this prompted the index to be screened. Alternatively, the genetic counsellor may have had access to phenotypes that are related to carrier status but not considered by these models, such as histopathological features of the breast cancers in the family16 or the occurrence of other cancer types not accounted for by the models.
In our analysis, we used gene-specific sensitivities that were derived for each screened individual separately based on the extent of screening and the mutation detection methods used. Therefore, the predicted number of mutations under BOADICEA, BRCAPRO and IBIS depends to an extent on the assumed sensitivity of the detection techniques used to screen for BRCA1 and BRCA2 mutations (supplementary table online). This is a potential limitation in assessing the calibration of these models. An increase in the number of mutations predicted by the three genetic models would be expected if the sensitivity of the detection techniques was higher than we have assumed, and a decrease if it was lower. However, we used generally accepted values for the sensitivity of the various screening methods. Moreover, the mutation screening methods used varied across centres and individuals, and the sensitivity parameter for each individual was derived as a function of both the methods used and the exons tested. Therefore, small misspecification of the sensitivity of a particular technique, especially within the known levels of uncertainty, is unlikely to have had a marked effect on our results.
We found that all models were overdispersed, suggesting that some predictions may be too extreme,11 that is, either too low in carrier individuals or too high in non-carriers. We found that this was mainly driven by observations in the low predicted carrier probability category, and may be explained by the reasons outlined above. Some models, however, were significantly more dispersed than others. The least dispersed models were Myriad and BOADICEA. BOADICEA has also been reported to be more accurate at individual level than BRCAPRO in a previous smaller study.17
All models perform significantly better than chance in discriminating between carriers and non-carriers. The area under the ROC curve statistics were reasonably similar across models (0.72–0.77) but BOADICEA performed best. Previous validation studies have also found that BRCAPRO, BOADICEA, Myriad, and the Manchester scoring system discriminate reasonably well between carriers and non-carriers.6 11 15 17–19
We investigated the sensitivity and specificity of the models at widely suggested thresholds. However, our results demonstrate that the sensitivity of a model at fixed thresholds is not necessarily the most appropriate statistic for evaluating its clinical utility and it may even be misleading if the model is not well calibrated. The primary purpose of selecting a threshold is to reduce the number of mutation searches carried out while targeting the available resources at those families most likely to carry a mutation; it would, of course, be possible to achieve a sensitivity of 100%, simply by testing all families. At a fixed threshold, the number of families that would be eligible for testing will be different under different models and so the cost-effectiveness of the models will vary. To illustrate this, we computed the number of families that would be eligible for testing and the number of mutation-positive families that would be missed under each model, based on the 20% carrier probability threshold. Fewer than half the families (946) would be eligible for testing using BOADICEA, and 70 (of 365) mutation-positive families would be missed. The equivalent numbers are 976 and 69 for BRCAPRO, 762 and 115 for IBIS (out of a total of 1889 families, 357 of them mutation positive, for which IBIS could be used), and 492 and 179 for Myriad. Based on a Manchester score of ⩾17, 1205 families would be tested and 47 would be missed. A fairer comparison of the models would be to compare their performance for different thresholds for each model, making the number of families eligible for testing the same. If the 1205 families with the highest carrier probabilities under each model were to be tested, the number of mutations missed would be 42 under BOADICEA, 44 under BRCAPRO, 47 under the Manchester score, 53 under IBIS and 64 under the Myriad tables.
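The two comparisons described above, a fixed probability threshold and a fixed number of families tested, can be sketched as follows. The scores and carrier labels are invented, the function names are hypothetical, and ties at the cut-off are broken by input order, which is an arbitrary choice made for this illustration.

```python
from typing import Sequence, Tuple


def fixed_threshold(
    scores: Sequence[float], is_carrier: Sequence[bool], threshold: float
) -> Tuple[int, int]:
    """Return (families eligible for testing, carrier families missed)."""
    eligible = sum(1 for s in scores if s >= threshold)
    missed = sum(1 for s, y in zip(scores, is_carrier) if y and s < threshold)
    return eligible, missed


def missed_at_fixed_budget(
    scores: Sequence[float], is_carrier: Sequence[bool], n_tested: int
) -> int:
    """Carrier families missed when only the n_tested highest-scoring families are tested."""
    # sorted() is stable, so tied scores keep their input order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tested = set(order[:n_tested])
    return sum(1 for i, y in enumerate(is_carrier) if y and i not in tested)


# Invented example: six families scored by two hypothetical models.
carrier = [True, False, True, False, True, False]
model_a = [0.90, 0.60, 0.40, 0.30, 0.10, 0.05]
model_b = [0.80, 0.20, 0.55, 0.50, 0.60, 0.10]

print(fixed_threshold(model_a, carrier, 0.20))      # eligible and missed at a 20% threshold
print(missed_at_fixed_budget(model_a, carrier, 3))  # same testing budget for both models
print(missed_at_fixed_budget(model_b, carrier, 3))
```

Holding the number of tests constant removes the advantage a poorly calibrated model can gain simply by placing more families above a nominal threshold.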
Several of these models could be used to further improve the cost-effectiveness of genetic testing by using the gene-specific carrier probabilities with gene-specific testing thresholds. For example, an individual could be screened for BRCA1 if the BRCA1 carrier probability is ⩾0.10, for BRCA2 if the corresponding probability is ⩾0.10, and for both genes (in order of probability) if both are ⩾0.10. Using this approach 1538 (BRCA1 765, BRCA2 773) tests would have been carried out under BOADICEA and 57 mutations would have been missed. In contrast, if the current NICE guidelines were applied, 70 mutations would have been missed, despite a greater number of tests being carried out (1892 tests: 946 for BRCA1 and 946 for BRCA2 for the families with a combined carrier probability ⩾20%). A potential disadvantage of such an approach would be a loss of efficiency in testing for one gene only.
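The gene-specific rule described above can be written as a small planning function. The sketch counts one test per gene that reaches its threshold and does not model stopping after a first positive result; the family probabilities, the labels and the function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Family = Tuple[float, float, Optional[str]]  # (P(BRCA1), P(BRCA2), gene actually mutated)


def gene_specific_plan(p_brca1: float, p_brca2: float, threshold: float = 0.10) -> List[str]:
    """Genes to screen for one family, most probable gene first."""
    candidates = []
    if p_brca1 >= threshold:
        candidates.append(("BRCA1", p_brca1))
    if p_brca2 >= threshold:
        candidates.append(("BRCA2", p_brca2))
    candidates.sort(key=lambda gene: gene[1], reverse=True)
    return [name for name, _ in candidates]


def summarise(families: Sequence[Family], threshold: float = 0.10) -> Tuple[int, int]:
    """Return (number of gene tests planned, number of mutations missed)."""
    tests = 0
    missed = 0
    for p1, p2, true_gene in families:
        plan = gene_specific_plan(p1, p2, threshold)
        tests += len(plan)
        if true_gene is not None and true_gene not in plan:
            missed += 1
    return tests, missed


# Invented families for illustration only.
families = [
    (0.25, 0.05, "BRCA1"),  # BRCA1 only is screened, mutation found
    (0.04, 0.12, "BRCA2"),  # BRCA2 only is screened, mutation found
    (0.15, 0.30, None),     # both genes screened, BRCA2 first
    (0.08, 0.06, "BRCA2"),  # neither gene reaches 0.10, mutation missed
    (0.02, 0.01, None),     # not screened
]
tests, missed = summarise(families)
print(tests, missed)
print(gene_specific_plan(0.15, 0.30))
```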
Although all the existing models capture information on breast and ovarian cancer family history, other data can provide predictive value. In particular, BOADICEA and Manchester also incorporate data on prostate and pancreatic cancer, the other cancers for which there is a clear association with carrier status. One important piece of information that may improve the carrier predictions by the various models is tumour type. BRCA1 mutations (but not BRCA2 mutations) are known to predispose strongly to oestrogen receptor-negative, basal-type breast cancer.20 Collecting this additional pathology information and developing the risk models further to take such information into account could substantially improve the efficiency of testing.
These data show that the currently available carrier prediction models perform well in typical breast–ovarian cancer families seen in cancer genetics clinics. All models performed reasonably well, but BOADICEA performed best for most of the performance measures. More systematic use of these models in the clinic allied to specific thresholds would have the advantage of ensuring equity of access to genetic testing as well as making the management decision-making process clearer and more explicit. Nevertheless, risk prediction algorithms cannot replace decision-making based on clinical criteria and any thresholds should be used only as guidelines and as an adjunct to clinical judgement. Although these data evaluate the use of these models for clinical genetics testing, they could also be used in primary and secondary care in selecting patients for tertiary referral. Further evaluation of the usability of the different models in different settings and their performance in terms of predicting the absolute risk of developing cancer would be needed before their routine use.
We thank J Tyrer for providing the batch version of IBIS.
Funding: This study was supported by a grant from the UK Department of Health. PDPP is Cancer Research UK Senior Clinical Research Fellow. DFE is a Cancer Research UK principal research fellow. ACA is funded by CR-UK.
Competing interests: None.