Comprehensive epithelial tubo-ovarian cancer risk prediction model incorporating genetic and epidemiological risk factors

Background Epithelial tubo-ovarian cancer (EOC) has high mortality partly due to late diagnosis. Prevention is available but may be associated with adverse effects. A multifactorial risk model based on known genetic and epidemiological risk factors (RFs) for EOC can help identify women at higher risk who could benefit from targeted screening and prevention. Methods We developed a multifactorial EOC risk model for women of European ancestry incorporating the effects of pathogenic variants (PVs) in BRCA1, BRCA2, RAD51C, RAD51D and BRIP1, a Polygenic Risk Score (PRS) of arbitrary size, the effects of RFs and explicit family history (FH) using a synthetic model approach. The PRS, PV and RFs were assumed to act multiplicatively. Results Based on a currently available PRS for EOC that explains 5% of the EOC polygenic variance, the estimated lifetime risks under the multifactorial model in the general population vary from 0.5% to 4.6% for the first to 99th percentiles of the EOC risk distribution. The corresponding range for women with an affected first-degree relative is 1.9%–10.3%. Based on the combined risk distribution, 33% of RAD51D PV carriers are expected to have a lifetime EOC risk of less than 10%. RFs provided the widest distribution, followed by the PRS. In an independent partial model validation, absolute and relative 5-year risks were well calibrated in quintiles of predicted risk. Conclusion This multifactorial risk model can facilitate stratification, in particular among women with FH of cancer and/or moderate-risk and high-risk PVs. The model is available via the CanRisk Tool (www.canrisk.org).

Here, $g_i$ is the person's genetic dosage for variant $i$, which lies in the range [0, 2]. The dosage may come directly from the person's genotype or may be imputed.
Writing $\mu_{PRS}$ for the mean PRS and $\sigma_{PRS}$ for its standard deviation, the model takes as input the standard normal PRS (z-score), given by $z_{PRS} = (PRS - \mu_{PRS})/\sigma_{PRS}$, and the square root of the proportion of the overall polygenic variance in the model explained by the PRS, $\alpha = \sigma_{PRS}/\sqrt{1.4156}$.
In [2] the polygenic variance was determined to be 1.434, which implicitly included the effects of RAD51D, RAD51C and BRIP1. RAD51D, RAD51C and BRIP1 are explicitly included in this version of the model, and so their effects must be subtracted from the polygenic variance, leaving 1.4156.
The above procedure allows the set of variants to differ between people. For instance, a person may have been genotyped for only a subset of the known variants (e.g., an older PRS or missing genotypes), or more variants may be added.
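The standardisation described above can be sketched as follows. This is an illustrative sketch only, not the CanRisk implementation: the function names and the example effect sizes and allele frequencies are hypothetical, the raw PRS is taken to be the effect-size-weighted sum of dosages, and the mean and standard deviation are computed under the assumption of independent variants in Hardy-Weinberg equilibrium, which the text here does not state. The value of alpha is taken as the PRS standard deviation divided by the square root of 1.4156, which is our reading of the text.

```python
import math

POLYGENIC_VARIANCE = 1.4156  # overall polygenic variance quoted in the text


def prs_mean_sd(betas, freqs):
    """Population mean and SD of the raw PRS.

    ASSUMPTION: independent variants in Hardy-Weinberg equilibrium, so each
    dosage has mean 2f and variance 2f(1 - f).
    """
    mean = sum(2 * b * f for b, f in zip(betas, freqs))
    var = sum(2 * b * b * f * (1 - f) for b, f in zip(betas, freqs))
    return mean, math.sqrt(var)


def standardised_prs(dosages, betas, freqs):
    """Return (z, alpha) using only the variants available for this person.

    dosages, betas, freqs: dicts keyed by a (hypothetical) variant identifier.
    """
    ids = [v for v in dosages if v in betas and v in freqs]
    raw = sum(betas[v] * dosages[v] for v in ids)
    mean, sd = prs_mean_sd([betas[v] for v in ids], [freqs[v] for v in ids])
    z = (raw - mean) / sd
    alpha = sd / math.sqrt(POLYGENIC_VARIANCE)
    return z, alpha


# Hypothetical effect sizes (log relative risks) and allele frequencies.
betas = {"v1": 0.10, "v2": -0.05, "v3": 0.20}
freqs = {"v1": 0.30, "v2": 0.50, "v3": 0.10}

# A person genotyped for only two of the three variants.
z, alpha = standardised_prs({"v1": 1.0, "v3": 0.4}, betas, freqs)
```

Because the mean, standard deviation and alpha are recomputed from whichever variants are present, the same function handles a person genotyped on a reduced variant set, at the cost of a smaller alpha.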

Parameterisation of the Risk Factors
The model has been extended to incorporate the effects of epidemiological risk factors. In the model, risk factors are parameterised by their population distribution and relative risk, given in Table s2.
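As a rough illustration of this parameterisation, the sketch below represents each risk factor by its category frequencies and relative risks and combines factors multiplicatively, as the model assumes. The factor names, frequencies and relative risks are hypothetical and are not taken from Table s2. Rescaling the relative risks so that their population-weighted mean is 1, and treating an unobserved factor as contributing a factor of 1, are assumptions made for this sketch rather than details stated in the text.

```python
def normalise_relative_risks(freqs, rrs):
    """Rescale category relative risks so the population-weighted mean is 1.

    ASSUMPTION: this normalisation is used here so that adding a risk factor
    leaves the population-average risk unchanged.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("category frequencies must sum to 1")
    mean_rr = sum(p * r for p, r in zip(freqs, rrs))
    return [r / mean_rr for r in rrs]


def combined_relative_risk(risk_factors, observed):
    """Multiply the normalised relative risks of the observed categories.

    risk_factors: name -> (category frequencies, category relative risks)
    observed:     name -> index of the person's category
    Unobserved factors contribute 1 (ASSUMPTION: population average).
    """
    total = 1.0
    for name, (freqs, rrs) in risk_factors.items():
        if name in observed:
            total *= normalise_relative_risks(freqs, rrs)[observed[name]]
    return total


# Hypothetical risk factors, for illustration only.
risk_factors = {
    "factor_a": ([0.6, 0.4], [1.0, 0.7]),
    "factor_b": ([0.5, 0.3, 0.2], [1.0, 1.2, 1.5]),
}

rr = combined_relative_risk(risk_factors, {"factor_a": 1, "factor_b": 2})
```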

Parameterisation of the Rare Variants
The effects of pathogenic variants are parameterised in the model via their allele frequency and relative risk, given in Table s3.

Other Model Components
Previously [2], the model included FH of EOC and first female breast cancer (BC). To align with the BOADICEA model [18], the model was extended to take account of female contralateral BC, male BC, prostate cancer and pancreatic cancer, assuming that the relative risk for carriers of pathogenic variants (PVs) in BRCA1 and BRCA2 is the same as that in the BOADICEA model [19]. We assumed that PVs in RAD51D, RAD51C and BRIP1 do not increase the risks for these cancers relative to the population.
Further, using the methodology developed in [20], we included the effects of tumour pathology subtype of a first BC for females, where we assumed that the pathology proportions for carriers of PVs in BRCA1 and BRCA2 are the same as those in the BOADICEA model [21] and that the pathology proportions for carriers of PVs in RAD51D, RAD51C and BRIP1 are the same as those in the general population. FH of these cancers and first BC pathology can be indicative of PVs in BRCA1 or BRCA2.

Study Subjects
The United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS) is a randomised controlled trial, initiated in 2001, for assessing the effect of screening on EOC mortality [22,23]. Postmenopausal females aged 50-74 years were invited to participate. Participants provided a blood sample and completed a baseline questionnaire that included information on personal and family cancer history, number of pregnancies lasting at least 6 months, OCP, MHT, sterilisation, height and weight. Information on endometriosis was not collected. All participants provided written informed consent. Females were excluded if they were at increased risk of EOC due to family history of breast or ovarian cancer, if they were known carriers of EOC predisposing PVs, or if they had self-reported previous bilateral oophorectomy, ovarian malignancy or active non-ovarian malignancy. Two follow-up questionnaires were administered, the first 3-5 years post-randomisation and the second in 2014 [23,24]. Notifications of cancer diagnoses and deaths were received through NHS Digital for the females residing in England and through the Northern Ireland Cancer Registry and Central Services Agency for those residing in Northern Ireland. For females who developed EOC, medical notes were retrieved and independently reviewed by an Outcomes Review Committee, which assigned histological subtype, stage and grade.
For the present study, a nested case-control design was adopted. Cases were defined as females diagnosed with incident invasive epithelial ovarian or fallopian tube cancers or primary peritoneal cancer. To exclude potentially prevalent cases at recruitment, we predicted the risk of EOC from the age at recruitment "plus one year" and excluded samples with follow-up time less than one year. Two random controls were selected per case, matched on regional centre, age at randomisation and year at recruitment [24]. Participants who had a previous cancer diagnosis except for breast and non-melanoma skin cancer before recruitment were excluded.

Model Discrimination
The model discrimination was assessed by area under the ROC curve (AUC) and Harrell's C index [25]. The AUC was estimated as the weighted probability that the predicted risks for cases outrank the risks for controls. Suppose $d$ is the EOC indicator (i.e., $d = 1$ for cases and $d = 0$ for controls); then, for any case-control pair of individuals $i$ and $j$,
$$\mathrm{AUC} = P_{\mathrm{weighted}}(P_i > P_j \mid d_i = 1, d_j = 0) = \frac{\sum_i \sum_j w_i w_j \, I(P_i > P_j) \, I(d_i = 1, d_j = 0)}{\sum_i \sum_j w_i w_j \, I(d_i = 1, d_j = 0)},$$
where $P_i$ is the predicted risk for individual $i$, $I$ denotes the indicator function and $w_i$ is the weight for individual $i$.
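The weighted AUC described above can be sketched as a direct sum over case-control pairs. This is a minimal illustration, not the authors' code: the function name and the example data are hypothetical, each pair is assumed to be weighted by the product of the two individual weights, and tied risks are counted as not outranking, following the strict inequality in the definition.

```python
def weighted_auc(risks, status, weights):
    """Weighted probability that a case's predicted risk exceeds a control's.

    risks:   predicted risks
    status:  1 for cases, 0 for controls
    weights: individual sampling weights (ASSUMED to combine as a product
             within each case-control pair)
    """
    numerator = 0.0
    denominator = 0.0
    for r_case, d_case, w_case in zip(risks, status, weights):
        if d_case != 1:
            continue
        for r_ctrl, d_ctrl, w_ctrl in zip(risks, status, weights):
            if d_ctrl != 0:
                continue
            pair_weight = w_case * w_ctrl
            denominator += pair_weight
            if r_case > r_ctrl:  # strict inequality: ties count as 0
                numerator += pair_weight
    if denominator == 0.0:
        raise ValueError("need at least one case and one control")
    return numerator / denominator


# Hypothetical data: two cases and two controls.
risks = [0.9, 0.8, 0.3, 0.1]
status = [1, 0, 1, 0]
auc_equal = weighted_auc(risks, status, [1, 1, 1, 1])
auc_weighted = weighted_auc(risks, status, [2, 1, 1, 1])
```

The double loop is quadratic in sample size, which is adequate for a nested case-control set; a rank-based implementation would be preferable for large cohorts.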

Pedigree construction
The baseline questionnaire collected information on whether the mother was diagnosed with ovarian or breast cancer and the number of daughters, number of grandmothers, number of granddaughters, number of sisters and number of aunts diagnosed with ovarian or breast cancer. Based on these, we constructed a pedigree for each case and control that included information on first- and second-degree relatives. The size of each nuclear family within each pedigree was determined by randomly sampling from the cohort-specific distribution of family sizes for the UK [26]. Using the information on the year of birth and age of the proband, reported at baseline, we randomly assigned each family member a year of birth and age at last observation under the following assumptions: (a) the age gap between successive generations ranged from 18 to 45 years with a mean age gap equal to 25 years; (b) spouses were assumed to have the same age; (c) vital status and age at last follow-up were assigned based on cohort-specific life expectancy tables for England and Wales [27]. For simplicity, we assumed that the reported breast and ovarian cancers in relatives occurred in different family members. We assumed that all affected individuals came from the maternal side and assigned an age at ovarian/breast cancer diagnosis by sampling from the cohort-specific probability based on the population incidences of ovarian or breast cancer for the UK [28].
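Assumptions (a) and (b) above can be illustrated with a short sampling sketch. The text gives only the range (18 to 45 years) and the mean (25 years) of the generation gap, not its distribution, so the scaled Beta(1.4, 4.0) distribution used here is purely an assumption chosen because it matches that range and mean. The function names are hypothetical.

```python
import random

MIN_GAP, MAX_GAP = 18.0, 45.0
# ASSUMED shape parameters: mean = 18 + 27 * 1.4 / (1.4 + 4.0) = 25 years.
BETA_A, BETA_B = 1.4, 4.0


def sample_generation_gap(rng):
    """Draw an age gap between successive generations, in years."""
    return MIN_GAP + (MAX_GAP - MIN_GAP) * rng.betavariate(BETA_A, BETA_B)


def assign_parent_birth_years(child_birth_year, rng):
    """Return (mother, father) birth years; spouses share the same age."""
    gap = round(sample_generation_gap(rng))
    mother = child_birth_year - gap
    father = mother  # assumption (b): spouses have the same age
    return mother, father


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
mother_year, father_year = assign_parent_birth_years(1950, rng)
```

Applying `assign_parent_birth_years` again to a parent's birth year would give the grandparents' birth years, and the same gap sampler can be run in the other direction for daughters and granddaughters.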

Validation PRS
As the validation set had not been genotyped for all the common variants in Table s1, a PRS was constructed using the set of variants given in Table s5.
Figure panels (a)-(d) show the risk based on the PV status of the mother, assuming the proband tests negative for the PV the mother carries. In the case of an "untested" mother, the proband is assumed to test negative for all genes. In the case of "FH Only", neither the proband nor her mother is tested. Screening test sensitivities are set to 1.0 for all genes. The corresponding values are given in Table s6.
The values for Figure s1 are based on the screening test results for pathogenic variants (PVs) in BRCA1, BRCA2, RAD51D, RAD51C and BRIP1, where the proband tests negative for PVs in all genes and her mother tests positive for a PV in the indicated gene. In the case of "untested", the proband tests negative for all genes, while the mother is untested. In the case of "FH Only", neither the proband nor her mother is tested. Each Figure is based on the same family structure, but with an increasing number of affected first-degree relatives, as indicated by the inset pedigree diagrams in Figure s1.