Original research
Prevalence and clinical implications of germline pathogenic variants in cancer predisposing genes in young patients across sarcoma subtypes
  1. Nathalia de Angelis de Carvalho1,
  2. Karina Miranda Santiago1,
  3. Joyce Maria Lisboa Maia2,
  4. Felipe D’Almeida Costa3,
  5. Maria Nirvana Formiga4,
  6. Diogo Cordeiro de Queiroz Soares4,
  7. Daniele Paixão4,
  8. Celso Abdon Lopes de Mello2,
  9. Cecilia Maria Lima da Costa5,
  10. José Claudio Casali da Rocha4,
  11. Barbara Rivera6,7,
  12. Dirce Maria Carraro1,8,
  13. Giovana Tardin Torrezan1,8
  1. 1Clinical and Functional Genomics Group, ACCamargo Cancer Center, Sao Paulo, São Paulo, Brazil
  2. 2Clinical Oncology Department, ACCamargo Cancer Center, Sao Paulo, Brazil
  3. 3Department of Anatomic Pathology, ACCamargo Cancer Center, Sao Paulo, Brazil
  4. 4Oncogenetics Department, ACCamargo Cancer Center, Sao Paulo, Brazil
  5. 5Pediatric Oncology Department, ACCamargo Cancer Center, Sao Paulo, Brazil
  6. 6Molecular Mechanisms and Experimental Therapy in Oncology Program, IDIBELL, Barcelona, Spain
  7. 7Gerald Bronfman Department of Oncology, McGill University, Montreal, Québec, Canada
  8. 8National Institute of Science and Technology in Oncogenomics and Therapeutic Innovation, Sao Paulo, Brazil
  1. Correspondence to Dr Giovana Tardin Torrezan, Clinical and Functional Genomics Group, ACCamargo Cancer Center, Sao Paulo 01509-900, Brazil; giovana.torrezan{at}


Background Sarcomas are a rare and diverse group of cancers occurring mainly in young individuals for which an underlying germline genetic cause remains unclear in most cases.

Methods Germline DNA from 177 children, adolescents and young adults with soft tissue or bone sarcomas was tested using multigene panels with 113 or 126 cancer predisposing genes (CPGs) to describe the prevalence of germline pathogenic/likely pathogenic variants (GPVs). Subsequent testing of a subset of tumours for loss of heterozygosity (LOH) evaluation was performed to investigate the clinical and molecular significance of these variants.

Results GPVs were detected in 21.5% (38/177) of the patients (15.8% in children and 21.6% in adolescents and young adults), with dominant CPGs being altered in 15.2% overall. These variants were found in genes previously associated with the risk of developing sarcomas (TP53, RB1, NF1, EXT1/2) but also in genes where that risk is still emerging/limited (ERCC2, TSC2 and BRCA2) or unknown (PALB2, RAD50, FANCM and others). The detection rates of GPVs varied from 0% to 33% across sarcoma subtypes and GPV carriers were more likely to present more than one primary tumour than non-carriers (21.1% vs 6.5%; p=0.012). Loss of the wild-type allele was detected in 48% of tumours from GPV carriers, mostly in genes definitively associated with sarcoma risk.

Conclusion Our findings reveal that a high proportion of young patients with sarcomas presented a GPV in a CPG, underscoring the urgency of establishing appropriate genetic screening strategies for these individuals and their families.

  • genetic predisposition to disease
  • genetic variation
  • neoplasms

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



  • Detecting germline pathogenic variants (GPVs) in cancer predisposing genes (CPGs) is highly relevant in current clinical practice, both for genetic counselling purposes and treatment decisions.

  • For most sarcoma subtypes, the extent and clinical significance of germline alterations in CPGs are not fully known.


  • In our cohort, one-fifth of patients with sarcomas under 40 years old had a GPV in a CPG.

  • Detection rates differed considerably across sarcoma subtypes, and loss of heterozygosity was more frequent in known sarcoma-associated genes.

  • This is the most comprehensive study regarding sarcoma high-risk germline variants in a diverse and under-represented population.


  • Current recommendations for genetic screening would miss a substantial number of patients with sarcoma with GPVs.

  • Genetic screening in patients with sarcoma, particularly in younger patients, should be considered even in the absence of family history or clinical signs associated with a syndrome.


Sarcoma is a group of rare mesenchymal tumours that predominantly affect younger populations, accounting for 20% of tumours in children, 10% in adolescents and young adults1 and only 1% of malignant tumours in adults.2 Sarcomas encompass more than 100 histological types/subtypes, exhibiting marked clinical heterogeneity3 and diverse somatic drivers, such as specific gene fusions in particular histological subtypes. Regarding risk factors, it is known that prolonged exposure to ionising radiation increases the risk of sarcomas, especially bone tumours.4 5 Among heritable factors, individuals with specific hereditary cancer predisposition syndromes (HCPSs) carrying germline pathogenic/likely pathogenic variants (GPV) in tumour suppressor genes (TSGs) seem to have an increased risk of developing sarcomas, particularly when exposed to radiation.6–8 Furthermore, some specific HCPSs are known to be associated per se with a higher risk of developing sarcomas, such as Li-Fraumeni syndrome, hereditary retinoblastoma and neurofibromatosis type 1.9

The recent expansion of genetic screening and genomic characterisation of tumours has provided evidence of a high frequency of GPVs in patients with sarcoma, ranging from 13.6% to 28%, even in unselected cohorts and in the absence of family history of cancer.5 10 Additionally, recent studies focused on reviewing tumour syndromes or searching for germline variants related to specific sarcoma histologies, such as osteosarcomas and rhabdomyosarcomas, have pinpointed the relevance of GPVs in cancer predisposing genes (CPGs) in sarcoma aetiology.11–14 However, due to the rarity of both sarcomas and most HCPS, establishing gene-disease associations is challenging, and there may be undiscovered associations, especially in populations under-represented in genomic studies.

Here, we aimed to determine the prevalence of GPVs associated with sarcoma development in young Brazilian patients. We identified that one in five patients with sarcomas under 40 years old presented a GPV, with variable GPV rates across sarcoma subtypes and specific gene-subtype associations. Moreover, we conducted molecular analysis of tumours and analysed a matched control population to provide more evidence of gene-disease associations.


Patient cohort

A total of 177 patients with sarcomas under 40 years old from A.C.Camargo Cancer Center (ACCCC) were included in this study. From 2018 to 2021, 36 patients were included prospectively by the oncogenetics department, the paediatric oncology department or the clinical oncology department. Additionally, we included 141 retrospective patients with blood stored in the institutional Biobank, prioritising patients with osteosarcomas and chondrosarcomas. All study participants provided written informed consent, either specific for the study or for Biobank collection. Genetic testing results were returned to the patients in a post-test genetic counselling session.

Germline DNA extraction

Genomic DNA was extracted from peripheral blood leucocytes at the ACCCC Biobank using QiaSymphony equipment (Qiagen, NRW, Germany). For prospective cases, DNA was obtained from saliva using the purification protocol of the PrepIT-L2P kit (PT-L2P) (DNA Genotek, ON, Canada). Genomic DNA quality and quantity were evaluated using NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Massachusetts, USA) and Qubit 4 Fluorometer (Thermo Fisher Scientific), following the manufacturer’s protocol.

Germline sequencing analyses

Next-generation sequencing (NGS) of the multigene panel

A total of 110 patients were individually evaluated by NGS using a customised panel of 113 or 126 CPGs (online supplemental table 1). Genes were selected based on previous reports from commercial panels of clinical laboratories15 and through the panelapp database.16 Briefly, library construction and gene capture were performed with the QIAseq FX DNA Library Kit (Qiagen) or the Lotus DNA Library Prep kit (IDT Technologies) and custom biotinylated probes (xGen Gene Capture Pools–IDT Technologies). Sequencing was performed on the NextSeq 500 platform (Illumina) to obtain a minimum coverage of 30× in 99%–100% of the target bases.


Pooled sample screening with the multigene panel

To screen for GPVs in a larger number of individuals, we applied a two-dimensional DNA pooling strategy for 67 patients, following the methodology described by Zuzarte et al.17 The DNA from samples was individually distributed in 96-well plates and equimolar pooling of the DNA was carried out in two dimensions, considering the rows and columns of a 96-well plate, generating 8 row pools and 12 column pools for each plate. The pooled samples were subsequently sequenced using the 113/126 gene panels, as described for the individual samples. This two-dimensional strategy identifies the exact individual carrying a rare variant: when a variant is reported in exactly one row pool and one column pool, only the sample at their intersection can carry it. A variant was considered present when the variant allele was detected in >1% of the reads, at a minimum coverage depth of 100×, in at least one column pool and one row pool.
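The deconvolution logic described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the pool labels (rows A to H, columns 1 to 12) and the input layout are hypothetical, and only the two thresholds (>1% of reads, depth of at least 100×) come from the text. Assuming a full plate, equimolar pooling and a heterozygous carrier, the variant would be expected at roughly 1/24 (about 4.2%) of reads in its row pool and 1/16 (6.25%) in its column pool, both above the 1% cut-off.

```python
from itertools import product

ROWS = "ABCDEFGH"      # 8 row pools per 96-well plate (labels are hypothetical)
COLS = range(1, 13)    # 12 column pools per 96-well plate
MIN_DEPTH = 100        # minimum coverage depth stated in the text
MIN_VAF = 0.01         # variant allele must be seen in >1% of reads


def pool_is_positive(alt_reads, depth):
    """Apply the depth and allele-fraction thresholds to a single pool."""
    return depth >= MIN_DEPTH and alt_reads / depth > MIN_VAF


def candidate_wells(row_calls, col_calls):
    """Return the wells lying at intersections of positive row and column pools.

    row_calls and col_calls map a pool label to (alt_reads, depth) for one variant.
    """
    pos_rows = [r for r in ROWS if r in row_calls and pool_is_positive(*row_calls[r])]
    pos_cols = [c for c in COLS if c in col_calls and pool_is_positive(*col_calls[c])]
    return [f"{r}{c}" for r, c in product(pos_rows, pos_cols)]


# Illustrative read counts only (not data from the study)
wells = candidate_wells({"C": (40, 1200)}, {7: (55, 1500)})
```

With one positive row pool and one positive column pool the result is a single well. If a variant is positive in two rows and two columns, four wells are returned and the carriers cannot be resolved without individual confirmation, which is why the approach suits rare variants.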

Variant selection and interpretation

NGS reads were processed using the Isaac Enrichment V.2.1 tool (Illumina) for alignment and variant calling. Variants were annotated and filtered using VarSeq software (Golden Helix). Specific filters were applied to select rare germline small variants: quality ≥30; base coverage ≥30× and variant allele frequency ≥0.25 for individual sequencing, or base coverage ≥100× and variant allele frequency ≥0.01 for pool sequencing; minor allele frequency (MAF) ≤0.01 or absent in population databases (GnomAD, ESP, 1000genomes and ABraOM); and coding variants or intronic variants within 10 base pairs of exon boundaries. Filtered variants were evaluated for their classification in clinical databases and automated classification tools (ClinVar, VarSome and Franklin Genoox) and were classified according to the American College of Medical Genetics and Genomics recommendations.18 Copy number variant (CNV) detection for individually sequenced samples was performed using VarSeq software (Golden Helix). Briefly, we applied standard software parameters for stringent CNV calling, considering as positive CNV calls regions with a p value <0.001 and a Z score greater than 4 for duplications or less than −4 for deletions. The 113 and 126 panels and described pipelines were previously validated to detect single nucleotide variants, indels and CNVs. All identified pathogenic/likely pathogenic variants (GPVs) were confirmed using amplicon sequencing or a second multigene panel sequencing (for one case with a CNV in the RB1 gene).
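The small-variant filters listed above can be written as a single predicate. This is a sketch for illustration, not the VarSeq configuration used in the study: the `Variant` record and its field names are hypothetical, and it assumes that the quality threshold of 30 applies to both individual and pooled sequencing, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical record holding the fields needed by the filters."""
    quality: float
    depth: int
    vaf: float                       # variant allele frequency in the sample
    population_maf: Optional[float]  # None when absent from population databases
    is_coding: bool
    distance_to_exon: int            # bp from nearest exon boundary (0 if exonic)


def passes_filters(v, pooled=False):
    """Return True if the variant meets the thresholds described in the text."""
    # Depth and allele-fraction cut-offs differ for individual vs pooled runs
    min_depth, min_vaf = (100, 0.01) if pooled else (30, 0.25)
    rare = v.population_maf is None or v.population_maf <= 0.01
    in_region = v.is_coding or v.distance_to_exon <= 10
    # Assumption: quality >= 30 is required in both sequencing modes
    return (v.quality >= 30 and v.depth >= min_depth
            and v.vaf >= min_vaf and rare and in_region)
```

A heterozygous call at 40× depth with an allele fraction near 0.5 would pass in individual mode but not in pooled mode, where the depth requirement is higher and the expected allele fraction far lower.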

XAF1 p.Glu134Ter variant

Germline DNA samples from TP53 p.Arg337His carriers were evaluated to identify the XAF1 p.Glu134Ter, using an NGS amplicon sequencing approach previously described.19 20 This analysis was carried out since XAF1 p.Glu134Ter was recently described to increase cancer risks in p.Arg337His carriers.21

Loss of heterozygosity (LOH) analyses

For the LOH analysis, tumour tissues (frozen or formalin-fixed paraffin embedded (FFPE)) with a minimum of 40% tumour cells were selected, as per the pathologists’ assessment. Genomic DNA was extracted at the ACCCC Biobank, according to standard procedures. PCR and amplicon library construction were performed following the manufacturer’s protocol of the Ion Plus Fragment Library kit (Thermo Fisher Scientific), and sequencing was performed in Ion Proton or S5 platforms (Thermo Fisher Scientific). The allele frequency of the variants was visually verified with IGV software,22 and loss of the wild-type allele was considered positive when the variant allele fraction (VAF) in the tumour was higher than 65% or 15% higher than the VAF in germline DNA.23 24 This method was previously validated by our group and detected LOH in 90% of breast cancer tissues from BRCA1 GPV carriers (data not shown).
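The LOH criterion above amounts to a two-part threshold, sketched below in Python. The function name and defaults are hypothetical, and the sketch assumes that "15% higher" means an absolute difference of 15 percentage points between tumour and germline VAF; the text could also be read as a relative increase.

```python
def wild_type_allele_lost(tumour_vaf, germline_vaf,
                          abs_cutoff=0.65, delta_cutoff=0.15):
    """Flag loss of the wild-type allele from variant allele fractions (0 to 1).

    Positive when the tumour VAF exceeds 65%, or exceeds the germline VAF
    by more than 15 percentage points (assumed absolute difference).
    """
    return (tumour_vaf > abs_cutoff
            or (tumour_vaf - germline_vaf) > delta_cutoff)
```

Under this reading, a tumour VAF of 0.62 against a germline VAF of 0.45 would count as LOH through the second condition even though it falls below the 65% cut-off, whereas 0.55 against 0.50 would not.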

Transcript analysis

We performed gene-specific transcript analysis for one case with an intronic likely pathogenic variant in the EXT1 gene. Transcript analysis was performed as previously described by our group.25 Briefly, RNA was extracted from leucocytes for the patient and control samples at the ACCCC Biobank. cDNA was amplified with primers located two exons upstream and two exons downstream of the variant (exons 6 and 10). Amplicons were sequenced in Ion Proton (Thermo Fisher Scientific), and reads were mapped using CLC Genomics workbench (Qiagen) and STAR. A sashimi plot was generated in IGV software.

Control cohort

We compared the frequency of GPV identified in our patient cohort to the population frequency of GPV in the same genes using a publicly available dataset of Brazilian individuals (ABraOM database). The ABraOM SABE-WGS-1171 dataset contains whole-genome sequencing from 1171 unrelated individuals obtained from a census-based sample of elderly individuals from São Paulo, Brazil.26 We evaluated the 19 genes in which we detected GPV and classified the ABraOM variants following the same criteria described for the patients.

Statistical analysis

Descriptive statistics were used to summarise the results. We used Pearson’s χ2 or Fisher’s exact test for binary variables and the Mann-Whitney test for continuous variables to compare the clinical characteristics between patients with or without GPVs. Two-tailed p values <0.05 were considered significant. For the comparison of patients and control cohorts, Fisher’s exact test was used.
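For readers who want to reproduce a 2×2 comparison of this kind, a two-sided Fisher's exact test can be sketched with the Python standard library alone. This is not the software used in the study, which is not named; the function name is hypothetical, and the two-sided p value is computed by summing the probabilities of all tables no more likely than the observed one, a common but not the only convention. The example counts (8/38 carriers and 9/139 non-carriers with more than one primary tumour) are back-calculated from the reported 21.1% and 6.5% and should be treated as an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is as probable as, or less probable than, the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Rows: GPV carriers, non-carriers; columns: >1 primary tumour, single primary
# Counts are back-calculated from the reported percentages (assumption)
p_value = fisher_exact_two_sided(8, 30, 9, 130)
```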


Germline analysis and clinical features

During the study period, we included 177 patients with sarcoma (36 prospective and 141 retrospective patients). The cohort’s mean age was 26 years, and patients were divided into two groups based on the age of onset: 0–14 years (children—C) and 15–40 years (adolescents and young adults—AYA), comprising 20.3% and 79.7% of the patients, respectively. Almost 10% of the patients had more than one primary tumour, and over 40% had a positive family history of cancer reported in medical records.

We compared the clinical features (sex, other primary tumours and family history of cancer) between the C and AYA groups, with no significant difference found (online supplemental table 2). Regarding sarcoma histological subtypes, we observed known expected differences between the age groups, with chondrosarcomas, liposarcomas, synovial sarcomas and leiomyosarcomas being more common in AYAs, while rhabdomyosarcoma was more frequent in children (online supplemental figure 1). It is important to note that we enriched the selection of individuals with osteosarcomas and chondrosarcomas from our Biobank, consequently increasing the frequency of both tumours in our cohort, compared with other subtypes.


Regarding germline and molecular analyses, figure 1 depicts the general workflow of the study. Overall, 177 patients underwent our NGS panel analysis (110 in individual panel sequencing and 67 in a pooled panel sequencing strategy), and 38 (21.5%) patients presented one or more GPV (figure 2 and online supplemental table 3). The most frequently mutated genes were TP53 (3.9%), CHEK2 (2.3%), BRCA2 (2.3%), and NF1, RB1, MUTYH, SLX4 (1.7% each) (figure 2). In addition to the TP53 p.Arg337His variant, three other identical GPVs were detected in more than one patient: three patients had MUTYH p.Gly368Asp; two patients had RB1 p.Arg251Ter and two patients had MITF p.Glu318Lys. The MUTYH and MITF variants are relatively frequent in the population, with MAFs of 0.3% and 0.1% in GnomAD and 0.7% and 0.04% in ABraOM, respectively. Variants of uncertain clinical significance (VUS) were detected in 51% of the patients. Of 178 unique VUS, 22 were detected in autosomal dominant cancer genes and were considered VUS of interest due to in silico prediction of pathogenicity (REVEL >0.6) or to Franklin Genoox classification as P/LP (online supplemental table 5).

Figure 1

Overview of the study experimental design, methods and main findings. GPV, germline pathogenic variant; LOH, loss of heterozygosity; VUS, variant of uncertain clinical significance.

Figure 2

Overview of GPV detected in 177 patients with sarcoma. Sarcomas were divided by histological subtype between bone tumours and soft tissue sarcomas. On the left, the genes included in the customised panels are represented and divided according to their association with sarcoma predisposition. # XAF1 is not included in our panel, but alterations in this gene were investigated in patients with TP53 p.Arg337His. On the right, the percentages represent the number of patients detected with GPV in the indicated gene, and at the bottom, the types of alterations are described. The results of tumour LOH analyses are also depicted. Patients are coloured according to age group. Notes: *Recessive genes: all patients were monoallelic heterozygotes. #Patient diagnosed with multiple subtypes of sarcomas (osteosarcoma, leiomyosarcoma and liposarcoma). ASPS, alveolar soft part sarcoma; AYA, adolescents and young adults; C, children; CS, chondrosarcoma; DFSP, dermatofibrosarcoma protuberans; ES, Ewing sarcoma; FDCS, follicular dendritic cell sarcoma; GPV, germline pathogenic variant; LMS, leiomyosarcoma; LOH, loss of heterozygosity; LS, liposarcoma; MFS, myxofibrosarcoma; MPNST, malignant peripheral nerve sheath tumours; NOS, not otherwise specified; OS, osteosarcoma; SFT, solitary fibrous tumour; SS, synovial sarcoma; WT, wild type.

We classified the evaluated CPGs according to their evidence of being related to sarcoma predisposition, as described at The Gene Curation Coalition27 and in the literature review (online supplemental table 1). We identified variants in genes previously associated with an increased risk of sarcoma development, such as TP53, CHEK2, NF1, RB1, EXT1 and EXT2, as well as in genes with emerging/limited associations, such as BRCA2, ERCC2 and ERCC4, or unknown associations, such as MUTYH, SLX4, MITF, AKT1, ERCC3, FANCM, RAD50 and PALB2 (figure 2).

Considering only the definitive and emerging/limited genes, 14.7% (26/177) of patients had a GPV, while 15.2% (27/177) of patients had a GPV in dominant genes associated with high or moderate penetrance cancer predisposition syndromes. Most genes with unknown associations with sarcoma represent genes responsible for recessive cancer predisposing syndromes, and all patients detected with variants in recessive genes had monoallelic heterozygous variants. Of all patients, three presented GPV in more than one gene: one patient with liposarcoma had GPV in PALB2 and MITF, one patient with alveolar sarcoma had alterations in MUTYH and MITF, and one patient with osteosarcoma had GPV in TP53 and SLX4 (figure 2 and online supplemental table 3).

Five patients (3%; 5/177) presented the TP53 p.Arg337His variant, a founder variant frequent in the Brazilian population. We evaluated the XAF1 p.Glu134Ter variant in these patients, as a haplotype harbouring both variants was recently described to increase the risk of sarcomas and other primary tumours in p.Arg337His carriers.21 Four of the five TP53 p.Arg337His carriers were positive for XAF1 p.Glu134Ter, with three of them having a personal and familial history of cancer that fulfilled the 2015 Chompret Li-Fraumeni clinical criteria and none fulfilling the classical Li-Fraumeni criteria (table 1).

Table 1

Clinical and molecular features of five patients harbouring TP53 p.Arg337His

Clinical data were compared between GPV carriers and non-carriers, and the only significant association was that GPV carriers were more likely to present more than one primary tumour (p=0.012) (table 2, online supplemental tables 3 and 4). Next, we compared the GPV detection rate across sarcoma subtypes and revealed that certain subtypes had higher frequencies of GPV, such as spindle cell sarcoma/malignant peripheral nerve sheath tumours (MPNST) (33.3%), chondrosarcoma (30.4%) and leiomyosarcoma (27.8%) (figure 3A). The subtypes with a lower detection of GPV were fibrosarcoma (0%), rhabdomyosarcoma (9.1%) and synovial sarcoma (10.5%). Likewise, we also observed some expected gene-subtype associations. For instance, we detected EXT1 and EXT2 GPV only in chondrosarcomas and RB1 GPV mostly in patients with osteosarcomas (one patient was diagnosed with osteosarcoma, leiomyosarcoma and liposarcoma). Additionally, there was an enrichment of TP53 GPV in leiomyosarcomas and of NF1 in MPNST.

Table 2

Clinical features of 177 young patients with sarcomas with and without GPV

Figure 3

(A) Rates of germline pathogenic variants (GPVs) according to sarcoma subtype. The light blue bars show the total number of cases of each sarcoma subtype and the dark blue bars show the percentage of GPVs identified in each subtype. The spindle cell sarcoma group includes two malignant peripheral nerve sheath tumours (MPNSTs), since both MPNSTs were initially diagnosed as spindle cell sarcomas in the routine diagnosis and were only reclassified as MPNSTs after we identified the NF1 GPVs and performed a pathological review integrating the germline findings. (B) Frequency of GPVs in genes found in this cohort (177 patients with sarcoma) compared with a cohort of healthy elderly individuals (1171 Brazilian individuals from the ABraOM database). P values are shown for Fisher’s exact test, with significant differences (p<0.05) displayed in bold.

Cases with expected non-cancer phenotypes

Some of the genes identified with GPV are expected to cause non-cancer phenotypes that represent diagnostic criteria for the associated syndromes. For instance, NF1 carriers usually present multiple cutaneous neurofibromas, multiple café au lait macules, intertriginous freckling and other features; TSC2 patients can present multiple abnormalities of the skin, brain, kidney, heart and lungs; EXT1/2 patients are expected to present multiple osteochondromas, which are benign cartilage-capped bone tumours. In this study, we identified seven patients with GPV in these genes and investigated their clinical phenotypes (online supplemental table 3).

Of the three NF1 carriers identified, only one had a prior diagnosis of neurofibromatosis type 1 (GRY_47), with multiple neurofibromas detected since the age of 11–15 years. The other two patients were followed in our institution for 4 years (2008–2011), and no other clinical sign of NF1 was described in medical records (although neither was seen by a geneticist).

One patient with TSC2 GPV had a late diagnosis of tuberous sclerosis (TS) (GRY_42). This patient developed three primary tumours: a sarcoma (not otherwise specified) at age 11–15, a rectal adenocarcinoma at age 26–30 and Xp11 translocation renal cell carcinoma at age 36–40. He also had West syndrome in infancy and colonic polyposis in his 30s. The clinical diagnosis of TS occurred late, at 36–40 years old, following a dermatological evaluation of facial skin lesions.

Regarding the EXT1/2 genes, three patients with chondrosarcoma presented GPV in these genes, and none of them had previously been diagnosed with hereditary multiple osteochondromas, previously called hereditary multiple exostoses. One of the cases (GRY_78, diagnosed with chondrosarcoma at age 36–40) reported having an osteochondroma removed in her first decade of life and had a GPV in EXT1. Another patient (GRY_08, GPV in EXT2) was diagnosed with low-grade chondrosarcoma at age 26–30, and no other tumours were reported. The third patient (GRY_52, diagnosed with chondrosarcoma in her early 50s, with her father diagnosed with a chondrosarcoma in his early 20s) was later diagnosed with multiple osteochondromas in the lower limbs, detected by imaging examinations that were requested after the results of the genetic testing showed a likely pathogenic variant in EXT1. This patient’s variant was located at the splice donor site of intron 8 (c.1722+1G>C), and to confirm the effect of this variant on splicing, we performed sequencing of EXT1 transcripts from blood. This analysis showed complete skipping of exon 8, leading to an inframe deletion of 30 amino acids (r.1633_1722del; p.Met546_Val575del) from the glycosyl transferase domain (online supplemental figure 2). Based on this finding and the specific phenotype of the patient, the variant was reclassified as pathogenic.

Loss of heterozygosity (LOH)

To investigate the occurrence of somatic second hits, we assessed LOH in fresh frozen or FFPE tumours of 23 patients harbouring GPVs and 7 patients harbouring prioritised VUS. Among GPV carriers, we observed LOH of the wild-type allele in 47.8% (11/23) of the cases. Ten of the LOH cases were from patients with GPV in genes definitively associated with sarcomas, such as TP53 (5 out of 6 analysed tumours had LOH), NF1 (3/3), RB1 (1/2) and EXT1 (1/1). Only one LOH case was observed in a gene not previously associated with sarcomas (MITF, 1/1) (figure 2 and online supplemental table 3). Moreover, we also searched for LOH in XAF1, and all three evaluated tumours with the TP53-XAF1 haplotype presented LOH of both TP53 and XAF1.

In total, considering the frequency of LOH according to the level of association between CPG and sarcoma risk, LOH of the wild-type allele was detected in 62.5% of the tested variants from genes with known association (10/16), in 0% of variants from genes with emerging/limited association (0/3) and in 16.7% of variants from genes of unknown association (1/6). For one CHEK2 GPV carrier, we observed loss of the pathogenic variant allele. Additionally, we selected 7 of the 22 prioritised VUS for which we had tumour tissues to perform LOH analysis, and all tumours were negative for LOH (online supplemental table 5).
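The proportions quoted above follow directly from the stated counts, as the short Python check below shows. The dictionary keys and function name are hypothetical labels; the counts are those given in the text, taken per tested variant rather than per patient.

```python
# LOH-positive and tested variant counts per gene-sarcoma association level
loh_counts = {
    "known association": (10, 16),
    "emerging/limited association": (0, 3),
    "unknown association": (1, 6),
}


def loh_rate(positive, tested):
    """Percentage of tested variants showing loss of the wild-type allele."""
    return round(100 * positive / tested, 1)


rates = {level: loh_rate(*counts) for level, counts in loh_counts.items()}
```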

Frequency of GPV compared with healthy individuals

We compared the frequency of P/LP variants identified in our patient cohort to the frequency of these alterations in a control Brazilian population cohort obtained from the ABraOM database (figure 3B and online supplemental table 6). The frequency of GPV detected in the sarcoma cohort was significantly higher in TP53, RB1 and NF1 (genes definitively associated with sarcomas), as well as in two genes without established associations with these tumours, namely BRCA2 and SLX4. Moreover, p values close to the significance level were observed for CHEK2 and MITF.


Our study showed that 21.5% of unselected young Brazilian patients with sarcomas carry GPV in CPGs, with 15.2% of patients presenting GPVs in dominant genes associated with high or moderate penetrance cancer predisposition syndromes. We found GPV in genes already known to be related to sarcomas, but also in genes with an unknown association with sarcomas, such as PALB2, MUTYH, SLX4, MITF, AKT1, ERCC3, FANCM and RAD50. Considering only the genes with a well-established or emerging/limited association with sarcoma risk, 14.7% of patients had GPV. Moreover, an additional 12.4% of the patients had VUS considered damaging by in silico prediction tools in an autosomal-dominant CPG. In our case-control comparison, we confirmed the enrichment for known sarcoma predisposing genes TP53, NF1 and RB1 in patients with sarcoma, as well as for BRCA2, a gene with emerging evidence for sarcoma risks,5 28 and the previously unrelated gene SLX4.

Overall, our findings are consistent with recent studies conducted in sarcoma cohorts of distinct age ranges and tumour types, which have found an average of 20% GPV in patients with sarcoma.5 14 29 In their seminal study, Ballinger et al5 observed that 19% of the 1162 unselected patients with sarcomas (3–93 years old) harboured GPV in 72 CPGs, and almost 50% harboured pathogenic and putative (in silico predicted) pathogenic variants. Moreover, compared with a control population, there was an excess burden of putative GPV in known sarcoma genes, such as TP53 but also in ATM, ATR, BRCA2 and ERCC2.5 In another multigene panel study evaluating 52 CPGs, 13.6% of 66 patients with sporadic sarcomas under 50 years old carried a GPV in genes related and not known to be related to sarcomas (ATM, BRCA2, ERCC4, FANCC, FANCE, FANCI, MSH6, POLE, SDHA and TP53).29 Mirabello et al14 performed exome sequencing of 1244 individuals (2–80 years old) with osteosarcomas and found 28% of them with GPV. In this study, they also performed a variant burden comparison with a control cohort and identified novel candidate genes, such as CDKN2A, MEN1, VHL, POT1, APC and ATRX.14 Lastly, in a very recent case-control study, Gillani et al28 evaluated 141 CPGs in whole genome and exome germline data from 1147 paediatric patients with sarcoma (enriched for Ewing sarcoma, osteosarcoma and rhabdomyosarcoma) and more than 10 000 controls. By testing both pansarcoma and subtype-specific associations, they confirmed the role of genes previously implicated in sarcoma pathogenesis (TP53, DICER1, RB1, RECQL4, NF1) and found evidence of association for genes without substantial prior evidence of sarcoma risk (FANCC, FANCA, PTPN11, RECQL, MUTYH, BRCA2, SDHD).28 In addition, other case reports also described CPGs not previously related to sarcomas and other tumours, such as POT130 and BAP1.31

Our study, although underpowered, provides valuable insight into the genetic predisposition within sarcoma subtypes. Our findings show qualitative gene-subtype associations and variable GPV detection rates across sarcoma subtypes, with GPV frequencies ranging from 0% to 33%. For instance, well-known gene-subtype associations were observed in our series, as all RB1 carriers presented osteosarcomas (one with metachronous leiomyosarcoma and liposarcoma), all EXT1/2 carriers developed chondrosarcomas and two out of three patients with NF1 GPV were diagnosed with MPNST. Additionally, TP53 p.Arg337His was more prevalent in patients with leiomyosarcomas, confirming previous descriptions in the Brazilian population32 and the distinct behaviour of this variant compared with other TP53 variants, since leiomyosarcoma corresponds to only 9.1% of the sarcomas in other TP53 GPV carriers.33

The TP53 p.Arg337His variant is a known Li-Fraumeni syndrome (LFS) causing variant associated with a varied risk and penetrance spectrum for different tumours, with a phenotype that usually differs from classic LFS.34 It is considered a founder variant in Brazil, as its frequency in this population is higher (MAF=0.001)34 than that in the worldwide population (MAF gnomAD v2.1.1=0.000012).35 36 A previous Brazilian study investigated the frequency of p.Arg337His in 502 unselected patients with sarcomas (1–91 years old, mean age 40 years) and observed an 8% detection rate, with most carriers being affected by leiomyosarcomas (52.5%) and with ages of onset above 40 years old (82.5%).37 In our study, the detection rate of this variant was 3%, likely due to the earlier ages of onset in our cohort. Moreover, the phenotypes of the probands and their relatives are representative of the variable penetrance and tumour spectrum of this variant since none fulfilled the classical Li-Fraumeni criteria and three out of five fulfilled the 2015 Chompret Li-Fraumeni clinical criteria. Recently, a modifier variant, XAF1 p.Glu134Ter, occurring as a haplotype with the TP53 p.Arg337His was described to increase the risk for sarcoma and subsequent malignancies.21 In that study, the p.Glu134Ter variant was detected in 79% of 203 patients with cancer harbouring the TP53 p.Arg337His,21 and in our study, 80% (4/5) of the patients harboured this compound mutant haplotype.

We also identified that 1.7% of our cohort had two GPV in distinct genes. This phenomenon was recently described in the literature as multilocus inherited neoplasia alleles syndrome (MINAS).38 39 There are few studies describing MINAS in patients with sarcoma, as reviewed by McGuigan et al,39 including cases with GPV in two autosomal dominant and highly penetrant genes, such as TP53 with PTEN and NF1 with BRCA2.39 In our cohort, only one of the three individuals with MINAS had a GPV in a highly penetrant dominant gene associated with sarcoma (TP53), combined with a monoallelic GPV in a gene of unknown sarcoma risk and recessive inheritance (SLX4). The other two patients harboured GPV in genes not previously related to these tumours (PALB2 with MITF and MUTYH with MITF). Strikingly, two MINAS individuals had a GPV in MITF, a transcription factor related to cell cycle regulation that plays an important role in melanocyte homeostasis, and both patients had the same variant (p.Glu318Lys - MAF ABraOM=0.000427). This rare, moderate-penetrance variant is known to be associated with an increased risk of renal cell carcinoma and melanoma40 41 and was also recently associated with carcinosarcoma risk.42 Moreover, a study of a Brazilian cohort from a hereditary cancer registry described a prevalence of 0.9% for this variant (10/1056 patients), with breast cancer being the most common cancer in probands and their relatives and no description of sarcomas.43
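Because these proportions are based on very few carriers, their uncertainty is large. As an illustration only, the sketch below computes a Wilson score interval for the two proportions given in the text (3 of 177 patients with MINAS, and 10 of 1056 registry patients with MITF p.Glu318Lys). The choice of the Wilson method and the 95% level are assumptions of this sketch and are not part of the study's analysis.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("require total > 0 and 0 <= successes <= total")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Counts taken from the text; labels are for display only.
    for label, k, n in [("MINAS in this cohort", 3, 177),
                        ("MITF p.Glu318Lys in registry cohort", 10, 1056)]:
        low, high = wilson_interval(k, n)
        print(f"{label}: {k}/{n} = {k / n:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The width of the interval for 3 of 177 illustrates why such observations are best treated as hypothesis-generating.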

Recent studies have suggested that sarcoma susceptibility can result not only from monogenic high-risk variants but also from a combined effect of multiple less penetrant variants contributing to sarcoma risks through a polygenic risk model.5 Specifically, polygenic effects and monoallelic germline variants in DNA damage response (DDR) genes were suggested to predispose patients to young-onset translocation-associated sarcoma subtypes,5 as was later shown to be the case for FANCC variants in Ewing sarcomas.28 Our patients with multiple GPV and patients with monoallelic GPV in DDR genes usually related to recessive cancer predisposing syndromes (such as FANC and ERCC gene families) could represent examples of these polygenic effects. However, confirming these smaller effect associations will require large cohorts of patients and controls, and proposing specific clinical management for such cases would be even more challenging.

Several molecular strategies have been implemented to confirm the role of distinct GPV and genes in the carcinogenesis of a specific tumour. Somatic LOH analysis is based on the two-hit model that, in general, underlies TSG inactivation, in which the wild-type allele can be lost in tumour tissue due to large deletions. For sarcomas, LOH is expected to occur in TSGs whose causal association is well defined, such as NF1, RB1 and TP53, as we have seen in our study. Nevertheless, some genes can present more complex mechanisms of tumour initiation, as described for EXT1/2 genes in chondrosarcomas, where a mosaic loss of these proteins is necessary to drive tumourigenesis.44 45 Of our three EXT1/2 cases, LOH was evaluated in only one, which was positive for wild-type allele loss. Moreover, among genes with unknown association with sarcomas, we detected LOH of the MITF variant in a patient with liposarcoma carrying MITF and PALB2 GPV. This case illustrates that LOH data should be interpreted with caveats, especially in genomically unstable tumours such as some sarcomas, since MITF is considered to influence oncogenesis through a gain-of-function activity, and a second hit is not expected.
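The reasoning behind allele-fraction-based LOH inference can be sketched as follows. A heterozygous germline variant is expected at a variant allele fraction (VAF) of about 0.5; a marked increase in the tumour suggests relative loss of the wild-type allele, while a marked decrease suggests relative loss of the variant allele. This is a simplified illustration, not the method used in this study: the function name, the 0.5 germline default and the 0.2 shift threshold are hypothetical, and tumour purity is ignored.

```python
def classify_allelic_imbalance(tumour_vaf: float,
                               germline_vaf: float = 0.5,
                               min_shift: float = 0.2) -> str:
    """Classify a germline variant in tumour tissue by its change in VAF.

    min_shift is a hypothetical threshold chosen for illustration only.
    The result describes allelic imbalance; it cannot tell a deletion from
    copy-number neutral LOH or from a gain of the other allele.
    """
    for name, value in (("tumour_vaf", tumour_vaf), ("germline_vaf", germline_vaf)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    shift = tumour_vaf - germline_vaf
    if shift >= min_shift:
        return "variant allele enriched"   # consistent with wild-type allele loss
    if shift <= -min_shift:
        return "variant allele depleted"   # consistent with variant allele loss
    return "no imbalance detected"


if __name__ == "__main__":
    # Hypothetical VAF values, not data from the study.
    for vaf in (0.85, 0.48, 0.15):
        print(f"tumour VAF {vaf:.2f}: {classify_allelic_imbalance(vaf)}")
```

A result of "variant allele enriched" supports a second hit in a tumour suppressor gene, but for a gene acting through gain of function the same pattern may simply reflect copy-number change in an unstable genome.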

We acknowledge that our study has limitations. First, our cohort presents a selection bias since Biobank patients were selected to enrich for patients with bone sarcomas. Second, the methodology we used was panel sequencing, which did not assess variants in non-coding regions of CPGs or in other genes that could be involved in sarcoma tumourigenesis, and for the samples analysed using the pool strategy, CNVs were not evaluated. Third, our method to infer LOH does not allow us to differentiate copy-number neutral LOH from losses and gains leading to allelic imbalance, nor can it detect other events leading to gene inactivation (such as frameshift or stop-gain point mutations or promoter methylation). Nonetheless, in comparison to previous studies, we performed NGS using large multigene panels, included individuals across sarcoma subtypes and with an earlier onset (both paediatric and young adults) from a genetically admixed and under-represented population, and performed tumour LOH analysis for pathogenic variants and selected VUS, rendering important insights into sarcoma genetic predisposition.

In summary, our findings reveal that in our cohort one-fifth of patients with sarcomas under 40 years old had a GPV in a CPG, with one in seven having a GPV in genes from dominant cancer predisposition syndromes. Moreover, we showed that GPV rates vary widely across sarcoma subtypes. Our findings have implications for clinical practice and underscore the importance of genetic screening in patients with sarcoma, particularly in younger patients, even in the absence of family history or clinical signs associated with a syndrome. Additionally, our results draw attention to the inadequacy of current recommendations for genetic screening in patients with sarcoma, which would miss a substantial number of patients with GPV.46–48 Given the significant heterogeneity among sarcoma subtypes, it is likely that subtype-specific and even population-specific guidelines would be more appropriate. Nevertheless, until larger studies can identify optimal testing strategies, it would not be unreasonable, considering the latest findings, to recommend genetic consultation and testing for all patients with sarcoma.

Data availability statement

Data are available upon reasonable request.

Ethics statements

Patient consent for publication

Ethics approval

This study was reviewed and approved by A.C.Camargo Cancer Center Institutional Review Board (IRB) (approval number CAAE: 90036518.6.0000.5432). Participants gave informed consent to participate in the study before taking part.


We acknowledge the patients who participated in the study, and the A.C.Camargo professionals and Biobank, Sequencing and Bioinformatics facilities.



  • Twitter @nathangelis, @nirvanaformiga, @BRivera_Polo, @gitorrezan

  • Contributors Conceptualisation: DMC and GTT. Formal analysis: NdAdC, KMS, FDC and GTT. Methodology: NdAdC and KMS. Investigation: JMLM, MNF, DCdQS, DP, CMLdC, JCCdR and CALdM. Funding acquisition: DMC and GTT. Supervision: GTT. Resources: BR, DMC and GTT. Writing – original draft preparation: NdAdC and GTT. Writing – review and editing: KMS, FDC, JMLM, MNF, DCdQS, DP, CMLdC, JCCdR, CALdM, BR, DMC and GTT. Guarantor: GTT. All authors have read and agreed to the published version of the manuscript.

  • Funding Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP 2014/50943-1, 2018/06269-5 and 2018/17027-2); Conselho Nacional de Desenvolvimento Científico e Tecnológico (465682/2014-6 and 426835/2018-2); Coordination for the Improvement of Higher Education Personnel (CAPES - 88887.136405/2017-00, 88882.366031/2019-01 and 88887.509818/2020-00).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.