Original article
Genetic linkage analysis of a large family identifies FIGN as a candidate modulator of reduced penetrance in heritable pulmonary arterial hypertension
Pau Puigdevall (1), Lucilla Piccari (2), Isabel Blanco (2,3), Joan Albert Barberà (2,3), Dan Geiger (4), Celia Badenas (2,5), Montserrat Milà (2,5), Robert Castelo (1,6), Irene Madrigal (2,5)

  1. Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  2. Hospital Clínic de Barcelona, Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
  3. Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain
  4. Faculty of Computer Science, Technion Israel Institute of Technology, Haifa, Israel
  5. Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
  6. Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Barcelona, Spain

Correspondence to Dr Robert Castelo, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona 08002, Spain; robert.castelo{at}upf.edu

Abstract

Background Mapping the genetic component of molecular mechanisms responsible for the reduced penetrance (RP) of rare disorders constitutes one of the most challenging problems in human genetics. Heritable pulmonary arterial hypertension (PAH) is one such disorder characterised by rare mutations mostly occurring in the bone morphogenetic protein receptor type 2 (BMPR2) gene and a wide heterogeneity of penetrance modifier mechanisms. Here, we analyse 32 genotyped individuals from a large Iberian family of 65 members, including 22 carriers of the pathogenic BMPR2 mutation c.1472G>A (p.Arg491Gln), 8 of them diagnosed with PAH by right-heart catheterisation, corresponding to a penetrance of 36.4%.

Methods We performed a linkage analysis on the genotyping data to search for genetic modifiers of penetrance. Using functional genomics data, we characterised the candidate region identified by linkage analysis. We also predicted the haplotype segregation within the family.

Results We identified a candidate chromosome region in 2q24.3, 38 Mb upstream from BMPR2, with significant linkage (LOD=4.09) under a PAH susceptibility model. This region contains common variants associated with vascular aetiology and shows functional evidence that the putative genetic modifier is located in the upstream distal promoter of the fidgetin (FIGN) gene.

Conclusion Our results suggest that the genetic modifier acts through FIGN transcriptional regulation, whose expression variability would contribute to modulating heritable PAH. This finding may help to advance our understanding of RP in PAH across families sharing the p.Arg491Gln pathogenic mutation in BMPR2.

  • clinical genetics
  • reduced penetrance
  • heritable pulmonary arterial hypertension
  • linkage
  • genetic modifier

Introduction

Pulmonary arterial hypertension (PAH) is a rare disease characterised by an abnormal rise in mean pulmonary arterial pressure (≥25 mm Hg at rest), which leads to a progressive increase in pulmonary vascular resistance and ultimately to death, due to right ventricular failure.1 From 2% to 4% of PAH patients suffer from heritable pulmonary arterial hypertension (HPAH), with an overall HPAH prevalence below one case per million adults.2 3 PAH was initially described by Dresdale et al,4 and its hereditary subtype is defined by either the presence of a known genetic defect linked to the disease or by a positive family history. Despite substantial progress in diagnosing HPAH, long-term survival for most patients is extremely poor.

Mutations in the bone morphogenetic protein receptor type 2 gene (BMPR2),5 6 a transforming growth factor beta (TGF-β) superfamily member, have been detected in 75%–80% of PAH cases.7 Other genes have been implicated in the disease, mainly involving the TGF-β signalling pathway (ALK1,8 ENG 9 and SMAD8 10), as well as other genes (CAV1,11 KCNK3 12). The molecular mechanisms by which BMPR2 mutations cause the disease are still not fully understood. HPAH is inherited as an autosomal dominant disease. However, not all BMPR2 mutation carriers develop the disease, highlighting the presence of reduced penetrance (RP). Consequently, a primary BMPR2 mutation is necessary, but not sufficient by itself, to cause PAH. The penetrance of PAH among BMPR2 mutation carriers is estimated to be around 20%.13

There are several proposed mechanisms to explain RP14 in human disease, including genetic modifiers, the molecular context of mutations, patient characteristics such as age or sex, or environmental conditions. In the case of PAH, BMPR2 mutations alter multiple pathways supporting the hypothesis that additional genetic factors may confer susceptibility onto BMPR2 carriers.15 In recent years, many efforts have thus focused on finding those genetic modifiers.16–21 However, most of these studies identify different genetic mechanisms, supporting a wide heterogeneity among the acting modifier loci. This suggests that new modifiers are likely to be found as more affected families are investigated.

We studied a large Iberian family (n=65 subjects, five generations) affected by PAH and segregating with the BMPR2 missense mutation p.Arg491Gln (rs137852749, c.1472G>A).22 In this family, the penetrance of PAH among BMPR2 mutation carriers is 36.4%, higher than the general estimate of around 20%. Sequence-level characteristics may contribute to this increased penetrance because this particular variant occurs in a highly conserved kinase domain position of a superfamily type II TGF-β receptor.23 This mutation leads to a dominant-negative inhibition effect on the wild-type receptor function, altering the BMPR2 heteromerisation and impacting the selection of specific signalling pathways downstream.24

Methods

Subjects and phenotypic characterisation

The family studied comprises 65 Spanish individuals spanning five different generations.22 Eight of the individuals were diagnosed with PAH, confirmed by right-heart catheterisation, showing a mean pulmonary arterial pressure ≥25 mm Hg. DNA extractions and mutational studies of the BMPR2 gene from whole blood were carried out following standard procedures.25 The pathogenic BMPR2 mutation c.1472G>A (p.Arg491Gln)23 segregated with the disease in an autosomal dominant mode of inheritance (MOI). All 8 PAH subjects carried the BMPR2 mutation, while the other 14 carriers were healthy at the time of examination. The remaining individuals were reported as healthy non-carriers, except for eight individuals whose disease and/or genetic status were unknown, including four obligate carriers due to carrier offspring and suspicion of PAH, for which haemodynamic confirmation was not available (S01, S06, S08 and S10). Living subjects carrying the mutation within the family are regularly followed up by their specialists of reference.
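
The carrier counts above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original analysis; the function name is illustrative, and the four obligate carriers of unknown status are left out, as in the text.

```python
def penetrance(affected_carriers: int, total_carriers: int) -> float:
    """Proportion of mutation carriers who express the phenotype."""
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must lie between 0 and total_carriers")
    return affected_carriers / total_carriers


# Counts taken from the text: 8 affected carriers and 14 healthy carriers.
AFFECTED_CARRIERS = 8
HEALTHY_CARRIERS = 14

family_penetrance = penetrance(AFFECTED_CARRIERS, AFFECTED_CARRIERS + HEALTHY_CARRIERS)
print(f"Penetrance among carriers: {family_penetrance:.1%}")
```

With these counts the ratio is 8/22, which rounds to the 36.4% reported for this family.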

Genotyping and data preprocessing

We genotyped 32 individuals selected from the family being studied, with the Illumina microarray chip Infinium CoreExome-24 BeadChip v1.1 (n=551 839 SNPs). Genotype calling and sample clustering were performed with the Illumina BeadStudio software. Five individuals presented a genotyping call rate <0.99 (labelled in grey: T25, T26, T27, T28 and T29) and, along with four non-genotyped relatives (S12, S13, S30 and S33), were discarded from further analysis, leaving 56 out of the initial 65 members.
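
The sample-level filter described above can be illustrated with a short Python sketch. It assumes a simple in-memory representation (a list of genotype calls per sample, with None for a missing call); the sample names and toy data are hypothetical and the actual call rates were produced by the genotyping software, not by this code.

```python
CALL_RATE_THRESHOLD = 0.99  # threshold stated in the text


def call_rate(genotypes):
    """Share of markers with a non-missing call; None marks a missing call."""
    if not genotypes:
        raise ValueError("at least one marker is required")
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def low_call_rate_samples(calls_by_sample, threshold=CALL_RATE_THRESHOLD):
    """Return the sorted sample IDs whose call rate falls below the threshold."""
    return sorted(
        sample for sample, genotypes in calls_by_sample.items()
        if call_rate(genotypes) < threshold
    )


# Hypothetical toy data: 200 markers per sample.
toy_calls = {
    "sampleA": ["AG"] * 200,               # call rate 1.000
    "sampleB": ["AG"] * 199 + [None],      # call rate 0.995
    "sampleC": ["AG"] * 197 + [None] * 3,  # call rate 0.985, below threshold
}
flagged = low_call_rate_samples(toy_calls)

# Bookkeeping from the text: 65 members, minus 5 low call-rate samples,
# minus 4 non-genotyped relatives.
remaining_members = 65 - 5 - 4
```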

We filtered out autosomal and X-linked variants that were not biallelic SNPs, as well as biallelic SNPs with either missing genotypes in more than six samples or that could not be unambiguously annotated to dbSNP Build 144. We also removed variants that lacked the genetic and physical position provided by Illumina, as well as population allele frequency (AF) data from the ExAC26 release 0.3 or the 1000 Genomes Project,27 Phase 3. All reported genomic coordinates used the GRCh37/hg19 genome version as reference.

The PLINK28 Mendelian error tool was also applied to the dataset to remove segregation inconsistencies. For chromosome X, an additional filter discarded all variants containing heterozygous genotypes in men. For pseudoautosomal regions (XY), only variants falling within the PAR1 region were considered. Within that region, the Rutgers (v3) genetic sex-specific map29 was used instead of the Illumina coordinates. Following these preprocessing steps, 491 896 SNPs remained in the dataset (480 028 autosomal, 11 759 from chromosome X and 73 from XY-PAR1 region; see online figure S1).
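
Three of the variant-level filters described in this section (biallelic SNPs only, at most six missing genotypes, no heterozygous calls in men on chromosome X) can be sketched as follows. This is an illustration under assumed conventions, not the pipeline used in the study: the Variant class, its field names and the two-character genotype strings are hypothetical, and the dbSNP, map-position, allele-frequency and Mendelian-error filters are not covered.

```python
from dataclasses import dataclass

MAX_MISSING_SAMPLES = 6  # variants missing in more than six samples are dropped


@dataclass
class Variant:
    rsid: str
    chrom: str        # "1" to "22", or "X"
    alleles: tuple    # alleles observed at the site, e.g. ("A", "G")
    genotypes: dict   # sample ID -> two-character call such as "AG", or None


def is_biallelic_snp(variant):
    """True when exactly two single-nucleotide alleles are present."""
    return len(variant.alleles) == 2 and all(
        len(a) == 1 and a in "ACGT" for a in variant.alleles
    )


def has_too_many_missing(variant):
    missing = sum(g is None for g in variant.genotypes.values())
    return missing > MAX_MISSING_SAMPLES


def has_male_heterozygote_on_x(variant, males):
    """Heterozygous calls in men on chromosome X indicate a genotyping problem."""
    if variant.chrom != "X":
        return False
    return any(
        g is not None and g[0] != g[1]
        for sample, g in variant.genotypes.items()
        if sample in males
    )


def keep_variant(variant, males):
    return (
        is_biallelic_snp(variant)
        and not has_too_many_missing(variant)
        and not has_male_heterozygote_on_x(variant, males)
    )
```

Keeping each rule in its own predicate makes it easy to report how many variants each filter removes, which is the kind of summary a flow chart of the preprocessing would show.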

Linkage analysis

We performed genome-wide linkage analysis with the previously filtered SNPs. To assess the robustness of the observed results, linkage analysis was conducted using five independent software programmes based on two-point (Mendel,30 Pseudomarker31 and Superlink32) and multipoint (Merlin33 and Morgan34) linkage approaches (online figure S1).
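
For readers less familiar with linkage statistics, the LOD score reported throughout can be illustrated with the textbook two-point case. The Python sketch below assumes phase-known, fully informative meioses, which is a strong simplification; the programs listed above compute likelihoods on the full pedigree with penetrance models, so this is only a reminder of what the statistic measures, not a reimplementation of any of them.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """LOD = log10 of L(theta) / L(theta = 0.5) for phase-known meioses.

    recombinants: number of recombinant meioses observed
    meioses:      total number of informative meioses
    theta:        recombination fraction being tested, between 0 and 0.5
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    likelihood = theta ** recombinants * (1.0 - theta) ** non_recombinants
    if likelihood == 0.0:
        # A recombinant was observed but theta = 0 forbids recombination.
        return float("-inf")
    # Under free recombination every meiosis has probability 0.5.
    return math.log10(likelihood) - meioses * math.log10(0.5)
```

Under these assumptions, ten informative meioses without any recombinant give a LOD of 10 × log10(2), about 3.01 at theta = 0, which shows why large families are needed to reach the significance thresholds used in this study.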

The software Mendel30 used all the filtered variants as input, while Pseudomarker31 was unable to process variants in the PAR1 region. Mendel ran a two-point linkage under the Location_scores option using two flanking points. Pseudomarker was run using the runPM programme (see Web resources), which internally removes all the monomorphic variants present in the input. Superlink32 reduced the number of input variants based on various clustering, cleaning and filtering algorithms.

For multipoint linkage, we first ran parametric and non-parametric tests in Merlin.33 We were forced to prune the family pedigree (67 bits) due to the complexity upper bound of the Lander-Green algorithm implementation in Merlin (24 bits). Accordingly, we selected a subtree of the family pedigree that maximised the statistical power of linkage, as follows. We first prioritised the selection of genotyped affected carriers, and then included as many genotyped healthy carriers as possible. Finally, we included genotyped healthy non-carrier individuals (online figure S2). The input number of SNPs was drastically reduced by a linkage disequilibrium (LD) filtering performed with PLINK 1.07 (n=33 378 SNPs), using a window size of 10 SNPs, a step of 5 SNPs between windows and a variance inflation factor threshold of 2. We also incorporated the ExAC minor allele frequencies (MAF) into the model. Where these values were not available, we used MAF data from the 1000 Genomes Project. We also ran multipoint linkage with the whole pedigree using Superlink,32 restricted to SNPs within chr2:161–167 Mb. Given the size of the pedigree, we could only run non-overlapping windows of a maximum size of five SNPs. Because calculations are sensitive to the subset of SNPs within each window, we performed five different runs beginning at each of the five possible starting SNPs of the analysed region.
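
The five shifted runs of non-overlapping windows can be pictured with the Python sketch below. It is an illustration only: the function names are hypothetical, and it assumes that SNPs located before the starting SNP of a run are simply left out of that run, which the text does not specify.

```python
WINDOW_SIZE = 5  # maximum number of SNPs per window, as stated in the text


def non_overlapping_windows(snps, size=WINDOW_SIZE, offset=0):
    """Split snps[offset:] into consecutive windows of at most `size` SNPs."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [snps[i:i + size] for i in range(offset, len(snps), size)]


def shifted_runs(snps, size=WINDOW_SIZE):
    """One run per possible starting SNP: offsets 0, 1, ..., size - 1."""
    return {offset: non_overlapping_windows(snps, size, offset)
            for offset in range(size)}


# Hypothetical ordered marker names standing in for the SNPs of the region.
toy_snps = [f"snp{i}" for i in range(12)]
runs = shifted_runs(toy_snps)
```

Shifting the starting SNP changes which markers share a window, so comparing the five runs gives a rough idea of how much a LOD score depends on the window boundaries.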

The software Morgan34 was also used to estimate multipoint LOD scores. This programme is not restricted by pedigree complexity, as it uses the Markov Chain Monte Carlo algorithm, although the number of SNPs is computationally limiting. Consequently, we ran an analysis using a window of 300 SNPs on the regions of interest highlighted by Merlin.

We began the linkage analysis by verifying, as a quality control, the presence of significant linkage at the BMPR2 locus selecting BMPR2 mutation carriers as affected individuals. We then searched for a linkage signal from another locus with an independent contribution to the disease, whose genotype differentiates affected carriers from the rest of the individuals (online Supplemental methods). Finally, we considered a susceptibility model that could explain the low penetrance in HPAH (online figure S3). Under this model, all affected BMPR2 carriers possess a genetic modifier that increases the risk of the disease as part of a two-hit mechanism, while healthy non-carriers could have the modifier without altering their risk of PAH.

Under a susceptibility model, there are four different scenarios, which depend on the AF (common or rare in the population) and the MOI (dominant or recessive). We discarded two of these scenarios because they were extremely unlikely (rare/recessive, common/dominant). Another scenario (rare/dominant) was discarded because it assumed a two-hit occurrence of rare variants. Consequently, the parametric model for the susceptibility tests was built using a common AF (d=0.999) and a recessive MOI (see online Supplemental methods). We also permitted the existence of locus heterogeneity by defining a low phenocopy rate of 2%.
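
The parameters of this model can be made concrete with a small Python sketch. It rests on several assumptions that should be checked against the Supplemental methods: d is read as the population frequency of the susceptibility allele, genotype frequencies follow Hardy-Weinberg proportions, and the phenocopy rate is used as the penetrance of genotypes carrying fewer than two copies. The penetrance of homozygotes is not given in this section, so it is left as an argument.

```python
ALLELE_FREQUENCY = 0.999  # d in the text, read here as the susceptibility allele frequency
PHENOCOPY_RATE = 0.02     # phenocopy rate stated in the text


def hardy_weinberg_frequencies(allele_freq):
    """Genotype frequencies by number of copies of the allele (assumes HWE)."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie between 0 and 1")
    other = 1.0 - allele_freq
    return {
        "two_copies": allele_freq ** 2,
        "one_copy": 2.0 * allele_freq * other,
        "no_copy": other ** 2,
    }


def recessive_penetrances(homozygote_penetrance, phenocopy_rate=PHENOCOPY_RATE):
    """Recessive model: only two copies raise the risk above the phenocopy rate."""
    return {
        "two_copies": homozygote_penetrance,
        "one_copy": phenocopy_rate,
        "no_copy": phenocopy_rate,
    }


genotype_frequencies = hardy_weinberg_frequencies(ALLELE_FREQUENCY)
```

Under these assumptions nearly all individuals in the population (0.999 squared, about 99.8%) would carry two copies, which is what makes the modifier "common" in the sense used above.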

The age at disease onset is a particular confounder for the clinical condition as healthy carriers can develop the condition later in life and thus switch to affected carriers. The standard approach to control for this age-dependent penetrance effect in linkage analysis is through liability classes. However, the age-of-onset exhibits a very large variability, ranging from 5 to 75 years, and appears to be affected by genetic anticipation. This makes a general penetrance function more appropriate than liability classes.

The individuals most at risk of developing the condition are the youngest healthy carriers. Within our family, three healthy carriers are <10 years of age (T20, T21 and T24; online table S1). Given this uncertainty, we ran several linkage tests including all the family (or the fraction allowed by Merlin) and the seven pedigree combinations created by leaving aside the three youngest healthy carriers. This included the removal of only one individual at a time (three combinations), each pair of individuals (three combinations) and excluding all three at once (one combination). We chose this strategy to avoid direct speculation on individual phenotypes and because too few young individuals were available to build liability classes based on the age of onset.
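
The seven leave-out combinations can be enumerated mechanically, as in this minimal Python sketch (the function name is illustrative; only the three individual identifiers come from the text).

```python
from itertools import combinations

YOUNGEST_HEALTHY_CARRIERS = ("T20", "T21", "T24")


def leave_out_sets(individuals):
    """All non-empty subsets of individuals to trim from the pedigree."""
    return [
        frozenset(subset)
        for size in range(1, len(individuals) + 1)
        for subset in combinations(individuals, size)
    ]


trimmed_sets = leave_out_sets(YOUNGEST_HEALTHY_CARRIERS)
for excluded in trimmed_sets:
    print("exclude:", ", ".join(sorted(excluded)))
```

Three singletons, three pairs and one triple give the seven trimmed pedigrees, in addition to the untrimmed one.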

Results

Clinical evaluation of p.Arg491Gln mutation carrier individuals

We studied a large multiplex family composed of 65 members (figure 1), including 22 carriers of the pathogenic BMPR2 mutation c.1472G>A (p.Arg491Gln). At the time of the examination, eight of these individuals were affected. Table 1 shows the clinical and haemodynamic characteristics of healthy and affected carriers. Unfortunately, the available data do not allow us to compare the haemodynamic trajectories between healthy and affected carriers for several reasons. First, with the exception of two individuals, affected carriers have no pulmonary arterial pressure (systolic) (PAPs) echocardiogram values because they were diagnosed once PAH symptoms had occurred. Second, the currently healthy carriers who were screened by echocardiogram either did not show the tricuspid regurgitation needed to measure PAPs values or data were not available. While pulmonary arterial pressure (mean) (PAPm) values are available for affected carriers, only one healthy carrier, with suspected PAH, underwent right-heart catheterisation (T29). As expected, the haemodynamic values (PAPs, PAPm and pulmonary vascular resistance) of this healthy carrier were very different from the average values among affected carriers.

Figure 1

Family tree with HPAH disease cosegregation. Identifiers of genotyped individuals begin with the capital letter T, while non-genotyped begin with S. Solid black dots indicate affected mutation carriers while vertically striped dots indicate asymptomatic carriers and open dots correspond to healthy non-carriers. Grey dots indicate individuals of unknown phenotype and/or genotype while crossed dots correspond to deceased individuals. S01, S06, S08 and S10 individuals are obligate carriers due to carrier offspring. HPAH, heritable pulmonary arterial hypertension.

Table 1

Clinical and haemodynamic characteristics of carriers of the BMPR2 p.Arg491Gln mutation

Table 1 also shows that females are more affected than males, as previously reported.22 This difference is also observed when comparing the penetrance of female (45.5%) to male carriers (27.3%). The age of onset in this family is highly variable, spanning from 5 to 75 years (figure 2), decreasing with each new generation. More concretely, in generation II, PAH was diagnosed in one patient when he was 75.4 years old, in generation III at 47.1±5.2 years (n=3) and in generation IV at 18.16±12.58 years (n=4); see figure 2. This pattern is consistent with genetic anticipation, although clinical improvements in the diagnosis and earlier tracking of younger carriers may confound this observation.

Figure 2

Years free of PAH among healthy and affected carriers. Cumulative penetrance on the y-axis as a function of the years free of PAH on the x-axis. The age of individuals marked with a double asterisk (**) was inferred from the age of their oldest child plus 18 years. A smoothed local regression line is shown in blue. HPAH, heritable pulmonary arterial hypertension; PAH, pulmonary arterial hypertension.

Another mechanism that could explain the RP is genomic imprinting (online Figure S4; see Supplemental Methods). Despite the differences observed in the penetrance of carrier individuals with paternal transmission (f; 50%) and maternal transmission (m; 25%), the parental origin of the mutation does not entirely explain the phenotype of the progeny.

PAH susceptibility among BMPR2 p.Arg491Gln mutation carriers is significantly linked to region 2q24.2-q24.3

Parametric linkage validated the BMPR2 carrier status as expected, and did not provide any evidence for an independent genetic contribution to PAH disease within the family (online Supplemental Methods; figures S5–S11). Under the susceptibility model applied to the whole pedigree (n=56), or to the subpedigree in Merlin (n=30), there was no genomic region of significant linkage (two-point linkage in online figures S12A and S13A; multipoint linkage in online figures S14A and S15).

We then considered the age-dependent penetrance effect on the youngest healthy carriers (T20, T21 and T24) by trimming one, two or three of those subjects at a time from the pedigree. The results for these additional tests revealed a variant at 2q24.2 (rs17716942) with significant two-point linkage for the same two combinations of individuals using both Mendel (online figure S12C and G) and Pseudomarker (online figure S13C and G). Those two combinations correspond to the trimming of T21 (max LOD=4.14), and of the pair T21–T24 (max LOD=3.86). Additionally, Pseudomarker reported other variants with weaker (but still significant) linkage, although Mendel did not reproduce them. This is also the case for variant rs6436140 detected at 2q35 (online figure S13A, C, D and G).

To test the robustness of the previous results, we conducted a multipoint linkage analysis with Merlin. After pruning the pedigree (see the Methods section), using the same parametric recessive model, we observed a 4 Mb region of significant linkage in chromosome 2 (161–165 Mb, 2q24.2-q24.3, online figure S14). This candidate region is located 38 Mb upstream from BMPR2, comprises 27 SNPs (LOD >3.3) and, as with two-point linkage, was again found when using the two pedigrees that discard T21 (max LOD=4.09) and T21–T24 (max LOD=3.79). No other signal was reported by the remaining combinations. As for the Morgan multipoint linkage Markov chain Monte Carlo approach, this partially reproduced the results observed in Merlin, but showed suggestive rather than significant linkage at 2q24.2 and within 2q24.3-q31.1 (LOD=2.94) (online figure S15). Parametric multipoint linkage with Superlink32 using the whole pedigree without T21 (see Methods) also found significant linkage (max LOD=3.80) within this candidate region (online supplemental figure S16). The non-parametric analysis tests run with Merlin (NPL statistics coupled with the Kong and Cox exponential method) did not support the candidate region. However, the regions detected using the parametric method with two-point and multipoint linkage clearly overlap at 2q24.2 (figure 3A).
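
Delimiting a candidate region from a LOD score profile amounts to finding runs of consecutive markers above the significance threshold. The Python sketch below shows one way to do this; the function names and the toy profile are hypothetical (positions are in arbitrary units and the LOD values are invented), and only the threshold of 3.3 is taken from the text.

```python
LOD_THRESHOLD = 3.3  # significance threshold used in the text


def summarise_run(run):
    """Collapse a run of (position, lod) pairs into one region record."""
    return {
        "start": run[0][0],
        "end": run[-1][0],
        "n_markers": len(run),
        "max_lod": max(lod for _, lod in run),
    }


def significant_regions(profile, threshold=LOD_THRESHOLD):
    """profile: (position, lod) pairs sorted by position along one chromosome."""
    regions = []
    run = []  # consecutive markers currently above the threshold
    for position, lod in profile:
        if lod > threshold:
            run.append((position, lod))
        elif run:
            regions.append(summarise_run(run))
            run = []
    if run:
        regions.append(summarise_run(run))
    return regions


# Invented toy profile, for illustration only.
toy_profile = [(1, 1.2), (2, 3.5), (3, 4.0), (4, 3.4), (5, 2.0), (6, 0.5)]
toy_regions = significant_regions(toy_profile)
```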

Figure 3

Linkage analysis results. (A) LOD score profiles from different linkage analysis software in the genomic region between the highest LOD and the BMPR2 locus. The profile Merlin-subpedALL was calculated from the subpedigree shown in online figure S1. Merlin-npl-exp is the non-parametric linkage with the same individuals as in Merlin-subpedALL. Profiles marked with ‘no-Tx’ were calculated excluding the ‘Tx’ individuals. In the case of Mendel and Pseudomarker, only SNPs with significant LOD score are shown. Except for Morgan, all programs highlight a region of significant linkage with the combinations that remove the healthy carriers T21 and T24. (B) LOD score profile from Merlin-noT21 in the 7 Mb region of significant LOD. The SNPs track shows the SNPs used by Merlin in the parametric multipoint linkage analysis. The genome-wide association study (GWAS) track shows SNPs from the GWAS Catalog that are associated with blood pressure (rs13002573, rs1446468, rs16849225 and rs11891401). The shaded region highlights SNPs with the strongest linkage (LOD=4.09), which includes the FIGN gene.

These results suggest that the region 2q24.2-q24.3 is a strong candidate to harbour a genetic modifier that modulates HPAH susceptibility. We identified 12 annotated protein-coding genes within this region (figure 3B). Among them, FIGN was found to be one of the best candidates, as the local maximum score of the candidate region overlaps a variant (rs12692701) from the intron 2 of the annotated isoform NM_018086.3 of this gene. Interestingly, in the region of maximum linkage (LOD=4.09), several SNPs are associated with blood pressure according to the traits reported in the genome-wide association study (GWAS) Catalog.

The candidate susceptibility region (2q24.2-q24.3) is enriched by SNPs from the GWAS Catalog associated with blood pressure

We analysed the functional impact of the 529 SNPs located within the candidate region of LOD >3.3 and in LD with annotated SNPs from the GWAS Catalog.35 More concretely, we performed an enrichment analysis of these SNPs with respect to all the SNPs in the human genome for which we could map experimental factor ontology (EFO) annotations from the GWAS Catalog; see online Supplemental Methods. We observed 24 EFO terms that were significantly enriched by SNPs in the candidate region (Holm-adjusted p<0.001, Size >5, Count >5, OR >2, one-tailed Fisher’s exact test) when compared with the entire GWAS Catalog (online table S2). Interestingly, EFO terms associated with cardiorespiratory phenotypes were significantly enriched (EFO:0004325: blood pressure, EFO:0006335: systolic blood pressure, EFO:0005763: pulse pressure measurement, EFO:0006995: response to diisocyanate and EFO:0000270: asthma), reinforcing the hypothesis of a potential contribution of that region to blood pressure.
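
The statistical machinery named above (a one-tailed Fisher's exact test per EFO term, followed by Holm's correction) can be sketched in plain Python as follows. This is a generic illustration, not the code used in the study: it assumes the region SNPs are a subset of the annotated background, and the exact construction of the contingency tables is described in the Supplemental Methods rather than here.

```python
from math import comb


def fisher_one_tailed(region_with_term, region_total, background_with_term, background_total):
    """Upper-tail hypergeometric probability P(X >= region_with_term).

    X counts SNPs annotated with the term among region_total SNPs drawn from a
    background of background_total SNPs, background_with_term of which carry it.
    """
    if not 0 <= background_with_term <= background_total:
        raise ValueError("background_with_term out of range")
    if not 0 <= region_total <= background_total:
        raise ValueError("region_total out of range")
    upper = min(region_total, background_with_term)
    numerator = sum(
        comb(background_with_term, i)
        * comb(background_total - background_with_term, region_total - i)
        for i in range(region_with_term, upper + 1)
    )
    return numerator / comb(background_total, region_total)


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[index])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[index] = running_max
    return adjusted
```

Using exact integer binomial coefficients avoids floating-point underflow for the very small tail probabilities that arise when the background contains many thousands of SNPs.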

When conducting the same analysis on the subregion with strongest linkage (LOD=4.09; figure 3B), 15 EFO terms showed as significant (table 2). Among them, the aforementioned cardiorespiratory phenotypes remained on the list. Seven GWAS SNPs, located upstream from the FIGN gene, were responsible for this enrichment: rs16849225,36 37 rs11891401 38 and rs1446468 39 for systolic blood pressure; rs13002573 40 and rs16849211 39 for pulse pressure measurement; rs76833157 and rs7576072 for both asthma and response to diisocyanate.41 Furthermore, one variant from a dbGaP study was associated with tunica media wall thickness of carotid arteries (rs1370493, pha003034-phs000221).42 These enriched EFO terms refer to systemic cardiopulmonary vasculature, and not to the specific pulmonary vasculature, which would be the most relevant for PAH physiology but which is unfortunately absent from the GWAS Catalog and EFO term annotations.

Table 2

Functional enrichment analysis on the region of maximum linkage. EFO terms enriched by 137 SNPs from the region with LOD=4.09 (chr2:163738883–165107298). EFO terms associated with cardiorespiratory traits are highlighted in bold

Conversely, the enrichment of cardiorespiratory phenotypes was not observed when selecting SNPs within the boundaries of the FIGN gene (online table S3). Upstream of FIGN, partially within the region of strongest linkage, a long intergenic non-coding RNA (lincRNA), ENSG00000237844, was exclusively enriched by cardiorespiratory phenotypes (online table S4). However, this lincRNA shows no expression in almost all tissues from the GTEx project,43 excluding a mechanism mediated by this RNA molecule.

Common variation in the region of maximum linkage (2q24.3) impacts regulatory elements that potentially act as a distal promoter of the FIGN gene

We examined whether the seven GWAS SNPs associated with cardiorespiratory traits in the region of maximum linkage overlap with regulatory elements upstream from the FIGN gene. We refer to this set of SNPs as candidate regulatory SNPs (CRS).

Hi-C data44 from the IMR-90 cell line derived from human fetal lung tissue show that both these CRSs and the FIGN gene are located within the same topologically associated domain (figure 4A). In GTEx data, three of the CRSs (rs13002573, rs16849225 and rs16849211) are cis-eQTLs of the FIGN gene in aortal artery, oesophagus mucosa, skeletal muscle and pancreas tissues (figure 4B). Although none of these GTEx tissues correspond to pulmonary microvascular endothelial cells (PMVCs)—the tissue most relevant to PAH—these cis-eQTLs suggest the presence of molecular regulatory mechanisms that may be shared across tissues. The three CRSs are also located in the same LD block according to the web tool HaploReg,45 which shows that some of the linked SNPs alter regulatory motifs and overlap enhancer histone marks (online figure S17). Chromatin state segmentation from the Roadmap Epigenomics Mapping Consortium (REMC) in the IMR-90 cell line and lung tissue shows changes between quiescent, heterochromatic, zinc finger and flanking promoter states across the LD block (online figure S18).

Figure 4

Common variation upstream from FIGN contributes to its regulation and its role as HPAH penetrance modulator. (A) Hi-C data from the human IMR-90 cell line derived from fetal lung tissue. Vertical yellow stripes highlight SNPs from the GWAS Catalog associated with cardiorespiratory traits, which fall within the same topologically associated domain of the FIGN gene. (B) GTEx FIGN cis-eQTLs located at the region of maximum linkage (LOD=4.09). The first four tracks show raw p values, in logarithmic scale, from eQTLs overlapping SNPs associated with cardiorespiratory traits in the GWAS Catalog. The shaded region highlights an enrichment of eQTLs within a linkage disequilibrium (LD) block that includes three candidate regulatory SNPs (CRS: rs16849225, rs16849211, rs13002573) associated with systolic blood pressure and pulse pressure measurement. (C) Genetic effect on gene expression of FIGN for the cis-eQTL in the rs4667728 SNP. (D) Functional genomics evidence within the LD block of FIGN cis-eQTLs. The vertical yellow stripe highlights SNP rs4667728 predicted by CENTIPEDE annotations to alter protein binding by integrating REMC DNaseI data with protein DNA-binding motifs. This variant also overlaps a REMC methylation signal from the same IMR-90 cell line as the DNaseI data. HPAH, heritable pulmonary arterial hypertension.

One of the linked SNPs in the LD block (rs4667728) is a cis-eQTL of FIGN (figure 4C). According to CENTIPEDE annotations46 (online figure S18), this common variant affects protein binding in an open chromatin region with evidence of REMC data from DNaseI of a fetal lung donor. Moreover, this same variant overlaps with a DNA methylation signal from REMC bisulfite sequencing data of the IMR-90 cell line (figure 4D). The altered motif, predicted by CENTIPEDE, derives from the DREB1B factor, a transcriptional activator in Arabidopsis thaliana whose DNA-binding domain superfamily (IPR016177) is, according to the InterPro database, homologous to the methyl-CpG DNA binding (MBD) domain (IPR001739). MBD-containing proteins, such as MeCP2, MBD1 and MBD2 preferentially bind to methylated CpG, such as that overlapped by the rs4667728 SNP, whose reference allele is the corresponding cytosine.

To investigate whether the transcriptional regulation of FIGN is associated with PAH expressivity, we have analysed publicly available gene expression data from a study47 that collected samples derived from the lung tissue of PAH patients and from the normal tissue of cancer patients undergoing surgical lobectomy. In a differential expression analysis between these two groups of samples, we found that FIGN is significantly overexpressed (1.3-fold) in PAH patients, after correcting for multiple testing at 1% FDR (online figure S19). We also found that FIGN is inversely and significantly co-expressed with BMPR2 (p=0.02), using a linear model that adjusts for the PAH effect in FIGN expression (online figure S19); see online Supplemental Methods.

All linked SNPs in the LD block, including rs4667728, have a common MAF of around 0.22 in the European population of the 1000 Genomes Project (online figure S17). The linkage signal and the described functional genomics evidence thus both support the hypothesis that common variation in the candidate susceptibility region (2q24.3) may impact regulatory elements acting as a distal promoter (250–500 kb) of the FIGN gene, whose expression would contribute to modulating HPAH penetrance.

Haplotype prediction

Predicted haplotypes in chromosome 2 were consistent with the BMPR2 carrier status (online figure S20; see online Supplemental Methods). We observed that no healthy carriers were predicted to have a recombination between the FIGN (first candidate region) and BMPR2 loci, except for T21. However, half of the affected carriers in the subpedigree (T14 and T11) do present such a recombination. Consistent with these observations, the variant with highest LOD score in two-point linkage (rs17716942) was heterozygous among all healthy carriers, except for T21, and homozygous in half of the affected carriers in the sub-pedigree (T14 and T11).

Discussion

The region defined by the SNPs with LOD >3.3 in multipoint linkage is annotated with 12 protein-coding genes; among these, the FIGN gene was the only gene overlapping the markers with the highest LOD score. Moreover, one intronic variant (rs12692701) supports a direct linkage to the gene. However, while this region is functionally enriched with SNPs associated with cardiorespiratory phenotypes, these are not located within the gene boundaries of FIGN, but in its promoter region. Concretely, multiple sources of functional genomics data—GTEx cis-eQTLs, Hi-C interactions and REMC DNaseI and methylation data—support the hypothesis that some of these SNPs may contribute to the regulation of FIGN expression in relevant tissues for vascular aetiology.

Mukherjee et al 48 characterised FIGN as a microtubule severing enzyme as well as a depolymerase, predominantly involved in mitosis but also contributing to cell migration and neuronal development. FIGN controls the conserved localisation of microtubules to centrosomes, with a crucial role in avoiding the formation of aneuploid or polyploid cells. The same study observed that the depletion of FIGN resulted in alterations in the structure of mitotic centrosomes. In addition, Wang et al 49 have recently identified a FIGN intronic variant (rs2119289) that has a protective role against congenital heart disease. This finding is compatible with a role of FIGN in modulating HPAH penetrance.

The mere existence of phenocopies in the susceptibility model implicitly assumes that HPAH may also be caused by genetic modifiers other than that which we found in the candidate region. Such locus heterogeneity in HPAH has already been reported by many different studies. These include the alteration of BMPR2 15 16 and CBLN2 17 gene expression, second-hit mechanisms of BMPR2 with its promoter19 or with EIF2AK4 mutations,18 as well as somatic chromosome abnormalities in the affected lungs.20 These multiple findings support an oligogenic architecture of HPAH.

In summary, we found the FIGN gene to be a candidate modulator of RP in HPAH within the largest family reported to date segregating the pathogenic BMPR2 mutation c.1472G>A (p.Arg491Gln). Multiple sources of functional genomics data strongly suggest that the most likely underlying genetic mechanism lies in a regulatory region upstream of the FIGN gene, forming part of its distal promoter and modulating FIGN expression through common variation. A similar mechanism has been observed for familial neuroblastoma, in which genetic modifiers of penetrance for a pathogenic mutation in the ALK gene have been identified within the same chromosome arm, 2q.50

Our study has several limitations and future research will be required to replicate and confirm the findings reported. First, we were unable to conduct a genome-wide analysis of the entire pedigree at once with currently available multipoint linkage analysis tools due to the algorithmic constraints on large families. Second, microarray genotyping data are sparse and whole-genome sequencing (WGS) data will be required to identify the specific variant, or variants, that act as genetic modifiers of HPAH within the candidate region. WGS data will also be useful to better resolve the individual haplotypes from chromosome 2 in mutation carrier individuals, where long-read sequencing technology would be the most reliable option. Third, we could not obtain PMVCs, the PAH target tissue, from carrier individuals in the family. This tissue would allow us to profile the most relevant RNA and protein expression, which could help to characterise the molecular mechanism triggered by the proposed genetic modifier. Finally, further functional in vitro and in vivo assays will be required to validate this molecular mechanism.

Acknowledgments

This study was supported by the Spanish Instituto de Salud Carlos III (ISCIII) [PI15/00483], co-financed by Fondo Europeo de Desarrollo Regional (FEDER) ‘una manera de hacer Europa’, Catalan AGAUR [SGR17-1134; SGR17-1020; FI-DGR 2015], Spanish MINECO/FEDER (TIN2015-71079-P) and the ‘Maria de Maeztu Unit of Excellence’ (MDM-2014-0370). The ‘CIBER de Enfermedades Raras’ is an initiative of the ISCIII. We want to thank the ‘CERCA Programme’ from the Autonomous Catalan Government. We also thank Manuel López-Meseguer and Luis Borderías for providing clinical data, and Roger Piqué-Regi for assistance in using CENTIPEDE. We acknowledge the NCBI database of Genotypes and Phenotypes (dbGaP, http://www.ncbi.nlm.nih.gov/gap). Controlled-access genotype and phenotype data were obtained following authorisation by the appropriate Data Access Committee to GTEx data (dbGaP Study Accession number phs000424.v7.p2).

Footnotes

  • Contributors IM, MM, CB and RC designed the project. IM, MM, CB, LP, IB and JAB collected the data. PP, DG and RC analysed the data. PP, CB, IM, LP and RC drafted the manuscript. All authors revised and approved the manuscript.

  • Funding This work was supported by the Spanish Instituto de Salud Carlos III (ISCIII) [PI15/00483], co-financed by Fondo Europeo de Desarrollo Regional (FEDER) ‘una manera de hacer Europa’, Catalan AGAUR [SGR17-1134; SGR17-1020; FI-DGR 2015], Spanish MINECO/FEDER (TIN2015-71079-P) and the ‘Maria de Maeztu Unit of Excellence’ (MDM-2014-0370). The ‘CIBER de Enfermedades Raras’ is an initiative of the ISCIII. We want to thank the ‘CERCA Programme’ from the Autonomous Catalan Government.

  • Competing interests None declared.

  • Ethics approval The study was approved by the ethical committee of the Hospital Clínic de Barcelona. All participants provided written, informed consent.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Genotype and phenotype data reported in this paper are available through the European Genome-phenome Archive (EGA) under accession number EGAS00001003123.

  • Correction notice This article has been corrected since it was published online first. A technical error in the PDF (figure 3 was not displaying) has been corrected.

  • Patient consent for publication Not required.