Introduction

Autosomal dominant hypercholesterolemia (ADH, OMIM no. 143890) is one of the most frequent inherited disorders in humans, with a frequency of 1/500 for heterozygotes in Western populations. It is characterized by a selective increase in low-density lipoprotein (LDL) particles in the plasma leading to tendon and skin xanthomas, arcus senilis corneae, and premature morbidity and mortality from cardiovascular complications. This disease has proven to be genetically heterogeneous and was first associated with defects in two different genes: LDLR (LDL receptor),1, 2 and APOB (apolipoprotein B-100).3 Our team pioneered the hypothesis that the disease was genetically more heterogeneous. We mapped a third major locus at 1p34.1–p32 (ref. 4) and went on to show that it encodes PCSK9 (proprotein convertase subtilisin/kexin type 9).5 Altogether, we identified seven mutations in eight families (5, 6, JPR personal data) and showed that mutations in the PCSK9 gene account for only 1.8% of gene defects in a sample of 392 ADH families studied in our laboratory. Subsequently, Cohen et al.7 showed that mutations within the gene are also involved in a dominant form of hypocholesterolemia. The targets of the PCSK9 convertase are still unknown and its biological function is still unclear, although it is now known to enhance the degradation of the LDL receptor protein.8

While investigating ADH families, we identified a group of families neither harboring mutations within the three major genes nor linked to any of these genes. We now report one of these families, a large French pedigree, in which we postulated the involvement of a fourth gene, which we named HCHOLA4. The objective of this study was to localize and possibly identify ADH disease loci additional to the three major loci known.

Materials and methods

Family recruitment and disease ascertainment

Hypercholesterolemic families were recruited by the French National Research Network on Hypercholesterolemia. For probands, the following selection criteria were used: total cholesterol and apoB values above the 95th percentile when compared with a sex- and age-matched French population (personal data from Institut Régional pour la Santé, Tours-La Riche, France), triglycerides below 1.5 mmol/l, personal and/or documented familial xanthomas, xanthelasmas, and/or arcus senilis corneae, and early coronary artery disease. Lipid measurements were repeated to ascertain the existence of primary type IIa hyperlipoproteinemia. Extended families were investigated to confirm the presence of ADH. Blood samples were obtained from the index patient and his/her relatives for isolation of DNA. The study was conducted in accordance with French bioethics regulations, and written informed consent was given by all subjects and, when appropriate, by parents or guardians.

Genome-wide scan and linkage analysis

A genome-wide scan was performed using 232 polymorphic microsatellite markers spaced at 20-cM intervals on autosomes.9 All marker-typing data were collected blindly and independently by four of us. A second genome-wide scan was performed using a GeneChip 10K Array (Affymetrix chips, GeneChip Human Mapping 10K Array Xba 142, Affymetrix, Santa Clara, CA, USA) in collaboration with the ‘Centre National de Génotypage’ (Evry, France). We used parametric linkage analysis in two situations: the first, an ‘affected-only’ analysis in which all related family members were assigned either an affected or an unknown disease status; the second, a ‘whole-family’ analysis in which all related family members were assigned an affected, unaffected, or unknown status as provided by referring clinicians and in agreement with clinical and biological data. Parametric linkage analyses were performed with the following parameters: (1) autosomal dominant transmission of the disease trait, (2) reduced penetrance of 0.9 for heterozygotes (or complete penetrance for the affected-only analyses), and (3) frequency of the disease allele of 0.004% (close to the frequency of PCSK9 mutants). The power of the families for linkage was evaluated using the FastSlink v2.51 software.10 We used Pedcheck11 to detect Mendelian inheritance errors. SuperLink v1.5 and SimWalk v2.91 were used to compute two-point and multipoint LOD scores.12, 13 All these programs were run using the easyLinkage Plus v5.00 package.14 Additional model-free analyses were performed with the GENEHUNTER software, which performs a complete multipoint analysis to infer the degree of identity-by-descent sharing among all affected family members at each map point.15 We used GENEHUNTER-PLUS, a modified version of this software that has been shown to be less conservative, particularly when data are less than perfectly informative.16 This program calculates a semi-parametric LOD score (ghp-LS) by using a single parameter that is a measure of the inheritance vector in pedigree and allele sharing.
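As a rough illustration of what a two-point LOD score measures, the following sketch handles only the textbook case of fully informative, phase-known meioses. It is not the pedigree likelihood computed by SuperLink or SimWalk, which account for reduced penetrance, allele frequencies, and unknown phase; the function name and its arguments are hypothetical.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """LOD score at recombination fraction theta for phase-known meioses.

    Simplifying assumption: every meiosis is fully informative, so the
    likelihood is binomial in the number of recombinants.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    n_nonrecombinants = n_meioses - n_recombinants
    if theta == 0.0:
        # A single observed recombinant is incompatible with theta = 0.
        if n_recombinants > 0:
            return float("-inf")
        log_l_theta = 0.0
    else:
        log_l_theta = (n_recombinants * math.log10(theta)
                       + n_nonrecombinants * math.log10(1.0 - theta))
    # Likelihood under free recombination (theta = 0.5).
    log_l_null = n_meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Idealized upper bound: 23 informative meioses, no recombinant, theta = 0.
print(round(two_point_lod(23, 0, 0.0), 2))
```

Under these idealized assumptions, n meioses without recombination give at most n × log10(2); real pedigree scores are lower because not every meiosis is informative and penetrance is incomplete.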

Candidate genes investigation

DNA was extracted from venous blood using a technique described by Collod et al.17 A selection of 57 regional candidate genes was sequenced (coding regions and close flanking intronic parts) as reported previously.5 Primer sequences are available on request.

LCAT activity measurement

Lecithin cholesterol acyltransferase (LCAT) activity in the plasma was measured by a nonradioactive endogenous cholesterol esterification method.18 Briefly, it consists of measuring the decrease in the free cholesterol content of plasma after incubation at 37°C.

Results

Identification of a non-LDLR/non-APOB/non-PCSK9 ADH family

Among 69 consecutive non-LDLR/non-APOB/non-PCSK9 families identified in our laboratory, one family was large enough to enable genome-wide mapping. The recruitment of the HC6 family was initiated through the female proband II-4, who presented with a total cholesterol value of 3.64 g/l (9.39 mmol/l) at 54 years of age without medication. We enlarged the family to 30 individuals over four generations, thus exploring 23 meioses (Table 1 and Figure 1). Several cardiovascular accidents were reported in the previous generations. Clinical examination of affected members revealed no xanthoma, xanthelasma, or arcus senilis corneae. Carotid intimal–medial thickness was elevated for patient II-1, whereas it was normal for his sister II-4 (data not shown). In all, 10 family members presented elevated blood cholesterol with isolated hyper-LDLemia; 13 family members and 5 spouses showed no clinical or biological anomaly and were considered nonaffected. Finally, a father (II-9) and his son (III-14) were scored as ‘unknown’ for linkage analysis because of high triglyceride levels (2.53 mmol/l) in the father. Distributions of total and LDL cholesterol values in all tested family members were bimodal, and thus compatible with an autosomal dominant transmission of the disease (data not shown).
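The two units quoted for the proband's total cholesterol can be related with a short sketch. The conversion factor of 2.58 mmol/l per g/l is an assumption (a commonly used rounded value for cholesterol), and the names below are hypothetical; the status counts are those given in the paragraph above.

```python
# Assumed conversion factor for cholesterol (rounded): 1 g/l ~ 2.58 mmol/l.
CHOLESTEROL_MMOL_PER_G = 2.58


def cholesterol_g_to_mmol(value_g_per_l: float) -> float:
    """Convert a cholesterol concentration from g/l to mmol/l."""
    return value_g_per_l * CHOLESTEROL_MMOL_PER_G


# Phenotype classification of the HC6 family as reported in the text.
hc6_status_counts = {
    "affected": 10,
    "nonaffected_relatives": 13,
    "nonaffected_spouses": 5,
    "unknown": 2,
}

print(f"{cholesterol_g_to_mmol(3.64):.2f} mmol/l")
print(sum(hc6_status_counts.values()), "individuals")
```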

Table 1 Lipid values for subjects of the HC6 family
Figure 1 Regional haplotype of the HCHOLA4 locus in the HC6 family. Blackened symbols represent affected subjects, white symbols represent unaffected subjects, whereas individuals with undetermined status are in gray.

The candidate gene approach excluded genetic linkage between the ADH phenotype and any of the three ADH-causing genes identified so far. Indeed, regional haplotypes constructed in the HC6 family for polymorphic markers covering LDLR (19p13.2), APOB (2p24.1), and PCSK9 (1p32.3) genes showed no cosegregation of a particular allelic association with the ADH phenotype. Furthermore, with a ‘whole-family’ linkage analysis, exclusion LOD score values of −4.95, −3.47, and −2.49 were obtained for LDLR, APOB, and PCSK9 genes, respectively. Finally, no deleterious mutation was observed after sequencing LDLR, APOB, and PCSK9 genes, and no major rearrangement was found in the LDLR gene (data not shown).

Linkage analysis

A total of 100 000 simulations were performed to evaluate the power and relevance of a genome-wide linkage scan in the HC6 family. In the ‘affected-only’ analysis, the average expected LOD score was 1.60 and the maximum expected LOD score was 2.41. In the ‘whole-family’ analysis, the average expected LOD score was 3.18 and the maximum expected LOD score was 4.81, indicating that the significance threshold of 3 could be reached with this single family: 51.3% of maximum LOD scores were >3.3 (empirical P=0.010), and the empirical P was 0.001 for thresholds of 3.7 or 3.9. Among the 232 microsatellite markers initially tested in the ‘whole-family’ analysis, seven had LOD scores above 1.2: markers D8S263 (LOD=1.41, θ=0), D10S186 (LOD=1.29, θ=0), D12S310 (LOD=1.43, θ=0.15), D12S87 (LOD=1.32, θ=0.2), D16S3107 (LOD=3.77, θ=0), D16S515 (LOD=1.36, θ=0), and D20S200 (LOD=1.38, θ=0.15).
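The idea of an empirical P-value obtained by simulation can be sketched as follows. This is a deliberately simplified Monte Carlo experiment under the null hypothesis of no linkage, assuming phase-known, fully informative meioses; it is not the procedure implemented in FastSlink, which simulates marker genotypes on the actual pedigree under the specified disease model. All function names are hypothetical.

```python
import math
import random


def max_lod(n_meioses: int, n_recombinants: int) -> float:
    """Maximum two-point LOD over theta in [0, 0.5] for phase-known meioses."""
    theta_hat = min(n_recombinants / n_meioses, 0.5)
    n_nonrecombinants = n_meioses - n_recombinants
    log_l = 0.0
    if n_recombinants > 0:
        log_l += n_recombinants * math.log10(theta_hat)
    if n_nonrecombinants > 0:
        log_l += n_nonrecombinants * math.log10(1.0 - theta_hat)
    return log_l - n_meioses * math.log10(0.5)


def empirical_p(n_meioses: int, threshold: float,
                n_replicates: int, seed: int = 0) -> float:
    """Fraction of null replicates whose maximum LOD reaches the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_replicates):
        # Without linkage, each meiosis is recombinant with probability 0.5.
        k = sum(rng.random() < 0.5 for _ in range(n_meioses))
        if max_lod(n_meioses, k) >= threshold:
            hits += 1
    return hits / n_replicates


# Example: how often would 23 unlinked meioses reach a LOD of 3 by chance?
print(empirical_p(n_meioses=23, threshold=3.0, n_replicates=10_000))
```

The replicate count is kept small here for speed; the estimate becomes more stable as the number of replicates grows.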

To confirm and further investigate possible linkage in these regions, 500 SNPs and 12 additional microsatellite markers were used. Convincing exclusion data were obtained for chromosomes 8, 10, 12, and 20. At 16q22, 7 additional microsatellite markers and 100 SNPs were tested to saturate and delimit a disease interval. Results of this analysis showed high LOD scores for nine of the new markers (Table 2 and Supplementary Table 1). In the ‘affected-only’ analysis, five markers, namely D16S3031, rs725131, D16S3107, D16S3067, and D16S3018, had LOD scores at θ=0 of 2.11, 1.85, 2.41, 2.41, and 1.98, respectively (Supplementary Table 1). In the ‘whole-family’ analysis, LOD scores at θ=0 for rs725131, D16S3107, D16S3067, and D16S3018 increased to 3.36, 3.77, 3.91, and 2.41, respectively, whereas the maximum LOD score for D16S3031 was reduced to 1.46 at θ=0.10 (Table 2).

Table 2 Two-point parametric linkage analysis in the whole HC6 family

To investigate whether linkage to 16q22.1 could result from the use of an inaccurate model, we performed model-free analyses with GENEHUNTER-PLUS (from marker D16S3043 to D16S518) in affected individuals of the HC6 family. A semi-parametric LOD score (ghp-LS) was calculated. The ghp-LS was null at markers D16S3043 and D16S518, thus excluding linkage to these two markers. The ghp-LS was positive and maximal (1.63) from markers D16S3031 to D16S515 (data not shown).

Regional haplotype construction allowed the identification of a region common to all affected members, from D16S3043 to D16S518 (Figure 1). Indeed, recombinational events were observed in patient II-5 for the proximal border (between D16S3043 and D16S3031) and in patient II-1 for the distal border (between rs254770 and D16S518). Nonaffected recombinants allowed reduction to a minimum interval of 8.39 Mb between markers D16S3031 and D16S3018 (Figure 1): recombinational events were found in members II-11 and III-4 for the proximal border (between D16S3031 and rs725131) and in member III-10 for the distal border (between D16S3067 and D16S3018). Subject III-11 (nonaffected) also inherited the disease-associated haplotype and thus most probably illustrates incomplete penetrance of the HCHOLA4 disease. Phenotypically unclassified patients (II-9 and III-14) did not carry the disease haplotype. The regional multipoint LOD score analysis resulted in a maximum value of 3.81 and allowed the identification of a minimum interval of 7.89 Mb between 64 687 100 and 72 127 100 bp at 16q22 (Figure 2).

Figure 2 Multipoint LOD scores of the HCHOLA4 locus. The parameters used for this analysis were dominant inheritance, incomplete penetrance (0.9), and frequency of the disease-causing allele at 0.004% (SimWalk v2.91).

To reduce the linked region, we genotyped polymorphic markers of the HCHOLA4 locus in 18 ADH families that showed exclusion of LDLR, APOB, and PCSK9 genes, as well as the absence of mutation within these genes (Table 3). These 18 families represented a total of 75 meioses with 74 affected and 49 nonaffected subjects. Three families (namely HC32, HC122, and HC218) were noninformative. Nine families (namely HC14, HC35, HC42, HC73, HC126, HC138, HC257, HC374, and HC438) showed exclusion of the locus. The six remaining families (namely HC38, HC49, HC120, HC136, HC205, and HC266) did not exclude linkage to the HCHOLA4 locus but were not large enough to provide significant linkage. Furthermore, none of them showed recombinational events in the disease interval. Indeed, the recombinational event found in family HC136 occurred between markers D16S3031 and D16S3019 (position 64 686 681 bp) (data not shown), and thus was proximal to the regional boundary obtained with subject III-4 from the HC6 family, which occurred between markers D16S3031 and rs725131 (position 65 143 211 bp) (Table 2 and Figure 1).
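The bookkeeping in this paragraph can be checked with a few lines of code. The sketch below only uses the family identifiers and the two physical positions quoted above; the variable and function names are hypothetical.

```python
# Outcome of HCHOLA4 locus genotyping in the 18 additional families.
families_by_outcome = {
    "noninformative": ["HC32", "HC122", "HC218"],
    "excluded": ["HC14", "HC35", "HC42", "HC73", "HC126",
                 "HC138", "HC257", "HC374", "HC438"],
    "not_excluded": ["HC38", "HC49", "HC120", "HC136", "HC205", "HC266"],
}

# Physical positions (bp) quoted in the text for two chromosome 16 markers.
marker_position_bp = {
    "D16S3019": 64_686_681,
    "rs725131": 65_143_211,
}


def gap_bp(marker_a: str, marker_b: str) -> int:
    """Absolute physical distance between two markers, in base pairs."""
    return abs(marker_position_bp[marker_a] - marker_position_bp[marker_b])


total_families = sum(len(ids) for ids in families_by_outcome.values())
print(total_families, "families")
print(gap_bp("D16S3019", "rs725131"), "bp between D16S3019 and rs725131")

# The HC136 event maps at a lower coordinate than rs725131, so it cannot
# narrow the border already set by subject III-4 in the HC6 family.
print(marker_position_bp["D16S3019"] < marker_position_bp["rs725131"])
```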

Table 3 SimWalk multipoint parametric linkage analysis of the HCHOLA4 locus in 18 families

Analysis of regional candidate genes

An inventory of all genes and ORFs present between these two positions was drawn up using data from the UCSC Genome Browser (http://www.genome.ucsc.edu/cgi-bin/hgGateway), the Genatlas database (http://www.dsi.univ-paris5.fr/genatlas/), and the Ensembl Database (http://www.ebi.ac.uk/ensembl/). This inventory showed that the disease interval contains 154 genes (Supplementary Table 2). The first gene investigated was the LCAT gene, as it encodes a key player in cholesterol metabolism. No mutation was identified in affected members of the HC6 family after sequencing of the six exons and intronic flanking regions of the gene. Furthermore, LCAT enzymatic activity was tested in two affected members of the HC6 family (subjects II-4 and III-5) and was not significantly different from that of controls (82.30±5.59 versus 70.00±3.54 nmol cholesterol ester per h/ml of plasma for controls, P=0.07).

Several genes within the locus were good candidates, and we prioritized them for analysis according to the function of the encoded proteins and their tissue expression. We tested the coding sequence of 57 candidate genes by direct sequencing (Supplementary Table 2). No deleterious variation was identified. However, this systematic sequencing allowed the identification of new polymorphisms that were used to confirm regional haplotypes and to attempt to reduce the common interval. We identified one polymorphism in C16orf48 (c.780A>G). Its frequency was estimated at 0.49 by direct sequencing of 60 healthy individuals. All carriers of the affected haplotype in the HC6 family shared this polymorphism. A two-point linkage analysis resulted in a LOD score of 3.52 (θ=0). Interestingly, this gene is located close to marker D16S3067, for which we obtained the highest regional LOD score.

Discussion

In this study, a genome-wide scan performed in a single large French family mapped a fourth ADH gene (different from LDLR, APOB, and PCSK9) at 16q22.1. The highest LOD scores of 3.91 (θ=0) and 3.77 (θ=0) were obtained for markers D16S3067 and D16S3107, respectively, with the ‘whole-family’ analysis, which maximizes the number of investigated meioses. Mapping was confirmed by a multipoint analysis that produced significant LOD scores as well. Furthermore, no other region of the genome showed LOD score values as high as these. Haplotype construction in the HC6 family allowed delineation of a critical region between markers D16S3043 and D16S518 considering only affected members, and between markers D16S3031 and D16S3018 when unaffected individuals were also taken into account. Subject III-11, who carries the disease-associated haplotype but presents a normal phenotype, may illustrate the incomplete penetrance of ADH. Indeed, the hypercholesterolemic phenotype is extremely variable, with well-documented cases of incomplete penetrance probably due to the effect of modifier factors. For example, Sass et al.19 reported a French Canadian family with a 5 kb deletion in the LDLR gene in which several carriers of the mutation presented normal cholesterol values. In the same manner, Hobbs et al.20 described a large Puerto Rican family carrying the p.Ser156Leu (now p.Ser177Leu according to the international nomenclature) mutation of the LDLR gene. In this family, penetrance of the disease could be estimated at 0.68 (13/19), as six subjects were nonmanifesting carriers of the mutation. Likewise, in our first report of the p.Ser127Arg mutation in the PCSK9 gene, one carrier of the mutation displayed normal cholesterol levels, in contrast to the other family members carrying the disease-causing mutation, who were hypercholesterolemic.5
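The penetrance figure quoted for the Puerto Rican family is a simple proportion of manifesting carriers. The sketch below reproduces it and, as an addition that is not part of the cited study, attaches an approximate 95% Wilson score interval to show how imprecise an estimate based on 19 carriers is. Function and variable names are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Approximate Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1.0 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (z * math.sqrt(p_hat * (1.0 - p_hat) / n
                                + z ** 2 / (4 * n ** 2)) / denom)
    return center - half_width, center + half_width


# 19 carriers of the LDLR mutation, 6 of them nonmanifesting.
carriers = 19
manifesting = carriers - 6

penetrance = manifesting / carriers
low, high = wilson_interval(manifesting, carriers)
print(f"penetrance = {penetrance:.2f}, "
      f"approx. 95% interval {low:.2f} to {high:.2f}")
```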

To reduce the linked region, 18 smaller ADH families that showed exclusion of the involvement of LDLR, APOB, and PCSK9 genes were studied. Only six families did not exclude linkage to the HCHOLA4 locus but were not large enough to independently provide significant linkage. Furthermore, no recombinational event was detected that would have allowed the reduction of the disease interval. Recruitment of new family members and new families is now necessary to replicate significant linkage and to define a smaller interval.

None of the recent genome-wide association or linkage studies21, 22, 23, 24, 25, 26 identified any loci fulfilling our two search criteria: (1) affecting LDL (the only lipoprotein elevated in the HCHOLA4 families) and (2) localized within positions 64 687 100–72 127 100 bp at 16q22. Three genome-wide association studies established an association between loci localized within the HCHOLA4 locus at 16q22.1 reported in this study and variation in HDL concentrations.21, 22, 23 These loci are generally considered to reflect association of HDL levels with the LCAT gene. Mutations in the LCAT gene are associated with Fish Eye disease (OMIM no. 136120). Fish Eye disease is characterized by normal serum cholesterol and increased serum triglycerides, VLDL, and LDL triglycerides, whereas HDL cholesterol is reduced. Although Fish Eye disease and ADH present different phenotypes, we explored the LCAT gene as an HCHOLA4 candidate because of its involvement in cholesterol metabolism. Unsurprisingly, no mutation was found in the affected subjects of the HC6 family. Furthermore, LCAT enzymatic activity of two affected family members was not significantly different from that of controls. Thus, involvement of the LCAT gene was excluded.

Our results suggest the existence of a new partner of either LDLR-mediated endocytosis or intracellular trafficking, or even a new and different endocytic pathway. Better knowledge has been gained in recent years on the molecular basis of transmembrane traffic of cargo proteins. The clathrin-dependent pathway, involved in the endocytosis of receptor-bound LDL particles, is now known to involve adaptors (heterotetrameric protein complexes or AP2 complexes) that are responsible for cargo sorting.27 However, other traffic pathways have been defined that are associated either with coated vesicles (COPI and COPII) or with caveolae.28 The protein encoded by the recently cloned ARH gene (OMIM no. 605747) is also noteworthy. Its protein product, known as LDLR adaptor protein 1, contains a phosphotyrosine-binding domain that is also found in adaptor proteins and that could bind the cytoplasmic NPXY motif of the LDLR. Mutations of the ARH gene are found in patients presenting with the autosomal recessive form of familial hypercholesterolemia.29 In this context, several genes at 16q22.1 were also excellent functional candidates. Among these, the following five genes were studied first:

  • The AP1G1 gene encodes the adaptor-related protein complex 1, γ-1 that belongs to the adaptor complexes large subunit family, involved in vesicle transport from the trans-Golgi network to the lysosomes.30

  • The VPS4A gene encodes an AAA-type ATPase belonging to the family of vacuolar sorting proteins originally identified in yeast. Genetic studies led to the identification of a subset of yeast ‘vps’ mutants that accumulate an exaggerated late endosome known as the ‘class E’ compartment. These mutants were found to be defective in multivesicular body formation.31

  • The RRAD gene encodes a ras-related GTPase already implicated in diabetes that might be involved in LDLR endocytosis, as the GTPase Rho has a role in the cellular uptake of LDL by human skin fibroblasts.32

  • The ATP6V0D1 gene (vesicular ATPase) encodes a vacuolar proton-ATPase and is possibly involved in the transport of membrane proteins from late endosomes to lysosomes.33

  • The FLJ12076 gene encodes a homolog of LIN-10 involved in EGF receptor (LET-23) membrane localization in Caenorhabditis elegans.34 LIN-10 protein contains a PDZ domain known to interact with the cytoplasmic tails of membrane proteins.35

No mutation was found in any of these genes or among the 52 other regional genes that we studied. However, this does not fully exclude the possibility that one of them is the causative gene, as only exons and their flanking intronic regions were sequenced.

This study provides evidence of the existence of a new gene involved in the pathogenesis of ADH. The existence of a greater level of genetic heterogeneity is in agreement with recent reports, as the proportion of ADH subjects without an identified mutation ranges from 12 to 72% depending on the study.36 Such a large difference in mutation identification is probably due to different sample sizes, heterogeneous clinical definitions, and screening protocols. Overall, the best estimated proportion of individuals without a mutation in any of the three identified ADH genes is 15.25%.36 This group of new forms of ADH is very probably itself heterogeneous, and the proportion of HCHOLA4-affected individuals may not be greater than that of PCSK9 carriers, which we estimated at 1.5%.36 HCHOLA4-linked ADH may thus be considered a very rare form of ADH. Whatever its function, the HCHOLA4 protein may be a player in the pathway that involves the convertase PCSK9, and thus may also be involved in the intracellular trafficking of the LDLR. If so, its identification would help to elucidate the pathophysiology of both PCSK9-linked and HCHOLA4-linked ADH. Furthermore, as a novel protein implicated in the regulation of cholesterol metabolism, HCHOLA4 might constitute a new target for hypocholesterolemic treatment.

In conclusion, we report the detection and chromosomal localization of the fourth gene involved in ADH at 16q22.1. Recruitment of new families is now required to define a smaller linked region and to focus research on a reduced number of genes. Finally, our results also show the existence of other ADH genes through the identification of nine families unlinked to LDLR, APOB, PCSK9, or to the new HCHOLA4 gene.