Original article
Genetic characteristics of retinitis pigmentosa in 1204 Japanese patients
  1. Yoshito Koyanagi1,2,
  2. Masato Akiyama1,2,
  3. Koji M Nishiguchi3,4,
  4. Yukihide Momozawa5,
  5. Yoichiro Kamatani1,6,
  6. Sadaaki Takata5,
  7. Chihiro Inai5,
  8. Yusuke Iwasaki5,
  9. Mikako Kumano2,
  10. Yusuke Murakami2,
  11. Kazuko Omodaka3,
  12. Toshiaki Abe7,
  13. Shiori Komori8,
  14. Dan Gao9,
  15. Toshiaki Hirakata9,
  16. Kentaro Kurata10,
  17. Katsuhiro Hosono10,
  18. Shinji Ueno8,
  19. Yoshihiro Hotta10,
  20. Akira Murakami9,
  21. Hiroko Terasaki8,
  22. Yuko Wada11,
  23. Toru Nakazawa3,4,
  24. Tatsuro Ishibashi2,
  25. Yasuhiro Ikeda2,
  26. Michiaki Kubo12,
  27. Koh-Hei Sonoda2
  1. Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
  2. Department of Ophthalmology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
  3. Department of Ophthalmology, Tohoku University Graduate School of Medicine, Sendai, Japan
  4. Department of Advanced Ophthalmic Medicine, Tohoku University Graduate School of Medicine, Sendai, Japan
  5. Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
  6. Kyoto-McGill International Collaborative School in Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
  7. Division of Clinical Cell Therapy, United Centers for Advanced Research and Translational Medicine (ART), Tohoku University Graduate School of Medicine, Sendai, Japan
  8. Department of Ophthalmology, Nagoya University Graduate School of Medicine, Nagoya, Japan
  9. Department of Ophthalmology, Juntendo University Graduate School of Medicine, Tokyo, Japan
  10. Department of Ophthalmology, Hamamatsu University School of Medicine, Shizuoka, Japan
  11. Yuko Wada Eye Clinic, Sendai, Japan
  12. RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan

Correspondence to Professor Koh-Hei Sonoda, Department of Ophthalmology, Graduate School of Medical Sciences, Kyushu University, Fukuoka 812-8582, Japan; sonodak{at}med.kyushu-u.ac.jp

Abstract

Background The genetic profile of retinitis pigmentosa (RP) in East Asian populations has not been well characterised. Therefore, we conducted a large-scale sequencing study to investigate the genes and variants causing RP in a Japanese population.

Methods A total of 1209 Japanese patients diagnosed with typical RP were enrolled. We performed deep resequencing of 83 known causative genes of RP using next-generation sequencing. We defined pathogenic variants as those that were putatively deleterious or registered as pathogenic in the Human Gene Mutation Database or ClinVar database and had a minor allele frequency in any ethnic population of ≤0.5% for recessive genes or ≤0.01% for dominant genes as determined using population-based databases.

Results We successfully sequenced 1204 patients with RP and determined 200 pathogenic variants in 38 genes as the cause of RP in 356 patients (29.6%). Variants in six genes (EYS, USH2A, RP1L1, RHO, RP1 and RPGR) caused RP in 65.4% (233/356) of those patients. Among autosomal recessive genes, two known founder variants in EYS [p.(Ser1653fs) and p.(Tyr2935*)] and four East Asian-specific variants [p.(Gly2752Arg) in USH2A, p.(Arg658*) in RP1L1, p.(Gly2186Glu) in EYS and p.(Ile535Asn) in PDE6B] and p.(Cys934Trp) in USH2A were found in ≥10 patients. Among autosomal dominant genes, four pathogenic variants [p.(Pro347Leu) in RHO, p.(Arg872fs) in RP1, p.(Arg41Trp) in CRX and p.(Gly381fs) in PRPF31] were found in ≥4 patients, while these variants were unreported or extremely rare in both East Asian and non-East Asian population-based databases.

Conclusions East Asian-specific variants in causative genes were the major causes of RP in the Japanese population.

  • retinitis pigmentosa
  • genetic epidemiology
  • next-generation sequencing

Introduction

Retinitis pigmentosa (RP, OMIM 268000) is the most common form of hereditary retinal degenerative disease worldwide. Vision loss in RP typically begins with night blindness and visual field constriction due to rod cell dysfunction and death, followed by a loss of daylight and central vision owing to cone cell loss.1–3 The prevalence of RP is approximately 1 in 3000–4000 people, with an estimated total of 2.5 million people affected worldwide.1 4 In Japan, RP is the second leading cause of blindness. In 2014, the Ministry of Health, Labour and Welfare of Japan reported that there were 29 330 patients in Japan.

RP is a group of Mendelian disorders that is usually inherited through autosomal dominant (AD), autosomal recessive (AR) or X linked (XL) modes.1 In recent years, the adoption of high-throughput DNA sequencing technologies, including next-generation sequencing (NGS), has accelerated the identification of novel causative genes and variants of RP, mainly from European populations.1 4 Consequently, a genetic diagnosis could be made in 50%–66% of European patients with RP.1 5–8 However, the genetic profile of RP in East Asian populations has not been well characterised. In a previous study in which targeted resequencing was performed in 317 Japanese patients, the causative genes were determined for a much lower proportion of patients (36.3%) than in a European population.9 Although two founder variants in EYS [p.(Ser1653fs) and p.(Tyr2935*)] and frequent variants in CNGA1 have been reported in the Japanese population,10–12 no other population-specific causative variants have been elucidated. Considering that RP is a rare disease with dozens of causative genes, the small-scale studies that have been conducted so far might have been insufficient to detect other pathogenic variants frequently found in the Japanese population. Therefore, large-scale sequencing of Japanese patients with RP is warranted to reveal the genetic profile of RP in an East Asian population.

Here, we report deep resequencing (mean depth >1000) results of 83 RP causative genes with a high coverage rate (>99.5%) in 1209 patients, who account for 4.1% of all Japanese patients with RP. By taking advantage of the largest number of patients ever analysed, we attempted to uncover the profile of causative RP genes and variants in the Japanese population.

Methods

Study patients

We collected DNA samples from 1399 patients with RP or an allied disease from five facilities, namely Kyushu University Hospital (n=611), Tohoku University Hospital and Yuko Wada Eye Clinic (n=582), Nagoya University Hospital (n=115), Juntendo University Hospital (n=57), and Hamamatsu University Hospital (n=34), in 2002–2017. The clinical diagnosis was based on a patient’s history of night blindness, visual field constriction and/or ring scotoma, and severe rod-cone dysfunction or non-recordable responses on electroretinography, in addition to ophthalmoscopic findings (eg, bone spicule-like pigment clumping in the mid-peripheral and peripheral retina and attenuation of retinal vessels) made by trained ophthalmologists. We excluded patients with syndromic RP, cone-rod or cone dystrophy, Bietti crystalline retinopathy, uveitis, choroideraemia, Leber congenital amaurosis, or retinitis punctata albescens from the study (n=170). To reduce the effect of relatives, we also excluded patients other than the proband of each pedigree (n=20). Finally, we enrolled 1209 typical RP probands.

Gene selection

We selected all the causative genes of non-syndromic RP registered in Retinal Information Network (RetNet) (https://sph.uth.edu/retnet/) as of 19 September 2017 (n=83; online supplementary table S1).

We targeted the coding regions of transcripts annotated by RefSeq13 for 2 genes (KIZ and NR2E3) and the consensus coding sequence (CCDS)14 for 81 other genes, as well as 5 bp into each exon-adjacent intron, to capture variation at splice-donor/acceptor sites, resulting in the design of 1178 target regions encompassing 219 292 bp.

Multiplex PCR-based target sequencing

Multiplex PCR-based target sequencing was performed as described previously.15 Briefly, we designed 2529 primer pairs using the Primer3 software (V.2.3.4).16 Primer information is shown in online supplementary table S2. Multiplex PCR reactions were performed with the GeneAmp PCR System 9700 (Life Technologies, Carlsbad, California). To cover all coding regions of the 83 genes (online supplementary table S1), we conducted multiplex PCR using 15 different primer pools. Next, a second PCR was performed to add dual barcodes to the first PCR products. To eliminate primer dimers, we purified all second PCR products using Agencourt AMPure XP reagent (Beckman Coulter, Brea, California). Then, these products were applied to a bioanalyser (Agilent Technologies, Santa Clara, California) to examine the distribution of product sizes and quantified using the KAPA Library Quantification Kit (Roche, Basel, Switzerland) and an ABI PRISM 7900HT sequence detection system (Thermo Fisher Scientific, Waltham, Massachusetts). Finally, we sequenced the second PCR products using a HiSeq 2500 instrument (Illumina, San Diego, California) and obtained 2×151 bp paired-end reads with dual 8 bp barcode sequences.

Analysis of sequencing data

Sequencing reads were demultiplexed using bcl2fastq2 V.2.18 (Illumina) and then aligned to the human reference sequence (hg19) using Burrows-Wheeler Aligner (V.0.7.12)17 after trimming the primer sequence using cutadapt (V.1.10). Subsequently, we applied them to IndelRealigner and RealignerTargetCreator implemented by the Genome Analysis Toolkit (GATK) (V.3.7).18 The variants of each individual were separately analysed using HaplotypeCaller and UnifiedGenotyper implemented by GATK. Next, we integrated variants detected by either method. The alternative allele frequencies for each variant were calculated using SAMtools (V.1.5),19 and histograms of their frequencies were generated in all individuals for each variant. The variants were selected when they obviously showed three peaks corresponding to three genotypes or two peaks if they were considered to be rare or low-frequency variants by visual inspection. For each individual, the genotypes were determined based on the alternative allele frequency using the following thresholds. When an alternative allele frequency between 0 and 0.15 was observed, the samples were assigned as a homozygote of the reference allele. Likewise, when alternative allele frequencies between 0.25 and 0.75 and between 0.85 and 1 were observed, we assigned the samples as a heterozygote and a homozygote of the alternative alleles, respectively. We regarded variants as ‘missing’ if the frequency was outside of these ranges or the variant position was covered with <20 reads. Variants detected outside of the region covered by each primer pair or having a low call rate (<98%) were excluded. The identified variants were annotated using ANNOVAR20 (V.3.4) and SnpEff21 (V.4.3), with the transcripts registered in CCDS Release 1514 and refGene.13 The covered region was defined as that with bases having ≥20 sequencing reads in the target region. The coverage per base was calculated as the number of covered samples divided by the number of all samples. 
The coverage rate per gene (%) was defined as the average coverage per base of the target region. After excluding five samples with low coverage (<98.0%), we used 1204 samples in which ≥98.0% of the targeted regions were covered for further analysis (online supplementary figure S1).
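
The genotype thresholds described above can be illustrated with a minimal Python sketch. The function and constant names are hypothetical, and whether each boundary value (0.15, 0.25, 0.75, 0.85) is inclusive is an assumption, because the text gives only the ranges. The sketch is not the pipeline used in the study.

```python
from typing import Optional

MIN_DEPTH = 20  # positions covered by fewer reads are treated as missing


def call_genotype(alt_reads: int, total_reads: int) -> Optional[str]:
    """Assign a genotype from the alternative allele frequency.

    Returns "hom_ref", "het", "hom_alt", or None for a missing call.
    Boundary inclusivity is an assumption of this sketch.
    """
    if total_reads < MIN_DEPTH:
        return None
    aaf = alt_reads / total_reads
    if aaf <= 0.15:
        return "hom_ref"
    if 0.25 <= aaf <= 0.75:
        return "het"
    if aaf >= 0.85:
        return "hom_alt"
    # Frequencies in the gaps (0.15-0.25 and 0.75-0.85) are ambiguous.
    return None


def call_rate(calls) -> float:
    """Fraction of samples with a non-missing genotype at one variant."""
    return sum(c is not None for c in calls) / len(calls)
```

For example, a site with 379 alternative-allele reads out of 1276 (an alternative allele frequency of about 0.297) falls in the heterozygous range under these thresholds.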

Definition of a pathogenic variant

First, the identified variants were divided into four classes. Class 1 comprised variants previously reported as pathogenic in the Human Gene Mutation Database (HGMD) Professional (2017.2)22 (variants classified as ‘DM’ and ‘DM?’) or ClinVar database (accessed on 1 October 2017)23 (variants with ClinVar star 2+ and star 1). Class 2 included putatively deleterious variants (nonsense, frameshift or canonical splice site variants) that were not previously reported. Class 3 comprised missense variants, non-frameshift deletions, insertions and non-canonical splice site variants that were not previously reported. All other types of variants were categorised as Class 4. We considered variants belonging to Class 1 and Class 2 to be possibly pathogenic variants. Next, considering that RP is a rare Mendelian disease, we excluded variants having a minor allele frequency (MAF) of >0.5% for recessive genes and >0.01% for dominant genes in any ethnicity in the 1000 Genomes Project,24 the National Heart Lung and Blood Institute (NHLBI) "Grand Opportunity" (GO) Exome Sequencing Project, Exome Aggregation Consortium,25 Human Genetic Variation Database26 and Genome Aggregation Database (gnomAD).25 Subsequently, we performed a visual inspection of pathogenic variant sites using the Integrative Genomics Viewer (IGV).27 If there were ≥2 alternative alleles on the same codon of the pathogenic variants (n=7) or a single-nucleotide variant with a deletion existed in the same variant site (n=1), we manually corrected the annotation of the variants.
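
The four-class scheme and the MAF filter can be summarised in a short Python sketch. The consequence labels, function names and the way database evidence is passed in (a single boolean) are hypothetical simplifications. The sketch also assumes that each gene is labelled either "recessive" or "dominant" for the purpose of the cut-off, and that a variant absent from all population databases has an MAF of 0.

```python
# Consequence labels are hypothetical; real annotation tools use their own terms.
DELETERIOUS = {"nonsense", "frameshift", "canonical_splice"}
CLASS3_TYPES = {"missense", "nonframeshift_indel", "noncanonical_splice"}

# MAF cut-offs from the text: 0.5% for recessive genes, 0.01% for dominant genes.
MAF_CUTOFF = {"recessive": 0.005, "dominant": 0.0001}


def classify(consequence: str, reported_pathogenic: bool) -> int:
    """Return the variant class (1 to 4) described in the text."""
    if reported_pathogenic:          # listed as pathogenic in HGMD or ClinVar
        return 1
    if consequence in DELETERIOUS:   # novel putatively deleterious variant
        return 2
    if consequence in CLASS3_TYPES:  # novel missense, in-frame indel, etc.
        return 3
    return 4


def is_pathogenic(consequence, reported_pathogenic, gene_mode, population_mafs):
    """Class 1 or 2, and at or below the MAF cut-off in every population."""
    if classify(consequence, reported_pathogenic) not in (1, 2):
        return False
    max_maf = max(population_mafs, default=0.0)  # absent everywhere -> 0
    return max_maf <= MAF_CUTOFF[gene_mode]
```

Taking the maximum over all populations mirrors the "in any ethnicity" wording: one population above the cut-off is enough to exclude a variant.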

Criteria for genetic diagnosis

We performed a genetic diagnosis based on the detected pathogenic variants regarded as Class 1 (variants previously reported as pathogenic) and Class 2 (novel putative deleterious variants) after the filtering procedure by MAF in the population databases. The genetic inheritance mode defined in RetNet for each gene was applied. The criteria for genetic diagnosis were as follows: (1) When one pathogenic variant was present in an autosomal dominant RP (ADRP) gene, the variant was considered to be the cause of ADRP. (2) When two pathogenic variants were present in an autosomal recessive RP (ARRP) gene in either a homozygous or a compound heterozygous state, the variants were judged to be the cause of ARRP. In case two homozygous pathogenic variants in the same gene were found in a particular patient, these two pathogenic variants were considered to be in cis, and therefore such combinations of two heterozygous variants were not considered to be in a compound heterozygous state when found in another patient. (3) If one pathogenic variant was present in an XL recessive RP gene, the patient was judged to have XL recessive RP if the patient was male. For the six genes (BEST1, NR2E3, NRL, RHO, RP1 and RPE65) that have been previously reported to be associated with both ADRP and ARRP, we determined the mode of genetic inheritance of each variant based on the published literature. If a gene had a different inheritance mode registered for another inherited retinal dystrophy (IRD) in RetNet, we also confirmed the inheritance mode of each variant based on the published literature. In the genes which were registered as exhibiting both the AR and AD inheritance patterns as RP or any IRD, we regarded the variants as the cause of ARRP only when the novel variant was detected in a homozygous or compound heterozygous state. We did not regard the novel variants in these genes as the cause of ADRP.
Pathogenic variants judged to be the cause of ARRP could not be the cause of ADRP when found heterozygously in patients with RP. If a patient had multiple pathogenic variants leading to a genetic diagnosis, we preferentially selected Class 1 variants and/or the variants that matched the clinical and genetic inheritance mode. If two causative variants belonged to the same class, we selected the variants that were categorised as ‘DM’ in HGMD or ‘star 2+’ in ClinVar. Finally, we regarded patients whose causative genes were determined as ‘solved’ and patients whose causative genes could not be determined as ‘unsolved’.
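
The three core criteria can be sketched in Python as follows. This is a deliberately reduced illustration: the `Call` type and `gene_explains_rp` function are hypothetical, variants are not phased (so the in-cis rule is not applied), and the special handling of genes with both AD and AR inheritance and the tie-breaking between multiple candidate genes are omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    variant: str
    zygosity: str  # "het", "hom" or "hemi"


def gene_explains_rp(mode: str, calls: List[Call], sex: str) -> bool:
    """Apply criteria (1) to (3) to the pathogenic variants found in one gene.

    mode is the inheritance mode of the gene: "AD", "AR" or "XL".
    """
    if not calls:
        return False
    if mode == "AD":
        # (1) one pathogenic variant in an ADRP gene is sufficient
        return True
    if mode == "AR":
        # (2) homozygous, or two heterozygous variants assumed to be in trans
        if any(c.zygosity == "hom" for c in calls):
            return True
        return sum(c.zygosity == "het" for c in calls) >= 2
    if mode == "XL":
        # (3) one pathogenic variant in an XL recessive gene in a male patient
        return sex == "male"
    raise ValueError(f"unknown inheritance mode: {mode}")
```

A patient would then be labelled "solved" if at least one gene satisfies its criterion, and "unsolved" otherwise.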

Variant validation

To evaluate the accuracy of NGS, we performed Sanger sequencing for the 261 pathogenic variants found in solved samples from Kyushu University (n=151). We successfully sequenced all targeted variants. The concordance rate was 99.6% (260/261). The heterozygous variant (c.2T>C in PRCD) in OPH-442 could not be confirmed by Sanger sequencing (online supplementary figure S2), although we confirmed that there are 379 reads (29.7%) containing the C allele of the 1276 sequence reads at this site in OPH-442.

Statistical analysis

To compare the proportion of subjects between groups, we applied the χ² test. We used the one-sided binomial test to evaluate the enrichment of the pathogenic variants and their carriers in the unsolved patients with RP by comparison with the expected values from a population database. For each AR gene, the frequency of carriers with a pathogenic variant of the gene among the unsolved patients was compared with that among the East Asian samples from the 1000 Genomes Project phase 3 (n=504). The allele frequencies in unsolved patients were compared with the highest frequency among East Asian samples from the 1000 Genomes Project phase 3, Exome Aggregation Consortium, Genome Aggregation Database and Human Genetic Variation Database. We used R (V.3.2.0) for general statistical analyses. A p value of <0.05 was considered significant. If multiple tests were performed, a Bonferroni correction was applied to control false positives.
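
As an illustration of the enrichment test, the following Python sketch computes a one-sided (upper-tail) exact binomial p value and a Bonferroni-corrected threshold using only the standard library. The analyses in this study were run in R, so this is an equivalent formulation under stated assumptions rather than the original code, and the function names are hypothetical. Probabilities are summed in log space because the tail probabilities involved can be far smaller than ordinary floating-point products allow.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Natural log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """One-sided p value P(X >= k) for X ~ Binomial(n, p)."""
    if k <= 0:
        return 1.0
    logs = [log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    top = max(logs)
    # log-sum-exp: factor out the largest term to limit underflow
    return min(1.0, math.exp(top) * sum(math.exp(v - top) for v in logs))


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Example: k carriers among n unsolved patients, tested against an expected
# carrier frequency taken from a population database.
p_value = binom_upper_tail(k=213, n=848, p=0.032)
is_enriched = p_value < bonferroni_threshold(0.05, 21)
```

The example values (213 carriers among 848 patients, an expected frequency of 3.2% and 21 tests) are the EYS figures reported in this article. The exact p value depends on the unrounded expected frequency, so the sketch is not expected to reproduce the published value to all digits.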

Prediction of splice site effects

Furthermore, we analysed the previously unreported non-canonical intronic splice site variants at positions ±3–5 bp from exon boundaries using splice site prediction tools. To evaluate the pathogenicity of these variants, we annotated them using databases (HGMD and ClinVar) and allele frequency information from the population databases, and calculated prediction scores using MAXENT-Scan28 and Human Splicing Finder (HSF).29 Variants were considered probably pathogenic if the splice prediction score decreased or increased by >10% for HSF and >30% for MAXENT-Scan. We also evaluated the previously unreported coding variants at positions 1–3 bp from each exon boundary in the same manner.
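
The score-change criterion can be written as a small Python sketch. Two points are assumptions of the sketch, not statements from the methods: the change is expressed relative to the reference-allele score, and both tools must exceed their cut-off for a variant to be flagged. The function names are hypothetical, and the scores are taken as plain numbers already obtained from each tool.

```python
HSF_CUTOFF = 10.0     # percent change in the Human Splicing Finder score
MAXENT_CUTOFF = 30.0  # percent change in the MAXENT-Scan score


def percent_change(ref_score: float, alt_score: float) -> float:
    """Signed change of the variant score relative to the reference score (%).

    The reference score must be non-zero.
    """
    return (alt_score - ref_score) / abs(ref_score) * 100.0


def probably_pathogenic(hsf_ref, hsf_alt, maxent_ref, maxent_alt) -> bool:
    """True when both tools show a decrease or increase above their cut-off."""
    hsf_delta = abs(percent_change(hsf_ref, hsf_alt))
    maxent_delta = abs(percent_change(maxent_ref, maxent_alt))
    return hsf_delta > HSF_CUTOFF and maxent_delta > MAXENT_CUTOFF
```

Using the absolute value of the change covers both a weakened native site (decrease) and a strengthened or newly created site (increase).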

Assessment of other IRD genes

Considering that major causative genes of other IRD could increase the number of genetically solved cases, we additionally performed target resequencing of three important causative genes of other IRD (CEP290, IMPG1 and CHM), including the deep intronic variant c.2991+1655A>G in CEP290 (online supplementary table S3). We designed a total of 134 primer pairs (online supplementary table S4).

LOVD submission

We have uploaded the variants reported in this study to the Leiden Open Variation Database (LOVD).30

Results

Sequencing of 83 RP genes in 1209 patients with RP

We enrolled 1209 RP probands derived from 1209 pedigrees. The clinical characteristics of the patients are summarised in table 1. The modes of inheritance as obtained from clinical records were as follows: AD in 174 (14.4%) patients, AR in 250 (20.7%), XL in 18 (1.5%) and sporadic in 767 (63.4%) (table 1). We successfully sequenced the targeted regions of 83 genes in 1204 patients (99.6%) and obtained 0.79 tera base pairs (Tbp) of sequence data. The average read depth (±SD) was 1185 (±336) per sample. We called variants in the regions with ≥20 read depth, which satisfied 99.7% of the targeted regions. Seventy-seven of 83 genes (92.8%) had ≥99.5% coverage, indicating that most of the genes were well covered (online supplementary table S1). We note that 1106 (91.9%) and 1155 (95.9%) samples were sequenced with 100% coverage in all coding regions of EYS and USH2A, respectively, which were previously reported as major causes of RP in the Japanese population.9

Table 1

Characteristics of the 1209 patients with retinitis pigmentosa

Variants defined as pathogenic

We identified a total of 2445 variants, including 1430 missense variants, 92 nonsense variants, 93 frameshift indels, 26 canonical splice site variants, 57 non-canonical splice site variants, 31 non-frameshift indels and 683 synonymous variants. We initially classified 299 variants as Class 1, 141 variants as Class 2, 1291 variants as Class 3, and 714 variants as Class 4 (online supplementary table S5). We further considered Class 1 variants that were registered as pathogenic in the HGMD or ClinVar databases and Class 2 variants that were putatively deleterious. We excluded 80 of 299 (26.8%) Class 1 variants and 5 of 141 (3.55%) Class 2 variants because these variants had an MAF of >0.5% for recessive genes and >0.01% for dominant genes in any ethnicity in population-based databases (online supplementary table S5). Eleven of the 80 (13.8%) excluded Class 1 variants had an MAF of >5% in any subpopulation in the databases, suggesting that these variants had very low penetrance or were not truly causative of RP. We manually corrected the annotations of eight variants, and the annotations of three variants were reassigned as putatively deleterious (online supplementary figure S3 and online supplementary table S6). Finally, we defined the remaining 350 variants as pathogenic in this study (table 2 and online supplementary table S7). Among the 194 putatively deleterious variants defined as pathogenic, 132 (68.0%) had not previously been reported in HGMD or ClinVar as being causative of RP. One previously reported synonymous variant [p.(Ser344Ser) in PDE6A] affecting splicing was defined as pathogenic.31

Table 2

Variants defined as pathogenic

Genetic diagnosis with the detected pathogenic variants

Among the 83 genes investigated, we detected 350 pathogenic variants in 57 (68.7%) genes, whereas no pathogenic variants were found in 26 (31.3%) genes. Of the 57 genes containing pathogenic variants, 38 genes were determined as the cause of RP in our samples (online supplementary table S1). Conversely, the remaining 19 genes were not considered to be the cause of RP, although pathogenic variants were detected. We determined 200 variants in 38 genes as the cause of RP in 356 patients (29.6%) (online supplementary tables S1 and S8). This proportion was similar to that in a previous report from Japan (36.3%).9 The proportions of solved patients among the clinical inheritance modes based on the family history were as follows: 43.0% AD, 39.0% AR, 23.0% sporadic and 50.0% XL.

Pathogenic variants in six genes (EYS, USH2A, RP1L1, RHO, RP1 and RPGR) explained 65.4% (233/356) of solved patients (figure 1A), indicating that these are major causative genes of RP in the Japanese population. EYS (n=110) accounted for nearly half of individuals who were solved as AR (44.9%), followed by USH2A (n=46, 18.8%), RP1L1 (n=17, 6.9%) and PDE6B (n=15, 6.1%) (figure 1B). Meanwhile, three genes (RHO, RP1 and PRPH2) explained more than half of solved AD patients (n=44/85, 51.8%). In X linked RP (XLRP), pathogenic variants in RPGR [including open reading frame 15 (ORF15) region] were found in the large majority of solved patients (n=23/26, 88.5%).

Figure 1

Proportion of causative genes in 356 solved patients with RP. (A) The proportion of causative genes in 356 solved patients with RP is shown. The blue area represents the proportion of AR genes, the red area represents that of AD genes, and the purple area represents that of X linked genes. (B) The proportion of causative genes based on the inheritance pattern is shown. The full list of genes is shown in online supplementary table S1. AD, autosomal dominant; AR, autosomal recessive; n, number of patients; RP, retinitis pigmentosa.

Next, we searched for variants that were frequently found in the genetically solved patients by assessing the allele counts (AC) of the 237 pathogenic variants identified in these patients. In the AR genes, a known founder variant [p.(Ser1653fs) in EYS] was the most frequent in the solved patients (AC=103), followed by another known founder variant [p.(Tyr2935*) in EYS] (AC=47) (table 3A). In addition to these two known founder variants, we detected five pathogenic variants in AR genes found in ≥10 patients [p.(Gly2752Arg) and p.(Cys934Trp) in USH2A, p.(Arg658*) in RP1L1, p.(Gly2186Glu) in EYS, and p.(Ile535Asn) in PDE6B]. We note that 158 of 245 (64.5%) ARRP patients who were genetically solved carried at least one of these seven variants. Since these variants were not found or were extremely rare (MAF <0.01%) in non-East Asian populations according to the databases, with one exception [the MAF of p.(Cys934Trp) in USH2A was 0.05% in the Ashkenazi Jewish population of gnomAD], most of these variants were deemed to be specific to the East Asian population (table 3A). In AD genes, the most frequent variant was p.(Gly381fs) in PRPF31, which was found in six individuals. Four pathogenic variants in AD genes [p.(Pro347Leu) in RHO, p.(Arg872fs) in RP1, p.(Arg41Trp) in CRX and p.(Gly381fs) in PRPF31] were found in ≥4 patients, while these variants were unreported or extremely rare in both East Asian and non-East Asian populations (table 3B). These findings highlight that East Asian-specific pathogenic variants in causative genes largely influence patients with RP in the Japanese population.

Table 3

(A) Pathogenic variants in AR genes frequently found in 356 solved patients with RP and (B) pathogenic variants in AD genes frequently found in 356 solved patients with RP

Enrichment of pathogenic variants in AR genes among unsolved patients with RP

While we genetically solved 29.6% of all patients, the causative genes were yet to be determined in the remaining 70.4% of patients (n=848) (online supplementary table S8). To obtain a better understanding of the genetic basis of RP, we considered the impact of known RP genes in patients whose causative genes could not be determined. We found carriers of pathogenic variants in AR genes in 361 of 848 (42.6%) unsolved patients with RP.

We additionally performed target resequencing of three important causative genes of other IRD (CEP290, IMPG1 and CHM), including the deep intronic variant c.2991+1655A>G in CEP290, in the 848 unsolved patients (online supplementary table S3). After excluding one sample (N-284) with low coverage (<98.0%), we included 847 patients with ≥98% coverage for further analysis. We identified a total of 90 variants, including 6 variants as Class 1, 5 variants as Class 2, 63 variants as Class 3, and 16 variants as Class 4 (online supplementary table S9). After judging whether the detected variants were putatively deleterious or previously reported and excluding the variants that did not satisfy the MAF cut-off, eight variants were defined as pathogenic (online supplementary table S10). We determined the causative gene in 4 out of 847 unsolved patients (0.5%) (online supplementary table S11).

To pinpoint genes that were enriched with pathogenic variants in the unsolved patients with RP, we statistically compared the frequencies of carriers with pathogenic variants in each AR gene with those in the 1000 Genomes Project. The frequencies in EYS (n=213, 25.1%), USH2A (n=74, 8.7%) and CRB1 (n=21, 2.5%) were significantly higher in the unsolved patients [p=1.20×10⁻¹²², p=1.11×10⁻⁴ and p=1.66×10⁻³, respectively; α<2.38×10⁻³ (=0.05/21)] (figure 2 and online supplementary table S12).

Figure 2

Frequency of carriers with pathogenic variants in AR genes among 848 unsolved patients with RP. The frequency of carriers with pathogenic variants in AR genes among unsolved patients with RP and EAS samples from 1KG phase 3 (n=504) is shown. The enrichment was evaluated by the one-sided binomial test. Asterisks denote a significant enrichment of carriers in unsolved patients with RP evaluated after Bonferroni correction (p<2.38×10⁻³). The full list is shown in online supplementary table S12. 1KG, 1000 Genomes Project; AR, autosomal recessive; EAS, East Asian; RP, retinitis pigmentosa.

Next, to investigate the variants influencing genetically unsolved patients with RP, we also assessed whether the identified pathogenic variants were enriched in the unsolved patients as compared with the general population. Since the allele frequencies of 68 of 144 (47.2%) variants were not available in the population databases, we could test 76 pathogenic variants in the AR genes. We detected four variants as being significantly enriched in unsolved patients with RP, including two well-known founder variants in EYS [p.(Ser1653fs) and p.(Tyr2935*)], p.(Gly2186Glu) in EYS and one variant in MAK [p.(Ala114fs)], which was previously reported as causing RP in only one Japanese patient9 [p=6.91×10⁻⁸⁶, p=2.14×10⁻¹², p=2.06×10⁻¹⁷ and p=1.95×10⁻⁴, respectively; α<6.58×10⁻⁴ (=0.05/76)] (table 4 and online supplementary table S13).

Table 4

Excess of pathogenic variants in AR genes in 848 unsolved patients with RP

To assess the possibility of variants affecting splicing in unsolved patients with one pathogenic variant in a known AR gene, we further analysed the previously unreported non-canonical intronic splice site variants in exon boundaries ±3–5 bp (n=52) using splice site prediction tools (online supplementary table S14). However, no variant satisfied the criteria (online supplementary table S14). We also evaluated the previously unreported coding variants at the positions 1–3 bp from each exon boundary (n=48) in the same manner. As a result, one heterozygous missense variant (c.820G>C in HGSNAT) in OPH-633 was regarded as probably pathogenic (online supplementary table S14). However, we could not genetically solve this case because OPH-633 had no other pathogenic variant (online supplementary table S8). Moreover, we re-evaluated low-coverage regions (<20 read depth) in unsolved patients with one pathogenic variant in a known AR gene. However, no pathogenic variants were detected in these regions.

Discussion

In the present study, we performed deep target resequencing of the 83 known causative genes of RP in 1204 probands and determined 38 causative genes in 356 patients (29.6%). Pathogenic variants in six genes (EYS, USH2A, RP1L1, RHO, RP1 and RPGR) explained 65.4% of solved patients. We found that East Asian-specific pathogenic variants in causative genes were the major causes of RP in the Japanese population. Moreover, 42.6% of unsolved patients had a pathogenic variant in a known AR gene, and we identified pathogenic variants in three genes (EYS, USH2A and CRB1) and four variants [p.(Ser1653fs), p.(Tyr2935*) and p.(Gly2186Glu) in EYS and p.(Ala114fs) in MAK] as being enriched in unsolved patients with RP.

Our study clarified the major causative genes of RP in the Japanese population. In a previous study in which the targeted resequencing of 365 genes was performed in 317 Japanese patients, 28 genes were identified as causative of RP.9 In the present study, 38 of 83 genes were used for the genetic diagnosis of RP. Of these 38 genes, 25 overlapped with the 28 genes used in the previous study.9 Therefore, we considered that these 38 genes were the major causative genes of RP in the Japanese population. Furthermore, among the 45 genes that were not used for genetic diagnosis in our study, 42 were also not used in the same previous study, indicating that more than half of those genes would have a small influence on RP in the Japanese population.

Our findings suggest that population-specific frequent variants influence the difference in the proportion of causative genes across populations. In addition to the two known founder variants in EYS,10 11 we successfully identified five variants in four major AR genes (EYS, USH2A, RP1L1 and PDE6B) found in ≥10 solved patients. Most of these variants were unreported or extremely rare in non-East Asian populations but were found in an East Asian population (MAF >0.1%). Moreover, we could not find any RP-causing founder variants in the targeted genes that were reported from non-East Asian populations in our samples (online supplementary table S15). Considering that most of the major causative AR genes found in the Japanese population frequently contained these East Asian-specific variants, our results suggest that differences in the major causative genes among populations result from population-specific frequent variants.

Supplemental material

In this study, 848 patients (70.4%) remained genetically unsolved (online supplementary table S8). To further explain the genetic cause of these cases, we performed several additional analyses. However, analyses of variants affecting splicing and variants in low-coverage regions resulted in no additional solved cases, and additional target resequencing of the three important causative genes of other IRDs resulted in four additional solved cases, which accounted for only 0.5% of unsolved cases. Therefore, these variants and genes did not substantially increase the number of genetically solved cases.
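The proportions quoted above follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check; the variable names are illustrative and not taken from the study.

```python
# Counts reported in the article.
total_probands = 1204
solved = 356
additional_solved = 4  # gained by resequencing three further IRD genes

# Patients without a genetic diagnosis after the main analysis.
unsolved = total_probands - solved


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


solved_pct = percent(solved, total_probands)
unsolved_pct = percent(unsolved, total_probands)
additional_pct = percent(additional_solved, unsolved)

print(f"unsolved patients: {unsolved}")
print(f"solved: {solved_pct}%, unsolved: {unsolved_pct}%")
print(f"additional solved among unsolved: {additional_pct}%")
```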

We found that 42.6% of the unsolved patients had a pathogenic variant in a known AR gene. In particular, a significant enrichment of pathogenic variants in EYS was observed. More than a quarter (25.1%) of unsolved patients with RP had a pathogenic variant in EYS, compared with 3.2% of East Asian samples from the 1000 Genomes Project. To confirm that this finding had not resulted from insufficient sequencing, we compared the frequencies of carriers with pathogenic variants in EYS between the unsolved patients with 100% coverage of EYS (n=781) and those with <100% coverage (n=67). Those frequencies did not significantly differ (p=0.66), indicating that this result was not due to a technical problem. Similarly, in addition to the two major causative genes (EYS and USH2A), CRB1 and MAK were also associated with unsolved patients with RP. Our findings suggest that these genes and variants are potential genetic causes of RP in the Japanese population.
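As a rough illustration of the size of the EYS enrichment, the comparison can be sketched as a 2×2 table with an odds ratio and a one-sided Fisher's exact test. Several inputs here are assumptions, not values from the article: the carrier counts are back-calculated from the reported percentages, the control group is taken to be the 504 East Asian samples of 1000 Genomes Project Phase 3, and Fisher's exact test is used only as one common choice for this kind of comparison.

```python
from math import comb

# Assumed group sizes: 848 unsolved patients (reported) and 504 East Asian
# control samples (assumption: 1000 Genomes Project Phase 3).
n_cases = 848
n_controls = 504

# Carrier counts back-calculated from the reported percentages (assumption).
cases_with = round(0.251 * n_cases)
controls_with = round(0.032 * n_controls)
cases_without = n_cases - cases_with
controls_without = n_controls - controls_with


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the table [[a, b], [c, d]]."""
    row1 = a + b   # size of the first group (cases)
    col1 = a + c   # total number of carriers
    n = a + b + c + d
    upper = min(row1, col1)
    numer = sum(comb(col1, k) * comb(n - col1, row1 - k)
                for k in range(a, upper + 1))
    return numer / comb(n, row1)


odds_ratio = (cases_with * controls_without) / (cases_without * controls_with)
p_value = fisher_one_sided(cases_with, cases_without,
                           controls_with, controls_without)

print(f"carriers: {cases_with}/{n_cases} vs {controls_with}/{n_controls}")
print(f"odds ratio: {odds_ratio:.2f}, one-sided p: {p_value:.3g}")
```

Because the counts are reconstructed from rounded percentages, the result indicates the order of magnitude only and is not a reproduction of the study's statistics.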

We considered four possible explanations for the significant enrichment of the pathogenic variants in genetically unsolved patients. First, we may have overlooked true pathogenic variants found by our sequencing because we defined pathogenic variants based only on the databases (HGMD and ClinVar) and putatively deleterious variants. Missense variants that have not been reported to cause RP were not defined as pathogenic. This would lead to an underestimation of the number of solved patients. Second, in genetically unsolved patients with RP with two or more pathogenic variants in different genes, the disease could occur via an oligogenic inheritance pattern. Several variants, including a combination of variants in PRPH2 and ROM1, have been reported as causes of oligogenic RP.32 However, we could not find any patients with the previously reported combinations of variants. Third, the other pathogenic variant might be present in the same gene as a compound heterozygote that is difficult to identify via short-read sequencing; for example, large structural variations (SVs) can be the cause of RP, as suggested previously.6 33 Fourth, certain variants in known AR genes can be the causes of ADRP, considering that several genes have been reported to be involved in both ARRP and ADRP. Taken together, future studies including familial analysis and the investigation of SVs should be performed to reveal novel causative variants of RP.

RP1L1 has been registered as an AR gene causing typical RP in RetNet. Additionally, this gene has been observed to show an AD inheritance pattern in occult macular dystrophy (OMD), an inherited macular dystrophy characterised by progressive bilateral vision loss despite normal fundus appearance.34 It was unlikely that patients with OMD were included as the subjects of our study because the phenotype of OMD was clearly different from that of typical RP. Among the 43 patients carrying pathogenic variants of RP1L1 in this study, we found 18 unsolved cases with one pathogenic variant in RP1L1. We examined the possibility that the pathogenic variants in RP1L1 might be the causes of ADRP in these unsolved RP cases. In the 18 unsolved cases, we found a total of eight pathogenic variants (online supplementary tables S8 and S16). After the filtering procedure with a MAF of >0.01% that was required when these pathogenic variants were assumed to show the AD inheritance pattern, three variants [p.(Gln1321*), p.(Glu1308*) and p.(Gln9*) in RP1L1] remained. The four patients carrying these three variants are shown in online supplementary table S17. We reviewed the family history and the clinical inheritance modes based on the clinical records of these four patients. As a result, the clinical inheritance modes of J-17–12 and OPH-725 were regarded as sporadic, whereas the clinical inheritance modes of OPH-285 and OPH-634, who were carriers of p.(Gln9*) in RP1L1, were considered to be AD based on a clinical interview. The latter two cases with retinal degeneration were clearly not patients with OMD. Although the enrichment of pathogenic variants in RP1L1 among unsolved patients with RP was not significant (online supplementary tables S12 and S13), this result suggested that there remained a possibility of p.(Gln9*) in RP1L1 showing an AD inheritance pattern in typical RP. Further investigations are required to reach a conclusion.

There were two major strengths in the present study. One was the large number of samples analysed, which accounted for 4.1% of all patients with RP in Japan. This advantage enabled us to detect the causative genes and variants that are frequently found in Japanese patients with RP and identify pathogenic variants enriched in the unsolved patients with statistical evidence. The other strength was the high coverage of the target regions resulting from deep sequencing (mean depth >1000). Indeed, 99.6% of the targeted regions satisfied the criterion of ≥20 read depth, which enabled us to perform a comprehensive evaluation of known causative genes.

There are several limitations to this study. First, we only considered coding regions in the analyses presented here. It is possible that pathogenic variants in non-coding regions or transcripts that are expressed specifically in the retina can cause RP.35 Future integrative analyses including whole-genome sequencing and transcriptomic analysis of the human retina will address this point. Second, the short-read sequencing approach and analytical methods employed in the current study could not fully detect large SVs, as discussed above. In addition to the development and improvement of methods to detect SVs from whole-genome sequencing data, long-read sequencing and copy number analysis are warranted to further clarify the genetic basis of RP. Third, we excluded patients other than the proband of each pedigree at each facility, and there was no evidence of relatives as far as we examined the acquired clinical information. However, relatives across facilities could not be evaluated because the genetic and clinical information was anonymised and there is an ethical restriction that individuals cannot be identified. Therefore, the possibility that relatives were enrolled at different facilities cannot be excluded. Fourth, we did not evaluate genotype–phenotype correlations in this study. Further collaborative studies are required to collect more samples and clinical data in order to investigate genotype–phenotype correlations of each causative gene or variant. Fifth, the lack of pedigree analysis is also a limitation of this study. We defined the pathogenicity of variants based on the registration information of databases. Therefore, pedigree analysis is required to validate novel putative functionally deleterious variants. In addition, we checked the cis/trans states of pathogenic variants by visual inspection in IGV as far as possible. However, we used a short-read NGS approach, and it was impossible to confirm the cis/trans configuration of two variants separated by more than the read length (151 bp). The confirmation of cis/trans also requires pedigree analysis.
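The read-length constraint on cis/trans confirmation can be made concrete with a minimal sketch. It assumes two single-nucleotide positions on the same chromosome and considers a single 151 bp read only; the function name and the coordinates are hypothetical and are not part of the study's pipeline.

```python
READ_LENGTH_BP = 151  # read length stated in the article


def spannable_by_one_read(pos_a, pos_b, read_length=READ_LENGTH_BP):
    """Return True if one read of read_length bases can cover both positions.

    Positions are 1-based coordinates on the same chromosome. The span is
    counted inclusively, so two positions 150 bp apart need a 151 bp read.
    """
    span = abs(pos_a - pos_b) + 1
    return span <= read_length


# Hypothetical coordinates for illustration only.
print(spannable_by_one_read(1000, 1150))  # span of 151 bases
print(spannable_by_one_read(1000, 1151))  # span of 152 bases
```

Variant pairs that fail this check cannot be phased from a single read, which is why the text points to pedigree analysis for such cases.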

In conclusion, we performed large-scale high-coverage targeted resequencing and revealed the genetic profile of RP in the Japanese population. Our results indicated that East Asian-specific variants in causative genes were the major causes of RP in Japanese patients. Moreover, we identified enriched pathogenic variants in known AR genes as the potential causes of RP in the genetically unsolved patients. Our findings provide insights into the differences in the genetic background of RP across populations and illuminate the utility of studying diverse populations to obtain a better understanding of the genetic aetiology of RP.

Acknowledgments

We acknowledge the staff of the Laboratory for Statistical Analysis and Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences. We would like to show our appreciation to J Funatsu, T Tachibana, K Fujiwara and S Nakatake for collecting samples and clinical data in Kyushu University Hospital.

References

Footnotes

  • Contributors YKo, MA, KMN, YKa, YIk, MKub and K-HS designed the study. YKo, KMN, MKum, YMu, KO, TA, SK, DG, TH, KK, KH, SU, YH, AM, HT, YW, TN, TI, YIk and K-HS collected the samples and clinical data. YKo, MA, KMN, YMo, ST, CI, YIw and MKub performed the experiments and analysed the data. YKo, MA, KMN, YMo, YKa, MKub and K-HS contributed to the manuscript preparation and editing. All authors contributed to study conception and design, data interpretation, and approved the final manuscript.

  • Funding This work was supported by Japan Agency for Medical Research and Development grants 17ek0109213h0002 (to KMN) and JP17lk1403004 (to TA), and Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (grant 17K111447, to YH).

  • Competing interests SU reports personal fees from Nidek, Chuo Sangio, Santen and Alcon outside the submitted work. AM reports grants from Pfizer Japan, Abbott Japan, Otsuka Pharmaceutical, Eisai, Alcon Japan, Novartis Pharma KK, SEED and Santen Pharmaceutical, and personal fees from Lion Japan outside the submitted work. In addition, AM has a patent hydrogel contact lens for gene treatment licensed. HT reports personal fees from Rohto Pharmaceutical, Takeda Pharmaceutical, Mitsubishi Tanabe, AbbVie, Daiichi Sankyo, Chuo Sangio, Sanofi, Nihon Tenganyaku, Alcon Pharma, Bayer and Graybug Vision, grants and personal fees from Nidek, Otsuka, Pfizer, Santen, Alcon, Novartis, Senju, Kowa and Wakamoto, grants from HOYA and Allergan Japan, and personal fees and non-financial support from Carl Zeiss Meditec outside the submitted work. In addition, HT has a patent pending with Nidek.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by the ethics committees of all the collaborating hospitals and was conducted in accordance with the tenets of the Declaration of Helsinki on biomedical research involving human subjects. Written informed consent was obtained from all subjects prior to participation in the study.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information.

  • Author note The Japanese Retinitis Pigmentosa Registry Project (https://secure2.visitors.jp/retinal_pigment/login/) will host the data from this research. At the time of publishing (June 2019), the website is under construction.