Background The decrease in sperm motility has a potent influence on fertilisation. Sperm motility, represented as the percentage of motile sperm in ejaculated sperms, is influenced by lifestyle habits or environmental factors and by inherited factors. However, genetic factors contributing to individual differences in sperm motility remain unclear. To identify genetic factors that influence human sperm motility, we performed a genome-wide association study (GWAS) of sperm motility.
Methods A two-stage GWAS was conducted using 811 Japanese men in a discovery stage, followed by a replication study using an additional 779 Japanese men.
Results In the two-staged GWAS, a single nucleotide polymorphism rs3791686 in the intron of gene for erb-b2 receptor tyrosine kinase 4 (ERBB4) on chromosome 2q34 was identified as a novel locus for sperm motility, as evident from the discovery and replication results using meta-analysis (β=−4.01, combined P=5.40×10−9).
Conclusions Together with the previous evidence that Sertoli cell-specific Erbb4-knockout mice display an impaired ability to produce motile sperm, this finding provides the first genetic evidence for further investigation of the genome-wide significant association at the ERBB4 locus in larger studies across diverse human populations.
- reproductive medicine
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Approximately 10% of couples experience infertility, and half of these cases are related to male factors.1 2 Male factor infertility may arise from various medical conditions such as spermatogenic failure, varicocele, obstructive azoospermia and congenital absence of the vas deferens. Sperm motility, represented as the percentage of motile sperm in the ejaculate, has a large influence on fertilisation ability. Therefore, several studies have been conducted to understand the factors that affect sperm motility.
Oxidative stress induced by alcohol consumption, cigarette smoking, obesity, diabetes, physical exercise, psychological stress, ageing, infection and environmental factors (pollutants such as nitric oxide, lead and electromagnetic waves from cell phones) is one of the major factors responsible for the reduction in sperm motility.3 4 Genetic background has also been shown to be associated with sperm motility. The gr/gr subdeletion in the azoospermia factor c region of the Y chromosome was shown to be strongly associated with decreased sperm motility in men from the Japanese population.5 Furthermore, polymorphisms in genes encoding cytochrome P450 family 19 subfamily A polypeptide 1,6 androgen receptor,7 follicle-stimulating hormone receptor,8 steroid 5α-reductase9 and oestrogen receptor10 11 were associated with sperm motility. These genes are related to reproductive hormones and contribute to testicular development and spermatogenesis; they were examined as candidates on the basis of their known functions. However, the genetic determinants of human sperm motility remain poorly understood.
A genome-wide association study (GWAS) is an approach to find genetic variations associated with diseases or quantitative traits. To date, four GWASs associated with male infertility have been reported. These covered non-obstructive azoospermia or oligozoospermia in Caucasian or Chinese men12–15 and family size or birth rate in Hutterite men in the USA.16 In the latter, 9 of the 41 single nucleotide polymorphisms (SNPs) were significantly correlated with the family size or birth rate and found to be associated with reduced sperm quantity and/or function in the subsequent validation study using 123 ethnically diverse men. However, there are no reports on a GWAS of sperm motility. Here, we sought to clarify the genetic determinants of human sperm quality by conducting a GWAS of sperm motility in 811 Japanese men, with a subsequent validation of the association in an additional 779 Japanese men.
We performed a two-staged genetic association study. The discovery stage included 816 men (20.7±1.7 years old, mean±SD) from the young Japanese population. They were recruited from among university students at three study centres based in departments of urology at university hospitals in Japan (Kawasaki, Kanazawa and Nagasaki), as previously reported.17 The inclusion criteria were that the man was 18–24 years old and that both he and his mother were born in Japan. The replication stage included 779 men (31.2±4.8 years old, mean±SD) of proven fertility recruited from the partners of pregnant women who attended obstetric clinics in four cities in Japan (Sapporo, Kanazawa, Osaka and Fukuoka).18 The inclusion criteria for these men were as follows: age 20–45 years, and both he and his mother had to have been born in and live in Japan. In addition, the current pregnancy of the female partner had to have been achieved by normal sexual relations and not as a result of fertility treatment. In both cohorts, we excluded samples with a complete deletion of the AZF region. The characteristics of the subjects in the two stages are summarised in table 1. No difference in sperm motility was observed between the two cohorts. Some of the subjects in this study have been described in previous reports.19–26
Clinical trait measurements
The measurement of clinical traits in these subjects has been described in previous reports.17 18 Briefly, age, body weight, height and ejaculation abstinence period were self-reported. Body mass index (BMI) (kg/m2) was calculated from the body weight and height. Semen samples were obtained once by masturbation after sexual abstinence for at least 48 hours and ejaculated into clean, wide-necked, sterile, non-toxic collection containers. The samples were protected from extremes of temperature and liquefied at 37°C prior to their examination. At each semen collection site, sperm motility was assessed from 10 µL of well-mixed semen, which was placed on a clean glass slide, covered and examined at a total magnification of 400× at 37°C. Sperm motility (%) was calculated as ([number of motile sperm in the ejaculate]/[number of sperm in the ejaculate])×100. The motility assessment was repeated on a second 10 µL aliquot of semen, and the average of the two values was calculated. Sperm were assessed using the WHO motility classes A, B, C and D,27 wherein sperm from classes A and B were considered motile. Technicians from each centre were initially trained by one technician from St. Marianna University in Kawasaki, and these clinical trait measurements were performed in the same way in both cohorts.
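The motility calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the per-class counts are hypothetical and are not data or code from the study.

```python
def motility_percent(class_counts):
    """Percentage of motile sperm, counting WHO classes A and B as motile."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("no sperm counted")
    motile = class_counts.get("A", 0) + class_counts.get("B", 0)
    return 100.0 * motile / total


def sample_motility(first_aliquot, second_aliquot):
    """Average the motility of two aliquots taken from the same semen sample."""
    return (motility_percent(first_aliquot) + motility_percent(second_aliquot)) / 2.0


# Hypothetical counts per WHO motility class, for illustration only.
aliquot_1 = {"A": 60, "B": 50, "C": 20, "D": 70}
aliquot_2 = {"A": 55, "B": 45, "C": 30, "D": 70}
print(sample_motility(aliquot_1, aliquot_2))
```

Averaging the two per-aliquot percentages, rather than pooling the counts, mirrors the wording of the protocol ("the average value calculated").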
Genotyping, quality control and imputation
Genomic DNA was extracted from the peripheral blood samples of subjects using a QIAamp DNA blood kit (Qiagen, Tokyo, Japan). In the discovery stage, 816 men were genotyped using the Illumina HumanCore V.1.0 DNA Analysis Kit (Illumina, Tokyo, Japan) following the manufacturer’s instructions. We genotyped 298 930 SNPs, and the quality control of genotyped SNPs and samples was conducted using the PLINK V.1.07 software package (http://pngu.mgh.harvard.edu/~purcell/plink/).28 Of the 816 samples, four were excluded because they were duplicates or showed familial relationships (PI_HAT>0.25), as revealed by pairwise identity-by-state/identity-by-descent estimation. Furthermore, we excluded one sample that was identified as a genetic outlier by a principal component analysis-based method using the genotype data of the HapMap CHB and JPT as the internal controls (online supplementary figure S1). Finally, 811 samples were included in the genome-wide association analysis.
For genotype imputation analysis, only non-redundant polymorphic SNPs with reference SNP (rs) IDs fulfilling the following criteria were included: (1) per-SNP call rate ≥0.98 and (2) P value for Hardy-Weinberg equilibrium (HWE) ≥10−6 in our sample set. Genotype data were flipped to forward strand with conform-gt, which is the utility program of BEAGLE V.4.1,29 30 using genotype data for Asian samples (JPT and CHB) of the 1000 Genomes Project31 32 as a reference panel. Imputation was performed with BEAGLE V.4.1, using the 1000 Genomes Project Phase 3 V.5 as a reference panel. We excluded SNPs with R2 <0.8 and all indels from the imputed genotype data to obtain genotypes for 3 901 256 SNPs, which were used for subsequent association analyses.
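The pre-imputation SNP filter (polymorphic, call rate ≥0.98, HWE P≥10−6) can be illustrated as follows. This is a minimal sketch, not the PLINK implementation: it swaps in a simple 1-df chi-square goodness-of-fit test for HWE, and the function names and genotype counts are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(call_rate, n_aa, n_ab, n_bb, min_call_rate=0.98, min_hwe_p=1e-6):
    """Apply the three per-SNP criteria: polymorphic, call rate, HWE."""
    n = n_aa + n_ab + n_bb
    allele_a = 2 * n_aa + n_ab
    polymorphic = 0 < allele_a < 2 * n
    return (polymorphic
            and call_rate >= min_call_rate
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p)


# Hypothetical genotype counts, for illustration only.
print(passes_snp_qc(0.99, 200, 400, 211))
print(passes_snp_qc(0.99, 400, 0, 411))
```

The chi-square test is used here only because it needs nothing beyond the standard library; an exact HWE test behaves better when genotype counts are small.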
In the replication stage, SNP rs3791686 was genotyped using a TaqMan probe (C_ 27517144_10; Applied Biosystems, Tokyo, Japan) with the ABI 7900HT real-time PCR system (Applied Biosystems). To confirm the accuracy of the imputed genotypes, rs3791686 was also directly genotyped in 100 randomly selected samples from the discovery subjects. The concordance between the directly genotyped and imputed results was 100%. The genotypes of rs3791686 were in HWE in the total of 1590 samples.
In the discovery and replication stages, associations between each SNP and sperm motility were assessed using multiple linear regression under an additive genetic model, with adjustments for age, BMI, ejaculation abstinence period and time from masturbation to semen evaluation, using PLINK or the R V.3.1.2 software package (http://www.R-project.org/). Because the raw values were closer to a normal distribution than the transformed values we examined, we used the raw values for analysis in the present study. We set a suggestive threshold of P<1×10−6 in the discovery stage. The results were combined in a meta-analysis using the meta package for R. The extent of heterogeneity among studies was quantified by the I2 statistic33 and statistically assessed by Cochran’s Q test. No heterogeneity was observed in this study, as judged by an I2 statistic <50% or a P value >0.1; hence, a fixed-effect model using the inverse variance method was used. Genome-wide statistical significance was set at P<5×10−8.
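To make the additive model concrete, the sketch below regresses a quantitative trait on the allele count (0, 1 or 2) plus covariates and reports the per-allele effect β with its standard error. It is a pure-Python illustration, not the PLINK or R code used in the study; the function names and the toy data (a single covariate, eight subjects) are assumptions made for the example.

```python
import math


def fit_ols(design, y):
    """Ordinary least squares via the normal equations.

    Returns (coefficients, standard_errors). X'X is inverted by Gauss-Jordan
    elimination with partial pivoting, which is adequate for a few covariates.
    """
    n, k = len(design), len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    # Augment X'X with the identity matrix so elimination leaves its inverse.
    aug = [xtx[i] + [float(i == j) for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    inverse = [row[k:] for row in aug]
    coef = [sum(inverse[i][j] * xty[j] for j in range(k)) for i in range(k)]
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in design]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sigma2 = rss / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * inverse[i][i]) for i in range(k)]
    return coef, se


def additive_snp_effect(genotypes, covariates, phenotype):
    """Per-allele effect (beta) and its standard error for one SNP."""
    design = [[1.0, float(g)] + [float(c) for c in cov]
              for g, cov in zip(genotypes, covariates)]
    coef, se = fit_ols(design, phenotype)
    return coef[1], se[1]  # index 1 is the genotype column


# Hypothetical toy data: allele counts, one covariate (age) and motility (%).
genotypes = [0, 1, 2, 0, 1, 2, 1, 0]
covariates = [[20], [21], [22], [23], [24], [19], [22], [21]]
motility = [71.0, 65.5, 63.0, 70.5, 69.0, 60.5, 68.0, 69.5]
beta, se = additive_snp_effect(genotypes, covariates, motility)
print(beta, se)
```

In the study the covariate list would hold age, BMI, abstinence period and time to semen evaluation; the design matrix simply gains one column per covariate.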
The Manhattan and quantile–quantile plots were generated using the qqman package for R, while a regional plot was created by LocusZoom using the 1000 Genomes Project Asian (ASN) data (November 2014).34 With the exception of annotations, linkage disequilibrium (LD) was calculated using PLINK software V.1.07 with the genotype imputation data. Significant expression quantitative trait loci (eQTLs) for each SNP were searched in the GTEx Portal database (http://www.gtexportal.org/home/).35 HaploReg V.4.1 (http://archive.broadinstitute.org/mammals/haploreg/haploreg.php) was used for functional annotation analysis of variants,36 and RegulomeDB (http://regulome.stanford.edu/index) was employed to identify potential regulatory functions.37 Pathway analysis of the GWAS dataset for sperm motility was conducted using iGSEA4GWAS V.2 (http://gesea4gwas-v2.psycho.ac.cn/).38 SNPs were mapped to the nearest genes within 20 kb upstream or downstream, and the KEGG (http://www.genome.jp/kegg/), BioCarta (http://www.biocarta.com) and GO (http://www.geneontology.org) gene set/pathway databases were searched. A false discovery rate (FDR) <0.05 was considered statistically significant in this analysis.
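As an illustration of the pairwise LD measure, the sketch below computes r2 as the squared Pearson correlation between the allele counts (0, 1 or 2) of two SNPs. This genotype-based estimate is an assumption chosen for simplicity; it can differ from haplotype-based r2, and the function name and genotype vectors are hypothetical.

```python
def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two vectors of allele counts.

    Subjects with a missing genotype (None) at either SNP are skipped.
    """
    pairs = [(a, b) for a, b in zip(geno_a, geno_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete genotype pairs")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Hypothetical genotypes for two nearby SNPs, for illustration only.
snp_1 = [0, 1, 2, 0, 1, 2, 1, None]
snp_2 = [0, 1, 2, 0, 1, 1, 1, 0]
print(ld_r2(snp_1, snp_2))
```

Because r2 is a squared correlation, it does not depend on which allele is counted at either SNP.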
We conducted a two-staged GWAS to identify genetic loci associated with human sperm motility. We enrolled 816 Japanese men from the university students for the discovery stage and 779 Japanese men from the partners of pregnant women for the replication stage of GWAS. After quality control of samples using initially genotyped 298 930 SNP data in the discovery stage, 811 Japanese men were selected. We performed imputation analysis, which provided typed and imputed genotypes for 3 901 256 SNPs that passed quality control. Finally, 811 samples and 3 901 256 SNPs were included for the discovery stage. The characteristics of subjects are presented in table 1.
We performed a GWAS of sperm motility with a total of 3 901 256 SNPs in 811 men in the discovery stage. Manhattan and quantile–quantile plots of the GWAS are presented in figure 1 and online supplementary figure S2, respectively. The genomic inflation factor (λ) was 1.0, indicating that inflation of false-positive associations was unlikely. The top 50 GWAS candidate SNPs for sperm motility are presented in online supplementary table S1. No SNP reached the genome-wide significance level (P<5×10−8) in the discovery stage. With a suggestive significance threshold of P<1×10−6, two SNPs on 2q34, rs3791686 and rs1836719, were suggestively associated with sperm motility (β=−4.25, discovery P=4.47×10−7; β=−4.22, discovery P=5.29×10−7, respectively) (online supplementary table S1). These two SNPs are in strong LD (r2=0.99); thus, we selected only the most significant SNP (rs3791686) for the subsequent replication genotyping.
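The genomic inflation factor is commonly computed as the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below follows that common definition, which is an assumption about how λ was obtained here; the function name and the evenly spaced P values are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Lambda: median observed chi-square over the null median (1 df).

    Each two-sided P value (0 < P <= 1) is converted back to a chi-square
    statistic as the square of the matching standard normal quantile.
    """
    norm = NormalDist()
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical P values spread evenly over (0, 1), as expected under the null.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(genomic_inflation(null_like))
```

Values near 1 suggest that the test statistics are not systematically inflated, for example by population stratification.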
In the replication study involving 779 proven fertile men, SNP rs3791686 on 2q34 showed a significant association with sperm motility (β=−3.51, replication P=3.88×10−3) (table 2). When we combined the discovery and replication results using meta-analysis, rs3791686 surpassed the threshold for genome-wide significance (β=−4.01, combined P=5.40 × 10−9), with no evidence of heterogeneity between the two studies. The variance in sperm motility explained by rs3791686 was 2.0%.
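The fixed-effect, inverse-variance combination of the two stages can be sketched as follows. Standard errors are not given in the text, so this example back-calculates them from the reported β and P values under a normal approximation; that step, and the function names, are assumptions of this sketch rather than the authors' procedure (they used the meta package for R).

```python
import math
from statistics import NormalDist


def se_from_beta_p(beta, p_value):
    """Approximate standard error from an effect size and a two-sided P value."""
    z = abs(NormalDist().inv_cdf(p_value / 2.0))
    return abs(beta) / z


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis with Cochran's Q and I2."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))  # two-sided
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"beta": beta, "se": se, "p": p_value, "q": q, "i2": i2}


# Effect sizes and P values reported for rs3791686 in the two stages.
betas = [-4.25, -3.51]
p_values = [4.47e-7, 3.88e-3]
ses = [se_from_beta_p(b, p) for b, p in zip(betas, p_values)]
print(fixed_effect_meta(betas, ses))
```

Under these assumptions the combined β should land close to the reported −4.01, with I2 at 0%; the combined P value can differ somewhat from the reported one because the standard errors are approximations.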
Figure 2 shows a regional association plot for the genomic region 400 kb upstream and downstream of the lead SNP rs3791686 in the discovery stage. Within the region, 24 genotyped and 289 imputed SNPs, including rs3791686, were associated with sperm motility at discovery P values <0.05 (online supplementary table S2). The sperm motility-associated genomic interval indexed by rs3791686 on 2q34 overlapped with a single known gene, erb-b2 receptor tyrosine kinase 4 (ERBB4), and the lead SNP rs3791686 was located in an intron of ERBB4. Of the total of 313 SNPs with discovery P values <0.05 within the associated interval, none resulted in amino acid substitution or protein truncation or affected the splicing of ERBB4; one synonymous SNP (rs3748962) and seven SNPs in the 3′-untranslated region of ERBB4 were observed (online supplementary table S2). To obtain putative functional annotations of rs3791686 and the 13 other SNPs in high LD (r2>0.80 in East Asians from the 1000 Genomes Project) with rs3791686 (online supplementary table S3) within the associated interval, we used the following three databases: GTEx Portal,35 HaploReg36 and RegulomeDB.37 We assessed whether the 14 SNPs, including rs3791686 on 2q34, were involved in eQTLs using the GTEx Portal database and found no significant eQTL for any of the SNPs examined. Searches of the HaploReg and RegulomeDB databases revealed that the 14 SNPs examined within the associated interval may be regarded as candidate regulatory SNPs (online supplementary table S3). In the HaploReg database, the lead SNP rs3791686 itself was associated with enhancer histone marks and a DNase I hypersensitive region in embryonic stem-derived cells and resided in regulatory motifs of four transcription factors: Maf, Nkx2, Nkx3 and TATA-binding protein (online supplementary table S3).
Of the 13 SNPs in high LD with the lead SNP rs3791686, five were associated with enhancer histone marks and/or DNase I hypersensitive regions in various types of cells and tissues, while 12 SNPs had the potential to alter nucleotide sequences of several regulatory motifs. The RegulomeDB database provided experimental evidence that three SNPs (rs13003941, rs1836720 and rs1836719) were located in DNase I hypersensitive and/or TF-binding regions in various cells. The iGSEA4GWAS analysis identified 421 significant pathways (FDR <0.05) (online supplementary table S4). The large number of pathways identified suggests that sperm motility is likely to be affected by a complicated process involving interactions between multiple genes and pathways.
In the first two-staged GWAS of sperm motility in Japanese men, we identified a novel sperm motility-associated locus at ERBB4 on chromosome 2q34. The most strongly associated SNP was typed by imputation analysis. In this study, the subjects of discovery stage were genotyped using the Illumina HumanCore V.1.0 DNA Analysis Kit with a total of 298 930 SNPs. Subsequently, to enhance the coverage, untyped SNPs were imputed. Sometimes, imputation methods may be less accurate for typing of SNPs. To confirm the accuracy of this imputation method, randomly selected samples were directly genotyped for the GWAS-lead SNP rs3791686. The result of imputation analysis was validated by the genotyping.
SNP rs3791686 lies in an intron of the ERBB4 gene, which is a member of the receptor tyrosine kinase family and epidermal growth factor receptor subfamily. ERBB4 is expressed in several tissues, including kidney, breast, cerebrum, heart, bone, ovary and testis. On activation by its ligands, ERBB4 forms a dimer on the cell surface. Following cleavage of the ERBB4 ectodomain by a disintegrin and metalloprotease domain 17 (ADAM17) and γ-secretase, the intracellular domain of ERBB4 is translocated into the nucleus. Inside the nucleus, ERBB4 is involved in the regulation of cell proliferation and differentiation.39–42 ERBB4 is thought to be both necessary and sufficient to trigger an antiproliferative response in human breast cancer cells.43 Kim et al44 reported in a GWAS that the SNP rs13393577 in ERBB4 is associated with breast cancer risk in Koreans. In addition, previous GWASs in the National Human Genome Research Institute (NHGRI) GWAS Catalog demonstrate that SNPs in ERBB4 are genome-wide significantly associated with polycystic ovary syndrome (lead SNP rs1351592)45 and BMI (lead SNP rs7599312).46 The lead SNPs at ERBB4 from the previous GWASs are located >1 Mb away from the sperm motility-lead SNP rs3791686 and show no pairwise LD (r2 <0.01 in East Asians) with rs3791686. This indicates a novel association for sperm motility at ERBB4 on 2q34, which is independent of the associations with other human diseases and traits.
The expression of ERBB4 is evident in male reproductive tissues, including testis. In the testicular tissue, ERBB4 is expressed in both somatic cells (Sertoli cells and Leydig cells) and germ cells.47 It is notable that Sertoli cell-specific Erbb4-knockout mice exhibit a developmental defect in the organisation of the testicular seminiferous tubules, which reduces male fertility. Aberration in the testicular cell adhesion machinery caused by Erbb4 deficiency leads to a compromised capacity of the testes to produce motile sperm.47 Thus, ERBB4 signalling in the Sertoli cells may influence sperm motility, which suggests a functional role of ERBB4 in sperm motility. The lead SNP rs3791686 identified in this GWAS is an intronic SNP of ERBB4 and displays the potential to act as a functional regulatory SNP based on the multiple functional annotations. As the functional annotation analyses reveal an association between other SNPs in high LD with rs3791686 and potential regulatory domains and motifs, the sperm motility locus at ERBB4 may have a role in the regulation of ERBB4 expression via a cis-regulatory mechanism. Sandholm et al48 reported that a cis-eQTL for ERBB4 in tubulointerstitial-enriched kidney biopsies maps to the intronic ERBB4 SNPs rs17418640 and rs17418814. Both of these SNPs are proxies for rs7588550, representing a suggestive association with diabetic nephropathy; however, these eQTL SNPs are not in LD (r2<0.01 in East Asians) with the sperm motility-lead SNP rs3791686. Further studies are warranted to assess the potential contribution of the sperm motility-associated locus indexed by rs3791686 to the regulation of ERBB4 expression. These studies will also help explore the possible involvement of this locus in the regulation of expression on a genome-wide scale via trans-regulatory mechanisms.
Liu et al49 reported that five SNPs (rs215702, rs6476866, rs10129954, rs2477686 and rs10841496) were significantly correlated with progressive sperm motility. However, the present study did not detect variants associated with sperm motility within the regions 400 kb upstream and downstream of these five SNPs. Previously, we also reported that four SNPs identified by a Chinese GWAS13 as being significantly associated with the risk of non-obstructive azoospermia (NOA) were not associated with NOA in the Japanese population.21 One possible reason is the small genetic differences between the Han Chinese and Japanese populations, as shown by the principal component analysis using genotype data of the HapMap CHB and JPT (online supplementary figure S1). Additionally, we previously found a strong association between Y-haplogroup and sperm motility in the same Japanese populations.22 However, none of the SNPs on the Y chromosome displayed a significant association (P<0.05) with sperm motility in this study. The Illumina HumanCore V.1.0 DNA Analysis Kit includes 1943 Y-chromosome markers. However, of these, only 177 markers could be examined in the discovery stage. Because this kit does not include Japanese Y-haplogroup-specific markers, we could not detect a significant association between Y-chromosome variants and sperm motility in this study.
Several limitations of this study should be noted. The replication samples were men of proven fertility rather than randomly selected subjects. These were the only samples available for the current replication analysis. Using samples selected on the basis of fertility may cause bias. In fact, abstinence periods were significantly different between the two cohorts (table 1). In general, a longer abstinence period is correlated with lower sperm motility. As described in the previous study, the abstinence period was negatively correlated with sperm motility in both cohorts.22 To reduce the influence of the abstinence period on sperm motility, we included it as a covariate in the multiple linear regression analysis. Therefore, we think that the effect of the abstinence period on the power to detect sperm motility-associated SNPs was minimised in this study. Additionally, all the participants in the current two-staged GWAS were Japanese men. Independent validation studies are required to test the observed association between ERBB4 SNPs and sperm motility using other general populations and ethnicities. Transethnic association analyses at the ERBB4 locus will also enable us to narrow the association signal to smaller sets of SNPs by leveraging differences in LD structures across diverse populations. Because the sample size was not large, the limited statistical power of this two-staged GWAS may have prevented the detection of other true positive associations at a genome-wide significance level. We believe that other genetic loci may account for the interindividual variation in sperm motility, and therefore, larger scale GWAS analyses may be expected to identify novel associations between genetic variants and sperm motility.
Another limitation is that sperm motility may vary between samples from the same individual. Because phenotypic repeatability sets the upper bound of a trait's heritability, low repeatability may decrease the sensitivity to detect genetic variants associated with the trait. As mentioned above, sperm motility depends on the abstinence period; in general, abstinence period and sperm motility show a negative relationship. In our samples, although the strength of the association differed, the abstinence period was negatively correlated with sperm motility in both cohorts,22 which is consistent with this general pattern. In this study, we set a suggestive threshold of P<1×10−6 in the discovery stage and performed the replication analysis of the selected SNP. The strength of the SNP-trait association differed slightly between the cohorts, but there was no significant heterogeneity. In addition to intraindividual variation of sperm motility between samples, the measurement of sperm motility may vary between operators (individual technicians). To reduce the between-centre variability, technicians from each centre were initially trained by one technician from St. Marianna University in Kawasaki. In addition, to statistically reduce the influence of differences in sperm assessment between the centres, we added centre as a covariate and repeated the association analysis between sperm motility and rs3791686. We found that rs3791686 was associated with sperm motility in the discovery stage (β=−4.35, P=1.62×10−7) and in the replication stage (β=−3.16, P=0.012). When we combined the two results using meta-analysis, rs3791686 was genome-wide significantly associated with sperm motility (β=−3.99, P=6.60×10−9). This finding was very similar to the result (table 2) from the association analysis without adjustment for semen analysis centre.
Although the measurements of the semen analysis may not necessarily be representatives of individual sperm motility, together with the previous finding of Sertoli cell-specific Erbb4-knockout mice, we are confident that the results of our GWAS are valid.
In conclusion, this first two-staged GWAS for sperm motility identifies a novel sperm motility-associated locus at ERBB4 on 2q34. The genetic evidence suggests that ERBB4 is a promising candidate for future association studies in diverse populations with larger sample sizes. Further studies such as fine-scale genetic mapping are needed to uncover a functional variant at this locus as well as the underlying molecular mechanism.
We thank all the volunteers who participated in this study. We are grateful to the late Professor Yutaka Nakahori and Professors Eitetsue Koh, Jiro Kanaya, Mikio Namiki, Kiyomi Matsumiya, Akira Tsujimura, Kiyoshi Komatsu, Naoki Itoh and Jiro Eguchi for collecting blood samples from the participants. We also thank Professor Toyomasa Katagiri for his assistance with the AB GeneAmp PCR System 9700.
YS and AT contributed equally.
Contributors YS and AT conceived, designed the experiments, performed the experiments and wrote the paper. TS performed the imputation analysis. SN, MY and TI prepared and collected samples. II contributed material and analysis tools. YS, AT, TS, II, AY and TI reviewed and revised the manuscript.
Funding This study was supported in part by the Ministry of Health and Welfare of Japan (1013201) (to TI), Grant-in-Aids for Scientific Research (C) (26462461) (to YS), (23510242) (to AT) and Grant-in-Aids for Scientific Research (B) (17H04331) (to YS), (15H04320) (to AT) from the Japan Society for the Promotion of Science, the European Union (BMH4-CT96-0314) (to TI), the Takeda Science Foundation (to AT) and The Suzuki Urinary Foundation (to YS).
Competing interests None declared.
Patient consent Obtained.
Ethics approval This study was approved by the ethics committees of the University of Tokushima and St. Marianna Medical University. All participants provided written informed consent.
Provenance and peer review Not commissioned; externally peer reviewed.