The human Y chromosome’s azoospermia factor b (AZFb) region: sequence, structure, and deletion analysis in infertile men
  1. A Ferlin1,
  2. E Moro1,
  3. A Rossi1,
  4. B Dallapiccola2,
  5. C Foresta1
  1. 1University of Padova, Department of Medical and Surgical Sciences, Clinica Medica 3, Centre for Male Gamete Cryopreservation, Via Ospedale 105, 35128 Padova, Italy
  2. 2University of Roma “La Sapienza”, Institute of Medical Genetics and Institute CSS-Mendel, Viale Regina Margherita 261, 00198 Roma, Italy
  1. Correspondence to:
 Professor C Foresta, University of Padova, Department of Medical and Surgical Sciences, Clinica Medica 3, Via Ospedale 105, 35128 Padova, Italy;


Microdeletions of the Y chromosome long arm are the most common mutations in infertile males, where they involve one or more “azoospermia factors” (AZFa, b, and c). Understanding of the AZF structure and gene content and mapping of the deletion breakpoints in infertile men are still incomplete. We have assembled a complete 4.3 Mb map of AZFb and surrounding regions by means of 38 BAC clones. The proximal part of AZFb consists of large repeated sequences organised in palindromes, but most of it is single copy sequence. A number of known and novel genes and gene families map in this interval, and most of them are testis specific or have testis specific transcripts. STS mapping allowed us to identify four severely infertile subjects with a deletion in AZFb with similar breakpoints, therefore suggesting a common deletion mechanism. This deletion includes at least five single copy genes and two duplicated genes, but does not remove the historical AZFb candidate gene RBMY1. These data suggest that other genes in AZFb may have important roles in spermatogenesis. We had no evidence for homologous recombination between large repeats as a possible deletion mechanism, as shown for AZFa and AZFc. However, identical sequences in AZFb and AZFc exist, and this finding could explain deletions found in these regions.

  • azoospermia factor b
  • male infertility
  • RBMY
  • Y chromosome

Interstitial microdeletions in the euchromatic portion of the Y chromosome long arm (Yq) occur in 10–15% of idiopathic primary testiculopathies (azoospermia and severe oligozoospermia).1 Three non-overlapping regions, referred to as “azoospermia factors” (AZFa, b, c from proximal to distal Yq) have been defined as spermatogenesis loci.2 Over the last few years many studies have roughly defined the boundaries of these regions, and different sequence tagged site (STS) maps of these intervals have been created. Recently, the Y chromosome sequencing project has provided a large amount of data, and a refined and meaningful map of the Y chromosome contigs has been assembled.3 Among the three AZF intervals, AZFa and AZFc have been extensively studied in terms of sequence, physical map, gene content, genomic organisation, deletion analysis in infertile men, and deletion mechanism. In particular, AZFa contains two genes whose absence or mutation causes spermatogenic failure, USP9Y and DBY.4–7 Most AZFa deletions arise from recombination between two 10 kb direct repeats that are 800 kb apart.8–10 Similarly, AZFc has recently been assembled in a precise map11 and shown to contain a large number of testis specific gene families in addition to the historical AZFc candidate gene DAZ.12 This region consists almost entirely of very long repeat units, and most AZFc deletions seem to result from homologous recombination between two 229 kb direct repeats, causing the loss of a 3.5 Mb segment.

However, the sequence and structure of AZFb have not been precisely determined, nor has definitive proof of AZFb candidate genes been produced yet. The AZFb region spans approximately intervals 5M-6B,13 but this distance varies between subjects and according to the screening methodology.14 AZFb deletion is quite rare in the male infertile population, but this figure is higher when restricted criteria are used to select patients (1–5%).1,14 Initial studies have suggested that the RBMY1 gene family may represent the AZFb candidate, and this hypothesis is supported by its germ cell specific expression15 and its homology with mouse Rbm, mutation of which causes spermatogenic arrest.16 However, deletions within AZFb not removing RBMY1 have been reported17–20 and hitherto no RBMY1 point mutation has been identified. Moreover, other genes, such as SMCY and EIF1AY, have been mapped in this region, but their role in the spermatogenic process is still unclear.21

Although the STS mapping was quite approximate, some of the reported patients with AZFb deletions not involving RBMY1 appear to have similar proximal and distal breakpoints.18–20 By analogy with what has been described for AZFa and AZFc, this observation points to a possible common mechanism causing AZFb deletions. With this in mind, and having identified four unrelated azoospermic and severely oligozoospermic subjects with similar breakpoints inside AZFb, we describe the complete sequence, genomic organisation, and structure of AZFb and surrounding regions.


AZFb patients

Hospital Ethical Committee approval and informed consent were obtained for all subjects in this study. Seven hundred infertile patients affected by azoospermia or severe oligozoospermia (sperm count below 5 million/ml) have been analysed by routine diagnostic Y chromosome multiplex PCR screening (markers sY14-SRY, sY86, DF3.1 for the USP9Y gene, DBY1 for the DBY gene, sY95, sY117, sY125, sY127, F19/E355 for the RBMY1 gene, and sY254, sY255 for the DAZ gene). Four patients (Nos 317, 529, 621, and 820) showed absence of sY117, sY125, and sY127 and were analysed further.

Patient 317 has already been reported as severely oligozoospermic (sperm count <5 × 10^6/ml).20 His medical history indicated unilateral cryptorchidism with bilateral testicular damage characterised by severe hypospermatogenesis. Infertility was considered idiopathic for the other three unrelated AZFb deleted patients, since no obvious cause determining the testicular damage could be ascertained. All patients had a normal 46,XY karyotype. Repeated semen analyses in patients 529, 621, and 820 showed azoospermia. Bilateral testicular fine needle aspiration cytology showed Sertoli cell only syndrome in two cases (Nos 529 and 820) and severe hypospermatogenesis in the other two cases (Nos 317 and 621).22

Mapping, sequence analysis of AZFb and the surrounding region, and characterisation of deletion breakpoints

Using data available from GenBank for the human Y chromosome we performed electronic analysis on 38 BAC sequences contained in contig NT_011875, the last 4.3 Mb of which overlaps the region of interest. The BLASTN program23 was used for sequence analysis.

Genomic DNA was prepared from different fresh blood samples, using a DNA isolation kit (Roche, Milan, Italy). PCR conditions and primer sequences for all STSs used are deposited at GenBank. New PCR primers were designed from the sequence of the BAC clones 468D10 (sY1217 and sY1218) and 209I11 (sY1206, sY1211, and sY1207) and sequences and thermocycling conditions have been deposited in GenBank, where accession numbers are as follows: G68329, sY1207; G68330, sY1211; G68331, sY1206; G68332, sY1217; G68333, sY1218.


AZFb patients

By testing a total of 700 infertile men with severe spermatogenic failure (non-obstructive azoospermia and severe oligozoospermia) for Yq microdeletions, we identified four unrelated subjects (Nos 317, 529, 621, and 820) with “partial” AZFb deletions and apparently similar breakpoints. This screening was performed by diagnostic multiplex PCR including the following STSs: sY14 for SRY, sY86, DF3.1 for USP9Y, DBY1 for DBY, and sY95 in the AZFa region; sY117, sY125, sY127, and F19/E355 for RBMY1 in the AZFb region; sY254 and sY255 for DAZ in the AZFc region. The deletion found included STS sY117, sY125, and sY127, and was further confirmed with markers sY131, sY129, and sY113. The proximal breakpoint was initially mapped with additional markers between sY108 and sY113 and the distal breakpoint between sY129 and sY134 (fig 1). Out of the 700 subjects, another eight were found to carry “complete” AZFb deletions, starting at sY108 and also including RBMY1. The four “partial” AZFb deletions therefore lie entirely within these larger deletions. These results were repeatedly confirmed in the four subjects on separate blood collections. In two of them (Nos 317 and 820) a paternal sample was available and all STSs amplified normally, confirming that these deletions occurred de novo, while no paternal sample was available in cases 529 and 621.
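The interval logic behind this screening can be sketched in a few lines. The example below is a minimal illustration and not the authors' procedure: the function name `deletion_flanks` is hypothetical, and the proximal-to-distal order of the internal markers is an assumption made for the example (only the flanking pairs sY108/sY113 and sY129/sY134 are taken from the text).

```python
def deletion_flanks(results):
    """Locate one contiguous deletion in an ordered STS panel.

    `results` is a list of (marker, amplified) pairs ordered from
    proximal to distal. Returns the marker pairs that bracket the
    proximal and distal breakpoints, or None if nothing is deleted.
    """
    absent = [i for i, (_, amplified) in enumerate(results) if not amplified]
    if not absent:
        return None
    first, last = absent[0], absent[-1]
    # A single interstitial deletion must be one unbroken run of failures.
    if any(amplified for _, amplified in results[first:last + 1]):
        raise ValueError("absent markers are not contiguous")
    proximal_present = results[first - 1][0] if first > 0 else None
    distal_present = results[last + 1][0] if last + 1 < len(results) else None
    return {
        "proximal": (proximal_present, results[first][0]),
        "distal": (results[last][0], distal_present),
    }


# Illustrative panel. The order of the internal markers is ASSUMED;
# the text only fixes the flanking markers.
panel = [
    ("sY108", True),
    ("sY113", False),
    ("sY117", False),
    ("sY125", False),
    ("sY127", False),
    ("sY131", False),
    ("sY129", False),
    ("sY134", True),
]

flanks = deletion_flanks(panel)
print(flanks)
```

With this panel the proximal breakpoint is bracketed by sY108 and sY113 and the distal breakpoint by sY129 and sY134, matching the initial mapping described above.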

Figure 1

Schematic picture of the human Y chromosome and AZFb deletion found in the four subjects. Top: representation of the Y chromosome with previously mapped genes. AZF regions are indicated. Bottom: magnification of AZFb and surrounding regions with the relative position of the STSs used for the preliminary screening. Genes are indicated in the 5′-3′ orientation by black triangles. Immediately below are indicated the results of PCR analysis (plus, normal amplification; minus, no amplification).

Two men (Nos 529 and 820) had azoospermia with a testicular cytological picture of Sertoli cell only syndrome, that is, no germ cells were found in their testes, whereas the other two were affected by severe hypospermatogenesis, that is, strong reduction in germ cell number without alteration of the spermatogenic maturation process.

Genomic sequence analysis of AZFb and surrounding region

Thanks to the Human Genome Sequencing project, it was possible to cover the genomic Y sequence of AZFb and surrounding regions with a contig of 4.3 Mb made up of 38 BAC sequences, using data available at the Entrez Genome View of the human Y chromosome and BLASTN analysis23 of clones deposited there. With the complete nucleotide sequence of this region, we first determined its organisation and gene content. Two human Y chromosome working draft sequences cover Yq euchromatin: contig NT_011875 of 9.9 Mb covers the AZFa and AZFb regions, and contig NT_011903 of 4.9 Mb covers AZFc. Examination of the sequence of AZFb and surrounding regions showed some differences from the map provided by Tilford et al,3 mainly because repeated segments have been recognised. In particular, the proximal 1.4 Mb is arranged in three families of repeats, coloured in red (A1 and A2), yellow (B1 and B2), and blue (C1 and C2) in fig 2. Repeat A is 456 kb, repeat B is 39 kb, and repeat C is 190 kb in length. The two copies of repeat C form an inverted duplication separated by a unique sequence (U in fig 2) of 40 kb. The two copies of repeats A and B form an inverted palindrome extending approximately 1 Mb, with arm to arm identity of 99.97%. Repeats B1 and B2 are separated by a short (3458 bp) single copy sequence, while repeats A2 and C1 are separated by only 12 bp. The distal external boundaries of repeat A and of repeat C are delimited by the markers sY108/sY801 and sY113, respectively. BAC clone 529I21 starts at nucleotide 5,687,523 of contig NT_011875, repeat A1 at nucleotide 5,713,048, and repeat C2 ends at nucleotide 7,127,428. From the distal end of repeat C2 to the end of the contig (2.8 Mb), the sequence did not show any other significant repeat. The inverted duplication formed by repeats A and B is not, strictly speaking, contained in the AZFb region, which is therefore estimated to extend over 3.2 Mb. Consequently, repeated sequences (repeats C) represent only 12% of the AZFb region.
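The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the text; the variable names are illustrative, and the repeat lengths are the rounded kb values given above, so the two estimates of the proximal block agree only approximately.

```python
# Repeat lengths in kb, as reported in the text.
REPEAT_KB = {"A": 456, "B": 39, "C": 190}
UNIQUE_U_KB = 40   # unique sequence between C1 and C2
AZFB_KB = 3200     # estimated extent of AZFb (3.2 Mb)

# Contig NT_011875 coordinates reported in the text.
A1_START = 5_713_048
C2_END = 7_127_428

# Size of the proximal repeat block, estimated two ways.
block_from_coords_bp = C2_END - A1_START
block_from_lengths_kb = 2 * sum(REPEAT_KB.values()) + UNIQUE_U_KB

# Palindrome formed by the two copies of repeats A and B.
palindrome_kb = 2 * (REPEAT_KB["A"] + REPEAT_KB["B"])

# Share of AZFb occupied by the two copies of repeat C.
repeat_c_fraction = 2 * REPEAT_KB["C"] / AZFB_KB

print(f"proximal block from coordinates: {block_from_coords_bp / 1e6:.2f} Mb")
print(f"proximal block from repeat lengths: {block_from_lengths_kb / 1e3:.2f} Mb")
print(f"A/B palindrome: {palindrome_kb} kb")
print(f"repeats C as share of AZFb: {repeat_c_fraction:.1%}")
```

Both estimates of the proximal block come to about 1.4 Mb, the A/B palindrome to about 1 Mb, and the two copies of repeat C to roughly 12% of a 3.2 Mb AZFb.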

Figure 2

Map of AZFb and surrounding regions. From top to bottom: genes and gene families, organisation of the repeats, BAC clones, STS map, and deletion interval in the four subjects. Genes are represented in their 5′-3′ orientation by black triangles when their mapping in GenBank is confirmed by alignment of mRNAs and/or ESTs, or by white triangles when mapping is supported only by Genscan. Repeats A and the STSs mapping here are in red, repeats B in yellow, repeats C and their STSs in blue, and the unique sequence between C1 and C2 and its STS in black. Marker sY2573 mapping in the single copy sequence between B1 and B2 is in black. The deletion interval found in the four subjects is represented by a solid line (no amplification by STSs) while the proximal and distal breakpoints are represented by a discontinuous line and detailed in figs 3 and 4.

A number of genes and gene families have been mapped or are predicted to map to the 4.3 Mb contig (more than 30 genes are present in Entrez Genome and represented in fig 2), and the two large inverted duplications in the proximal part of the 4.3 Mb contig are rich in genes (figs 2 and 3). However, only eight single copy genes (LOC170324, SMCY, EIF1AY, RPS4Y2, GAPD-similar, TSPYq1-similar, RBMY1A1, and TTY13) and five duplicated genes (USP9Y-similar, XKRY, CDY2, HSFY, and LOC140017/140020) are deposited in GenBank as confirmed genes based on alignment of mRNA and/or EST to the genomic sequence (black triangles in fig 2). Only these genes are reported in table 1.

Table 1

Summary of genes mapping in AZFb and surrounding regions. Only genes annotated in GenBank as genes confirmed by alignment of mRNAs and/or ESTs are reported

Figure 3

Detailed map of the proximal breakpoint region. From top to bottom: genes and gene families, organisation of the repeats, BAC clones, STS map, and proximal deletion breakpoint in the four subjects. Legend as for fig 2.

Analysis of deletion breakpoints

We then set out to define more precisely the proximal and distal breakpoints in the four patients. To do this we mapped a number of previously described STSs and generated new markers (figs 2, 3, and 4). The proximal breakpoint (fig 3) was expected to lie between marker sY108 (last marker with normal amplification) at the distal boundary of repeats A1 and A2 (clones 529I21 and 157F24) and marker sY113 (first marker with absent amplification) at the distal boundary of repeats C1 and C2 (clones 945E12 and 143C1). We tried to delimit the proximal breakpoint further by testing our patients for two new STSs, sY1217 (which amplified normally) and sY1218 (which failed to amplify), mapping at the very distal end of A2 and C1, respectively. However, PCR alone could not show whether the deletion included repeats B2 and A2, since all the markers in this region are duplicated. Marker sY2573, which maps in the 3458 bp single copy sequence between B1 and B2, amplified normally, but this result does not resolve whether repeats A2 and B2 are deleted. We therefore performed quantitative PCR experiments with sY1217 and sY14 for SRY (as internal standard), using DNA from the AZFb deleted patients and a fertile man. A comparable signal intensity of the amplified fragments was obtained in the exponential phase (data not shown). These data, although not definitive, suggest that the proximal breakpoint lies in BAC clone 468D10 between markers sY1217 and sY1218 (between nucleotides 6,690,566 and 6,707,639 of contig NT_011875), that is, at the A2/C1 junction.

Figure 4

Detailed map of the distal breakpoint region. From top to bottom: genes, BAC clones, STS map, and distal deletion breakpoint in the four subjects. Legend as for fig 2.

We then examined the distal AZFb breakpoint (fig 4), where the first positive (normal amplification) STS was sY134, by testing additional markers sY132, sY133, sY138, and sY136, and designing STSs sY1206, sY1211, and sY1207 in BAC clone 209I11. No amplification was obtained with sY1206 and sY1211, while normal amplification was obtained with the other markers. These results suggested that the distal AZFb breakpoint mapped in the 173 bp between sY1211 and sY1207 at nucleotides 18,513–18,686 of BAC clone 209I11, that is, at nucleotides 9,512,905–9,513,078 of contig NT_011875. The deletion interval in the four patients was therefore estimated to be 2.8 Mb, or 3.3 Mb if we assume that repeats B2 and A2 were also deleted.
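The deletion size estimates follow directly from the contig coordinates of the two breakpoint intervals. The sketch below reproduces that arithmetic; the coordinates and repeat lengths are those quoted in the text, while the variable names are illustrative.

```python
# Breakpoint intervals on contig NT_011875 (start, end), from the text.
PROXIMAL = (6_690_566, 6_707_639)   # between sY1217 and sY1218
DISTAL = (9_512_905, 9_513_078)     # between sY1211 and sY1207

# The same distal interval in BAC clone 209I11 coordinates.
DISTAL_BAC = (18_513, 18_686)

# Rounded repeat lengths (kb) that would be added if B2 and A2 were deleted.
REPEAT_A_KB = 456
REPEAT_B_KB = 39

distal_width_contig = DISTAL[1] - DISTAL[0]
distal_width_bac = DISTAL_BAC[1] - DISTAL_BAC[0]

# Smallest and largest deletion compatible with the mapped intervals.
min_deletion_bp = DISTAL[0] - PROXIMAL[1]
max_deletion_bp = DISTAL[1] - PROXIMAL[0]

# Upper estimate if repeats B2 and A2 are also removed.
extended_bp = min_deletion_bp + (REPEAT_A_KB + REPEAT_B_KB) * 1000

print(f"distal interval width: {distal_width_contig} bp")
print(f"deletion: {min_deletion_bp / 1e6:.2f} to {max_deletion_bp / 1e6:.2f} Mb")
print(f"with B2 and A2: about {extended_bp / 1e6:.1f} Mb")
```

The width of the distal interval is the same in BAC and contig coordinates (173 bp), and both bounds round to the 2.8 Mb and 3.3 Mb figures given above.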

Considering only confirmed genes, the deletion removed five single copy genes (LOC170324, SMCY, EIF1AY, RPS4Y2, and GAPD-similar), and two duplicated genes (HSFY and LOC140017/140020) (fig 2). If repeats B2 and A2 are also considered absent, one copy of three duplicated genes is deleted too (CDY2, XKRY, and USP9Y-similar).

We searched for a possible deletion mechanism for our patients. In particular, we looked for a possible recombination mechanism between large direct or inverted repeats, as a similar mechanism was described for AZFc and AZFa deletions.8–11 However, no significant homology was found between the sequence of repeats A, B, or C (or part of them) and the sequence near the distal breakpoint. By contrast, a 112 kb region overlapping the end of repeat A1 and the initial part of repeat B1 (and the corresponding region in B2/A2) (6.079–6.190 Mb and 6.229–6.340 Mb of contig NT_011875) is almost identical (97% identity) to two sequences in the AZFc region contained in contig NT_011903 (2.186–2.298 and 3.771–3.883 Mb of contig NT_011903). These regions correspond to part of the yellow repeats described in the map of Kuroda-Kawaguchi et al,11 and in particular to the regions where CDY1 genes map. In fact, in the corresponding AZFb region of contig NT_011875, two CDY1 genes are predicted to exist by the Genscan model (fig 2). We then looked for repetitive elements (such as Alu, LINE, etc) localised near the proximal and distal breakpoints by use of RepeatMasker software, in order to identify a possible deletion mechanism involving these repetitive DNA elements. Although interspersed repetitive sequences were found near the breakpoints, including an Alu sequence in the distal 173 bp interval, no region could be found showing a high degree of similarity between the proximal and distal breakpoint regions.
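The comparison of breakpoint regions described here was done with BLASTN and RepeatMasker. As a much simpler stand-in, for illustration only, the sketch below looks for exact k-mers shared between two sequences in direct or inverted orientation, which is the kind of signal that recombination between repeats would require. The sequences are toy strings, not Y chromosome data, and the helper names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(a, b, k):
    """Exact k-mers shared by a and b.

    Returns (direct, inverted): k-mers of `a` found in `b` as given,
    and k-mers of `a` found in the reverse complement of `b`.
    """
    a_kmers = kmers(a, k)
    direct = a_kmers & kmers(b, k)
    inverted = a_kmers & kmers(revcomp(b), k)
    return direct, inverted


# Toy sequences standing in for the proximal and distal breakpoint regions.
proximal_region = "ACGTTGCAAGGCTA"
distal_region = "TTTTGCAAGGAAAA"

direct, inverted = shared_kmers(proximal_region, distal_region, k=6)
print("direct:", sorted(direct))
print("inverted:", sorted(inverted))
```

Exact k-mer matching ignores mismatches and gaps, so it can only flag candidate regions; an alignment tool is still needed to measure identity over long repeats such as the 112 kb segment shared with AZFc.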


Screening for Yq microdeletions allowed us to identify four severely infertile subjects with apparently similar deletion breakpoints in AZFb, in whom the candidate AZFb gene RBMY1 was not removed. We considered this observation very important for two main reasons. First, the apparently similar deletion breakpoints could suggest a deletion mechanism similar to that found for AZFa and AZFc,8–11 that is, homologous recombination between direct repeats. Second, the normal presence of RBMY1 suggested that other known or unknown genes could be responsible for the testicular phenotype and therefore that other AZFb candidate genes could be present. Thanks to the Human Genome Project, we therefore assembled a complete 4.3 Mb map of AZFb and surrounding regions, determining its structure and gene content, and then analysed in more detail the deletion breakpoints and possible deletion mechanisms.

AZFb showed proximally a structure that resembles that of AZFc,11 with large direct and inverted repeats organised in palindromes, but most of it consists of single copy sequence. AZFb as formerly defined13 actually extends for 3.2 Mb. By using a number of already described and novel markers, we tried to determine the deletion breakpoints, mapping the distal one in a 173 bp region between sY1211 and sY1207. Identification of the proximal breakpoint was more difficult because the presence of the repeats prevents detailed analysis. We were unable to isolate and sequence the junction fragments, which is usually required for the identification of deletion breakpoints. The deletion was estimated to span at least 2.8 Mb, but could theoretically extend to 3.3 Mb if repeats B2 and A2 were also removed. A number of genes and gene families are predicted to map in this region, but mapping is supported by alignment of mRNAs and/or ESTs for only eight single copy genes and five duplicated genes. The deletion found in our patients removes at least five single copy genes (LOC170324, SMCY, EIF1AY, RPS4Y2, and GAPD-similar) and two duplicated genes (HSFY and LOC140017/LOC140020), but may also remove one copy of three duplicated genes (USP9Y-similar, XKRY, and CDY2) if repeats B2 and A2 are deleted, in which case the deletion would extend proximally beyond AZFb. Only XKRY, CDY2, SMCY, EIF1AY, and RBMY1A1 have been previously described, and all but SMCY show testis specific expression or testis specific transcripts. The other genes are novel and their expression and function are not known. It is therefore possible that the testicular phenotype observed in our patients is the consequence of the absence of many genes with a putative role in spermatogenesis. Until further studies are conducted on these genes, they can all be considered as putative AZFb candidate genes in addition to RBMY1A1. It cannot be excluded that the deletion deregulates RBMY1A1 expression by a position effect. However, RBMY1A1 regulatory elements have not yet been identified and this hypothesis should be investigated.

Among the known genes, some evidence suggests that SMCY has no role in spermatogenesis; it is ubiquitously expressed and encodes a histocompatibility antigen, and an SMCY transgene does not restore spermatogenesis in XSxrbO male mice.24 However, EIF1AY and XKRY are more interesting as they show testis specific transcripts.21 EIF1AY produces, by alternative splicing, eight different transcripts encoding five different proteins that are predicted to localise in the cytoplasm where they seem to be required for protein biosynthesis, enhancing ribosome dissociation into subunits, and stabilising the binding of the initiator Met-tRNA to 40S ribosomal subunits.25 Recent evidence supports a role of CDY2 in male germ cell development, as it has been shown that CDY may be responsible for the histone to protamine transition that occurs in spermiogenesis (formation of mature sperm from spermatids).26 The role of the USP9Y-similar gene has yet to be determined, and in particular it would be interesting to verify whether its function in deubiquitination is similar to that observed for the AZFa candidate USP9Y.27 Among the novel genes, HSFY deserves further study. Although no phenotype has yet been reported, and its in vivo function is as yet unknown, the importance of this gene derives from the observation that most ESTs related to it are expressed in the testis. HSFY contains a heat shock factor type DNA binding domain related to the HSF2 gene on chromosome 6. It is therefore predicted to function as a transcriptional activator specifically binding to heat shock promoter elements. HSFY produces, by alternative splicing, two different transcripts encoding two proteins. Alternative transcript 1 encodes a protein of 401 amino acids containing two HSF type DNA binding domain motifs. Alternative transcript 2 encodes a protein of 203 amino acids that does not contain particular motifs. Very little information exists for RPS4Y2 and GAPD-similar genes other than that described in table 1, and nothing is known regarding LOC140017/140020 and LOC170324.

We did not find a relationship between genotype and testicular phenotype in our patients. Despite an apparently identical deletion, two of them showed complete absence of germ cells in their testes (Sertoli cell only syndrome) and two a reduction of germ cells (severe hypospermatogenesis). One hypothesis is that additional genetic or environmental factors may have contributed to the phenotype, but it is also possible that the deletions were actually different in size in these patients, including for example one copy of CDY2, XKRY, and USP9Y-similar genes (repeat A2) in those subjects with a more severe phenotype.

Most AZFa and AZFc deletions arise from recombination between large direct repeats.8–11 This seems not to be the case for our AZFb deletions, since no significant homology between sequences at the proximal and distal breakpoints was found. Interestingly, however, a 112 kb segment that is duplicated in proximal AZFb is also present in two nearly identical copies in AZFc, mapping among the yellow repeats described in the map of Kuroda-Kawaguchi et al.11 Recombination between these repeats could theoretically explain some deletions involving the AZFb and AZFc regions.1 This should be considered in future studies.


The financial support of Telethon-Italy to Carlo Foresta (grant No E.C0988) and of the University of Padova to Alberto Ferlin is gratefully acknowledged. GenBank accession numbers: G68329, G68330, G68331, G68332, G68333.