Background

Human spermatogenesis is an intricate biological process that begins with the mitotic division of spermatogonia to give rise to primary spermatocytes, which in turn undergo the first meiotic division to form secondary spermatocytes. After the second meiotic division, these secondary spermatocytes produce haploid cells called round spermatids, which subsequently form elongated spermatids that finally differentiate into mature spermatozoa. The spermatogenic process relies on the concerted actions of various hormones, local secretory factors and testis-specific genes. Defects at any of these levels can lead to an accumulation of errors that impairs spermatogenesis and results in male infertility. According to the World Health Organization [1], male infertility refers to the inability of the male partner to cause pregnancy in a clinically normal female. Almost 30 million males worldwide are infertile, with the highest rates of male infertility occurring in Central and Eastern Europe [8–12%] and Australia [8–9%] [2]. As per WHO 2010, male infertility can be classified on the basis of the seminogram into the following categories:

  • Azoospermia- Absence of sperm in the ejaculate. It can be classified as obstructive azoospermia [OA], where sperm are absent from the ejaculate because of problems in sperm delivery, or non-obstructive azoospermia [NOA], where sperm are absent from the semen because of abnormal sperm production. NOA constitutes 60% of all cases of azoospermia

  • Oligozoospermia- Less than 15–20 × 10⁶ spermatozoa in the ejaculate

  • Severe oligozoospermia- Less than 5 × 10⁶ spermatozoa in the ejaculate

  • Normozoospermia- Normal values of sperm in the ejaculate

  • Asthenozoospermia- Motility observed in less than 50% of sperm

  • Teratozoospermia- Less than 30% of sperm have normal morphology

  • Aspermia- Failure in ejaculating semen
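
The categories above amount to a small set of threshold rules. A minimal Python sketch of those rules follows, using the thresholds exactly as listed in this section rather than any clinical reference. The function name `classify_seminogram`, its parameters and the constant names are illustrative assumptions, as is the choice of 15 × 10⁶ as the oligozoospermia cut-off within the stated 15–20 × 10⁶ range. Because several abnormalities can co-occur in one sample, the function returns a list of labels; it does not attempt to separate OA from NOA, which cannot be decided from the seminogram alone.

```python
from typing import List, Optional

# Thresholds as listed in the text above; all names here are hypothetical.
OLIGO_THRESHOLD = 15e6         # assumption: lower bound of the stated 15-20 x 10^6 range
SEVERE_OLIGO_THRESHOLD = 5e6   # spermatozoa in the ejaculate
MOTILITY_THRESHOLD = 50.0      # percent of motile sperm
MORPHOLOGY_THRESHOLD = 30.0    # percent of sperm with normal morphology


def classify_seminogram(
    semen_volume_ml: float,
    sperm_count: float,
    motile_pct: Optional[float] = None,
    normal_morphology_pct: Optional[float] = None,
) -> List[str]:
    """Return every seminogram category from the list above that applies."""
    if semen_volume_ml <= 0:
        return ["aspermia"]        # no semen ejaculated
    if sperm_count <= 0:
        return ["azoospermia"]     # semen present, but no sperm

    labels = []
    if sperm_count < SEVERE_OLIGO_THRESHOLD:
        labels.append("severe oligozoospermia")
    elif sperm_count < OLIGO_THRESHOLD:
        labels.append("oligozoospermia")
    # Motility and morphology are only assessed when they were measured.
    if motile_pct is not None and motile_pct < MOTILITY_THRESHOLD:
        labels.append("asthenozoospermia")
    if normal_morphology_pct is not None and normal_morphology_pct < MORPHOLOGY_THRESHOLD:
        labels.append("teratozoospermia")
    return labels or ["normozoospermia"]


print(classify_seminogram(3.0, 4e6, motile_pct=35.0, normal_morphology_pct=40.0))
# → ['severe oligozoospermia', 'asthenozoospermia']
```

Returning a list rather than a single label reflects the fact that combined findings (for example, low count together with low motility) are described by combining the individual terms.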

Causes of male infertility

Male infertility can be attributed to several factors such as cryptorchidism [absence of one or both testes in the scrotum], varicocele [abnormal enlargement of the pampiniform venous plexus in the scrotum], endocrinological disorders, obstruction/absence of seminal pathways, infections, alcohol consumption or chemotherapy [3, 4]. However, genetic alterations have also emerged as one of the leading causes of male infertility. Genetic defects commonly observed in infertile males include karyotypic abnormalities, gene copy number variations [CNVs], single gene mutations/polymorphisms and deletions on the long arm of the Y chromosome [Yq microdeletions]. These genetic defects impede the development of the male gonads or urogenital tract, cause arrest of germ cell production and/or maturation, or result in non-functional spermatozoa. Amongst the various factors, karyotypic abnormalities and Yq microdeletions are the leading genetic causes of male infertility. In this review, we present the current knowledge of the human Y chromosome, its genes and how defects in these genes lead to male infertility.

The human Y chromosome

Mammalian sex chromosomes evolved from autosomes at least 180 million years ago. The first step in differentiation of the Y chromosome involved the acquisition of the testis-determining gene, followed by large-scale inversions and stepwise suppression of recombination between the X and Y chromosomes [5,6,7]. The evolution of the human Y chromosome and its present-day status have been the subject of recent reviews [5]. Cytogenetically, the human Y is an acrocentric chromosome composed of two pseudoautosomal regions (PARs), a short arm (Yp) and a long arm (Yq) that are separated by a centromere (Fig. 1). While the PARs and the short arm are euchromatic, a large portion of the long arm is heterochromatic, with the exception of the proximal portion adjacent to the centromere, which is euchromatic.

Fig. 1

Structure of the human Y chromosome. The pseudoautosomal regions [PAR1 and PAR2] are located at the terminal ends of the Y chromosome. The green boxes show the genes encoded in these regions. Yp is the short arm of the Y chromosome and the genes within it are shown in the peach box. The long arm, Yq, is composed of both euchromatin and genetically inactive heterochromatin. This region contains the Azoospermia factors AZFa, AZFb and AZFc. The pink box shows the genes in the AZFa region. The heterochromatin is not known to harbor any genes. The region beyond the PARs is termed the Male Specific Region on Y (MSY)

The pseudo autosomal regions (PARs)

The PAR1 and PAR2 of the Y chromosome are short regions of homology between the mammalian X and Y chromosomes; PAR1 is located at the tip of the p arm and PAR2 at the tip of the q arm (Fig. 1). Because the X and Y chromosomes have diverged in sequence, they do not pair during meiosis, except in the PARs, which pair and recombine with the corresponding PARs of the X chromosome. However, the recombination and pairing of the PARs is temporally and genetically distinct from that of the rest of the genome [8]. Double-strand break (DSB) formation and pairing are delayed in the PARs as compared to autosomes because the PARs initiate DSB formation only after all the autosomal DSBs have been repaired.

Despite this delayed recombination, the crossover rate in PAR1 is higher than that on the autosomes [9]. In mice, on average, one meiotic DSB forms every 10 mega base pairs [Mb], whereas in the PARs one forms within a mere 0.7 Mb [10]. It is proposed that in the PARs, deoxyribonucleic acid [DNA] is packaged into several small loops as compared to fewer but larger loops in the autosomes, thus allowing greater DSB formation in the PARs. Indeed, reports indicate that PAR1 contains several recombination hotspots, which have activities that differ significantly among human populations [11]. The genetic control of DSB formation in PARs is also distinct from that of autosomes. Autosomal pairing is dependent on SPO11β, while X-Y pairing is SPO11α dependent. It has been observed that PAR DSBs are not formed in mice that lack SPO11α, leading to X-Y mis-segregation and male infertility, whereas autosomal DSBs are largely unaffected [8]. Interestingly, studies have also associated deletion of human PAR1 with total male sterility [12, 13], and reduced recombination in PAR1 has been associated with an increased frequency of sex chromosome aneuploidy in sperm, leading to X-chromosome monosomy (Turner syndrome) or XXY (Klinefelter syndrome) in the offspring [14, 15]. These observations suggest that although the Y chromosome PARs behave like autosomes, the mechanisms by which they do so are different and are under distinct genetic control.
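
The difference in DSB density quoted above for mice can be made concrete with a short calculation. The sketch below assumes that the two figures mean one DSB per 10 Mb on average and one DSB per 0.7 Mb in the PAR; the function and variable names are illustrative only.

```python
def dsb_density_per_mb(mb_per_dsb: float) -> float:
    """Convert 'one DSB every N Mb' into DSBs per Mb."""
    return 1.0 / mb_per_dsb


GENOME_MB_PER_DSB = 10.0  # from the text: one DSB every 10 Mb on average
PAR_MB_PER_DSB = 0.7      # from the text: one DSB within 0.7 Mb in the PAR (assumed reading)

# Ratio of the PAR density to the genome-wide average density.
fold_enrichment = dsb_density_per_mb(PAR_MB_PER_DSB) / dsb_density_per_mb(GENOME_MB_PER_DSB)
print(f"PAR DSB density is about {fold_enrichment:.1f}-fold the genome average")
# → PAR DSB density is about 14.3-fold the genome average
```

Under this reading, the PAR sustains DSBs at roughly 14 times the average density, which is consistent with the proposal that its DNA is organised into more numerous, smaller loops.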

Genes in PAR1 and PAR2

While the PARs have distinct genetic control as compared to autosomes, wide differences also exist between the two PARs in terms of their genetic content and functions. Together the PARs contain at least 29 genes, with diverse roles in cell signalling, transcriptional regulation and mitochondrial function [16]. All of the genes in PAR1 are known to escape X inactivation (Fig. 1). Interestingly, defects in PAR1 genes are associated with mental and stature disorders. Loss of the short stature homeobox [SHOX] gene in PAR1 is related to the short stature of Turner syndrome [17], and mutations in SHOX are reported in patients with idiopathic growth retardation [18]. PAR1 loci are also reported to be associated with schizophrenia and bipolar disorder [19, 20].

PAR2, on the other hand, is a much shorter region, spanning only 320 kilobase pairs [kb]. Unlike in PAR1, crossovers in PAR2 occur at a rate similar to the genome average in both sexes, suggesting behaviour similar to many autosomal regions. PAR2 is not necessary for fertility [21]. Unlike PAR1, which is gene rich, the PAR2 region contains only five genes, of which two, HSPRY3 and SYBL1, are inactivated on the Y chromosome and subjected to X inactivation in females [22]. Additional file 1: Table S2 lists the different genes on the Y chromosome, their cellular expression in testis, putative function and role in spermatogenesis.

Beyond PAR1 and PAR2, reports also suggest the existence of a 3.5 Mb region termed PAR3. This region is said to have originated from Xq21.3, when a 3.5 Mb region of the X chromosome underwent duplication and transposition to the Y chromosome at Yp11.2 approximately 5–6 million years ago [23]. It is reported that Yp11.2 and Xq21.3 have 98.78% identity and a high concentration of tandem repeats. Interestingly, allelic unequal recombination also occurs between the two X-transposed regions [24]. This PAR3 region is, however, found in only 2% of the general population, and the functional significance (if any) of this PAR is unknown.

The non-recombining region of Y (NRY)

NRY is defined as the locus beyond the PARs which does not undergo recombination during meiosis due to lack of homology with the X chromosome. Cytogenetically, this region is divided into two regions, viz. the heterochromatic and the euchromatic regions. The heterochromatic region of the Y chromosome comprises distal Yq, which contains two highly repetitive sequence families, DYZ1 and DYZ2 [25]. Variation in the size of the Y chromosome long arm heterochromatin within an individual has been reported [26]; however, its clinical significance remains unknown.

The euchromatic region encompasses the para-centromeric region and the short and long arms of the Y. This region is also referred to as the male specific region on Y (MSY) and was for a long time thought to be a functional wasteland. However, research in the last two decades has revealed important roles of the euchromatic region, ranging from sex determination to regulation of brain functions. Discussed below is our current understanding of the human MSY and the alterations in this region that affect human health.

Euchromatic region of Y

The euchromatic region of the Y lies distal to the PAR1 and consists of the short arm para-centromeric region, the centromere and the long arm para-centromeric region (Fig. 1). This region contains sequences that are subdivided into three discrete classes: X-transposed, X-degenerate and ampliconic [27]. The X-transposed sequences are so named because of a massive X to Y transposition that occurred about 3–4 million years ago. Most of these sequences are composed of repeat elements such as Alu, retroviral and long interspersed nuclear elements [LINE1]. Some of the genes belonging to this region have ubiquitous tissue expression, whereas the ampliconic sequences contain genes and transcription units that are expressed solely in the testes [28]. The protein products of the MSY genes contribute to gonad formation, regulation of spermatogenesis, and brain, heart and kidney development [29, 30], suggesting critical functions in tissue development and in adult tissues. Approximately 70 genes have been identified on the Y chromosome, and some of these are described below. Additional file 1: Table S2 lists the different genes on the Y chromosome along with their expression, function and role in spermatogenesis.

Genes on the short arm of the Y chromosome [Yp]

Sex determining region on Y [SRY]

In 1959, two scientific reports, on Klinefelter syndrome and on Turner syndrome [31, 32], described for the very first time that the human Y chromosome contained at least one sex-determining gene that was responsible for the maleness of the embryo. A large number of sex-reversed patients were subsequently found to have deletions in portions of the Yp (XY sex reversal) or additional portions of the Yp (XX males). These patients contributed immensely to the discovery of the gene responsible for testis determination during embryogenesis. In 1990, this gene, SRY (Sex-determining Region on the Y chromosome), was identified [33, 34] and was found to be located on the short arm of the Y chromosome close to the pseudoautosomal boundary. This gene is thought to have evolved through a mutation in the SOX3 gene. The human SRY gene has a single exon that encodes a protein of 204 amino acids containing a conserved DNA-binding domain. SRY is essential for initiating testis development and differentiation of the bi-potential gonad into Sertoli cells, which then support differentiation and development of the male germline. Hence this gene has been proposed to be the master gene regulating the cascade of testis determination [35]. Mutations in the SRY gene are identified in approximately 15% of 46,XY females (Swyer syndrome), and translocation of the SRY gene to the X chromosome is reported in a subset of 46,XX males [36]. Beyond its expression in the developing testis, SRY is reportedly expressed in adult testis and even in ejaculated spermatozoa [37], the functional significance of which is as yet unclear. In addition, SRY is also expressed in other somatic tissues such as adipose, oesophagus, thymus, adrenal glands, brain and kidneys, and also in some cancer cell lines [28, 29], suggesting functions beyond sex determination.

Zinc Finger protein, Y linked [ZFY]

Another gene on the p arm of the Y chromosome is ZFY, which encodes a zinc finger-containing protein that functions as a transcription factor. Expressed in almost all somatic tissues [28, 29], it is proposed to play a role in spermatogenesis, particularly in promoting meiotic division and sperm formation [38,39,40,41] (see Additional file 1: Table S2). Mice with knockout of the Zfy genes are infertile [42]. Despite the expression of ZFY in multiple tissues, a rare deletion of ZFY and SRY in a woman was not associated with Turner syndrome stigmata. So far, no mutations of ZFY have been reported, indicating that ZFY may not have any critical somatic functions [43].

Amelogenin, Y linked [AMELY]

AMELY encodes a member of the amelogenin family of extracellular matrix proteins, which are involved in biomineralization during tooth enamel development. This gene has a paralogue on the X chromosome, AMELX, mutations in which lead to amelogenesis imperfecta. AMELY is expressed at only 10% of the level of AMELX, and there exist men who carry a deletion of AMELY but have no apparent phenotype [44], suggesting that absence of AMELY has no major deleterious effects.

Transducin Beta-like 1Y [TBL1Y]

Transducin Beta-like 1Y (TBL1Y) is a Y-linked homologue of TBL1X, which is associated with X-linked late-onset sensorineural deafness. TBLR1, a homologue of TBL1Y, and TBL1X act as co-repressors/co-activators for several nuclear receptors and transcription factors. A recent study [45] demonstrated differential expression of TBL1Y during cardiac differentiation of human embryonic stem cells. Interestingly, the TBL1Y protein showed a significant increase during differentiation while the expression level of TBL1X simultaneously decreased. When the cellular levels of TBL1Y were reduced, the authors observed reduced rates of cardiac differentiation as well as an increased probability of impaired contractions, suggesting that TBL1Y knockdown may have negatively impacted cardiogenesis. Another study [46] reported that the TBL1Y(A)-USP9Y(A) haplotype of the Y chromosome, present only in black people of African origin, contributed to a favourable lipoprotein pattern that most likely contributed to their reduced susceptibility to coronary heart disease. The role of TBL1Y in the testis [if any] remains unknown.

Protocadherin 11, Y linked [PCDH11Y]

Another gene on the p arm of the human Y chromosome is PCDH11Y. This gene has a homologue on the X chromosome, PCDH11X; it encodes a protocadherin and is expressed in multiple tissues including the testis and brain [28, 29, 47]. It is proposed to play a role in cell-cell recognition during brain development and in the establishment of cerebral asymmetries in humans [48]. Deletions of both the X and Y PCDH11 are associated with language delay [49].

Testis-specific protein Y-linked [TSPY]

The Yp has an array of genes encoding the testis-specific protein Y linked (TSPY), which also has an X-homologue, TSPX. TSPY and TSPX function as a proto-oncogene and a tumour suppressor, respectively, and are also cell cycle regulators [50]. Expressed in a variety of tissues including cancers, TSPY is a candidate gene for gonadoblastoma, and variations in TSPY genes are associated with compromised spermatogenesis [51, 52], although a recent study has refuted this view [53]. For more information on the genes on the p arm of the Y chromosome, refer to Additional file 1: Table S2.

Genes on the long arm of the Y chromosome [Yq]

While the short arm of the Y chromosome was assumed to harbour some transcribed genes, the long arm was long believed to be genetically inert. This myth was disproved in 1997, when 12 novel genes or gene families, with ten full-length complementary DNA sequences, were identified in the human testis and localized to the Yq [54]. Since this discovery, the Y chromosome has been well annotated in different species and several functional genes have been identified [7]. Based on their expression pattern, these genes fall into two categories: housekeeping genes, which have X homologues that escape X inactivation, and gene families expressed specifically in the testes. The genes perform an array of regulatory functions and belong to diverse classes such as histone lysine demethylases (KDM5D and UTY), a transcription factor (ZFY), a spliceosomal component (RBMY), translation initiation factors (DDX3Y and EIF1AY) and a deubiquitinase (USP9Y), suggesting that these genes can govern the expression of targets throughout the genome (Fig. 1).

Genes and genetics of human Yq

Although the first demonstration of the functional capacity of the NRY of the human Y chromosome came in 1997, the link between this locus and male infertility was first made almost four decades ago. In 1976, Italian researchers identified microscopically detectable deletions at the distal end of band q11 of the Y chromosome in six out of 1170 infertile males [55]. Analysis indicated that the fathers of two of these six males had undeleted Y chromosomes, indicating that the deletions arose de novo and could be the underlying aetiology of their azoospermia. Based on these findings, the authors proposed the existence of a spermatogenesis factor, called the “azoospermia factor” [AZF], in the Yq locus. With the advent of the physical and molecular map of the human Y chromosome [56, 57], studies of the AZF locus and male infertility advanced rapidly. Using a panel of molecular markers, a series of subfertile men who had deletions in the Yq were identified [58,59,60]. Based on the deletion analysis of these men, three recurrently deleted non-overlapping sub-regions in proximal, middle and distal Yq11 were defined and designated “AZFa,” “AZFb,” and “AZFc,” respectively [61]. While the DAZ (Deleted in Azoospermia) gene was considered a strong candidate for male infertility [60], little was known about the existence of other genes within this locus. With the availability of the first complete sequence of the AZFc locus [62] and later the detailed structure of the MSY including the AZFa, b and c regions [27], a large number of genes were identified.

Our understanding of the roles of these genes has mainly stemmed from the genetic analysis of oligozoospermic and azoospermic men, in whom the presence or absence of these genes within the AZFa, b and c loci has been studied. Many of these studies are heterogeneous in nature and would have included men with compromised or absent spermatogenesis (mainly non-obstructive oligozoospermia and its forms, and azoospermia). For the sake of simplicity, we have often used the term “infertile men” to describe these men, which essentially means that these studies would have been done in azoospermic subjects, oligozoospermic subjects or both.

AZFa locus, its genes and its deletions

The AZFa locus is constituted exclusively by single-copy, ubiquitously expressed genes with X homologues that escape inactivation. Four genes have been mapped to AZFa.

Ubiquitin specific peptidase 9, Y linked [USP9Y]

Earlier known as DFFRY or Drosophila fat facets related Y, USP9Y was the first gene identified in the AZFa sub-region; it spans 170 kb and consists of 46 exons [63]. This gene encodes a large polypeptide of 2555 amino acids that is approximately 300 kilodaltons [kDa] in size and belongs to the C19 cysteine peptidase family, with protease activity specific to ubiquitin. USP9Y regulates protein turnover by removing ubiquitin from protein–ubiquitin conjugates, thereby preventing degradation of proteins by the proteasome [64], and also stabilizes the de-ubiquitinated target proteins, thus playing an important role in the development of germ cells in males [65]. USP9Y is ubiquitously expressed in adult and embryonic tissues and shares 91% identity with its X-homologue, USP9X, which escapes X-inactivation and is also expressed in many tissues. Deletion of Usp9x in the mouse causes sterility due to a block in spermatogenesis during meiosis [66].

USP9Y is one of the candidate genes of the AZFa, as there exist infertile men with deletion or mutation in USP9Y [64, 67]. However, contrary to this view, evidence is also available describing deletions of USP9Y in men with normal sperm count [63, 68], suggesting that USP9Y is not essential for normal sperm production and fertility in males.

Besides its role in spermatogenesis, studies have discovered that a nine-residue peptide derived from USP9Y represents a minor histocompatibility antigen [H-Y antigen] involved in graft rejection [69]. Recently, an association of gene fusion involving USP9Y (TTTY15-USP9Y) with prostate cancer has also been reported [70] suggesting its functions beyond regulation of sperm production.

DEAD [asp-glu-ala-asp] box RNA helicases, Box 3, Y-linked [DBY]

DBY, also known as DDX3Y, was first identified by Lahn and Page [54] in the 5C region of the long arm of the Y chromosome, at cytogenetic location Yq11.21 [53, 71]. DBY, which extends for 15.5 kb, consists of 17 exons and encodes a conserved ATP-dependent DEAD-box RNA helicase that is expressed only in germ cells, with a proposed function at the G1–S phase of the cell cycle [72]. DBY has a homologue on the X chromosome, DBX [53]; the two genes are reported to have > 95% sequence similarity but are expressed at two different stages of the male germ cell lineage. While DBY protein expression is limited to pre-meiotic male germ cells, the DBX protein is expressed in post-meiotic spermatids and in multiple somatic tissues [71].

Deletion analysis for the AZFa region in infertile males has revealed that males lacking DBY exhibit either Sertoli Cell Only syndrome [SCOS] or severe hypospermatogenesis, suggesting that DBY plays a key role in the spermatogenic process [73, 74]. DBY protein has been reported to control the translation initiation of cyclin E1, which is essential for cell cycle progression from the G1 to S phase [75]. In Drosophila, the DBY homologue, Belle, is essential for mitotic progression and survival of germline stem cells and spermatogonia [76]. Developmentally, DBY expression is initiated in the germ cells of the human testis by 17 weeks of gestation, implying that this protein may play a role in early spermatogonial proliferation [72]. This also suggests that, in males with AZFa deletion, germ cell depletion may begin prenatally. Indeed, complementation with DBY on the AZFa-deleted background improved the formation of germ cell-like cells from induced pluripotent stem cells with AZFa deletion [77], suggesting that DBY functions in the earliest stages of human germ cell development.

Ubiquitously transcribed tetratricopeptide repeat containing, Y linked [UTY]

UTY and Ubiquitously transcribed tetratricopeptide repeat, X [UTX] are members of the tetratricopeptide repeat [TPR] protein family; TPR motifs occur in proteins that control mitosis. The UTY gene is located at the 5C band of AZFa and contains 50 exons and a 3’UTR with several polyadenylation signals [78]. Two UTY transcripts are detected in various human tissues such as spleen, thymus, prostate, testis, intestine and colon, and in cells such as leukocytes, but UTY is not transcribed in the ovary [53].

UTY encodes a male-specific histone demethylase that catalyzes the demethylation of trimethylated ‘Lys-27’ of histone H3 [H3K27me3]. UTY is also involved in protein-protein interactions and may act as a chaperone [79]. Reports also suggest that the UTY protein is a minor histocompatibility antigen that may induce rejection of male stem cell grafts [80]. A recent study [81] has suggested that UTY is involved in a transcriptional regulatory network that is essential for prostate differentiation and that disruption of this network predisposes males to prostate cancer. Another study [82] reported the occurrence of UTY copy number variation in males with urothelial bladder cancer. The role of UTY in the testis is not clear. Several missense mutations in UTY are reported in the dbSNP database, and computational analysis has predicted some of these to be deleterious [80]. However, the fertility status of these men is unknown.

Thymosin beta 4, Y linked [TB4Y]

TB4Y has been mapped to region 5D on the long arm of the human Y chromosome. TB4Y is expressed in various tissues, exists in a single copy and shows about 93% sequence similarity to its X chromosome homologue TB4X [53]. TB4Y encodes a novel human leukocyte antigen [HLA]-A*3303-restricted minor histocompatibility antigen [83] and is a key activator of natural killer cell cytotoxicity [84]. The involvement of TB4Y in testicular functions is unknown.

AZFb locus, its genes and its deletions

The AZFb locus is located in the central region of Yq11 [intervals 5M-6B] and spans 3.2 Mb, of which 1.5 Mb overlaps with AZFc (Figs. 1 and 2). The AZFb region has a complex genomic structure and contains three single-copy regions, a DYZ19 [Y chromosome-specific repeated DNA family 19] satellite repeat array and 14 multi-copy sequence units called amplicons (Fig. 2). These amplicons are organized into six sequence families, with intra-family homology levels > 99%. Amplicon families are defined by a specific colour code [yellow, blue, turquoise, green, red or grey] and each family member is identified by a numeral (Fig. 2). Of the 14 amplicon units, seven [yel3, yel4, b5, b6, b1, t1, t2] are restricted to AZFb, while the remaining are shared with AZFc. Amplicons are also categorized by a higher-order structural organization based on symmetrical arrays of contiguous repeat units called palindromes. AZFb contains palindromes P2 to P5 and the proximal part of P1 (Fig. 2). The presence of extensive ampliconic domains in AZFb allows for very complex rearrangements [62]. The critical AZFb interval necessary for spermatocyte maturation stretches from the center of palindrome P5 to the proximal edge of P3 within the RBMY1 cluster, an interval of over 4 Mb containing 13 coding genes [85].

Fig. 2

Schematic organization of the AZFb and c loci depicting how the various microdeletions arise. The AZFb and c regions are located in the euchromatic region of the Yq. Both regions share a number of genes [pink box]; the genes present in the AZFb region are shown in the green box, while the genes present in the AZFc region are shown in the blue box. The grey arrows depict the orientation of the genes and the grey bars depict the organisation of the amplicons into palindromes [P1 to P5]. The AZFb and AZFc loci are composed of numerous stretches of ampliconic sequences [block arrows], called amplicons, which are annotated as six colour-coded sequence families (yellow, blue, turquoise, green, red and grey). The size and orientation of the coloured arrows represent the length and orientation of the amplicons. AZFb is defined by the P5/proximal P1 deletion (yel3/yel1), which removes 6.23 Mb of DNA, and AZFc by the b2/b4 deletion, which removes 3.5 Mb of DNA. The partial AZFc deletions b1/b3, b2/b3 and the three variations of the gr/gr deletions [g1/g2], [r1/r3] and [r2/r4] [in dotted box] remove almost half of the AZFc gene content. The shaded block depicts the exact location of the deletion. The information in the map is adapted from published data ([6, 27], and [62])

The AZFb is prone to NAHR [non-allelic homologous recombination] with AZFc, resulting in two frequent deletions of 6.23 and 7.7 Mb. The 6.23 Mb complete or classic AZFb deletion [P5/proximal P1] corresponds to the interval between amplicons yel3 and yel1 (Fig. 2) and occurs mainly due to unequal crossing over between homologous segments at their extremities [86]. This classical deletion overlaps with 1.5 Mb of the proximal portion of AZFc and leads to the loss of at least 32 coding genes and transcripts [87, 88]. Deletions of AZFb and AZFc together arise from two pairs of breakpoints: P4/distal P1 (7.0 Mb, 38 gene copies removed) or P5/distal P1 (7.7 Mb, 42 gene copies removed) [89]. The AZFb locus contains a total of five different single-copy transcription units, as detailed in Fig. 2. However, not much is known about the biological functions of many of these genes.
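
The recurrent deletions described above can be summarised as a small lookup table, which may be convenient when annotating deletion calls. The following Python sketch records only the breakpoints, sizes and counts stated in this section. The `Deletion` class, the `RECURRENT_DELETIONS` table and the `describe` helper are hypothetical names introduced for illustration, and the count for the P5/proximal P1 deletion is stored as a minimum because the text gives it as "at least 32".

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Deletion:
    breakpoints: str          # palindrome breakpoints, as named in the text
    regions: str              # AZF regions removed
    size_mb: float            # deleted DNA in Mb
    loci_removed: int         # count quoted in the text
    unit: str                 # what the count refers to in the text
    count_is_minimum: bool = False


# Values copied from the paragraph above; nothing else is assumed.
RECURRENT_DELETIONS: Dict[str, Deletion] = {
    d.breakpoints: d
    for d in (
        Deletion("P5/proximal P1", "AZFb", 6.23, 32,
                 "coding genes and transcripts", count_is_minimum=True),
        Deletion("P4/distal P1", "AZFb+c", 7.0, 38, "gene copies"),
        Deletion("P5/distal P1", "AZFb+c", 7.7, 42, "gene copies"),
    )
}


def describe(breakpoints: str) -> str:
    """Return a one-line summary for a deletion identified by its breakpoints."""
    d = RECURRENT_DELETIONS[breakpoints]
    qualifier = "at least " if d.count_is_minimum else ""
    return (f"{d.regions} deletion ({d.breakpoints}): {d.size_mb} Mb, "
            f"{qualifier}{d.loci_removed} {d.unit} removed")


print(describe("P5/distal P1"))
# → AZFb+c deletion (P5/distal P1): 7.7 Mb, 42 gene copies removed
```

Keying the table by breakpoint pair mirrors how these deletions are named in the text; an unknown breakpoint pair raises `KeyError` rather than returning a guess.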

Chromosome Y open reading frame 15 [CYorf15]

CYorf15A and CYorf15B are single-copy genes in AZFb that have an X homologue, CXorf15, which is thought to be related to the taxilin family of proteins implicated in intracellular vesicle traffic [6].

Ribosomal protein S4, Y linked [RPS4Y2]

Eukaryotic ribosomal protein S4 (S4e) is X-linked in mammals [90], and a Y-linked homologue (RPS4Y1) is present in all primate lineages [91]. In humans, a second copy of the Y-linked gene (RPS4Y2) has been described [27], which originated by duplication before the radiation of Old World monkeys, approximately 35 million years ago [91]. RPS4Y2 is expressed in the testis and prostate, and its expression is elevated during spermatogenesis. It encodes a structurally conserved ribosomal protein subunit required for mRNA binding to the ribosome [92] and plays a role in the post-transcriptional regulation of the spermatogenic process.

Eukaryotic Translation Initiation Factor 1A, Y linked [EIF1AY]

This gene is ubiquitously expressed and is a Y-linked member of the EIF-1A family, which is involved in translation initiation. The EIF-1A proteins enhance ribosome dissociation into subunits and stabilize the binding of the 43S complex to the 5′ end of capped RNA during protein biosynthesis. Studies implicate this gene in ischemic stroke [93, 94]. Although the gene is widely expressed in multiple tissues, its biological roles are unknown. However, a close homologue of this gene in the mouse (Eif2s3y), along with SRY, is sufficient to induce testicular differentiation and initiate spermatogenesis until the round spermatid stage in XX mice [95], suggesting a role in spermatogenesis.

Lysine Demethylase 5D [KDM5D] / Selected Mouse cDNA, Y [SMCY]

This gene has several alternative names such as Jumonji AT-Rich Interactive Domain 1D [JARID1D], Histocompatibility Y Antigen [HY] and H-Y Antigen [HYA]. It is thought that KDM5D plays a crucial role in chromosome condensation during meiosis by demethylating di- and tri-methylated H3K4, thus explaining the maturation arrest observed at the spermatocyte stage in association with AZFb deletions. The KDM5D enzyme is also known to form a protein complex with the MutS protein homolog 5 [MSH5] DNA repair factor during spermatogenesis, which can be found on condensed DNA during the leptotene/zygotene stage, suggesting an involvement in male germ cell chromatin remodelling. Despite these apparently male germline-specific functions, this gene is ubiquitously expressed and is homologous to KDM5C, an X-borne gene. KDM5D has also been reported to have a tumour suppressor function in prostate cancer [96]. This gene regulates invasion-associated genes, and the loss of KDM5D causes the cell to acquire invasiveness, leading to the development of metastasis. A recent study [97] provides evidence that KDM5D, by interacting with androgen receptor signalling, plays an important role in determining sensitivity to docetaxel, a drug used in treating prostate cancer, and that its expression level is associated with clinical outcomes.

X linked Kell Blood group precursor, Y linked [XKRY]

XKRY exists in two copies in the yellow amplicon of AZFb (Fig. 2) and encodes a protein similar to XK, a putative membrane transport protein. By analysis of a panel of partial Y chromosomes, this gene was mapped to region 5L on the long arm of the human Y chromosome [53]. The function of this protein is unknown, and no role has been ascribed to it in spermatogenesis.

Heat Shock transcription factor, Y linked [HSFY]

This gene maps to the blue amplicons with the two active copies located in b5 and b6 in the AZFb locus (Fig. 2). In the AZFb locus HSFY exists as two coding copies HSFY1 and HSFY2. Although HSFY shows homology to the heat shock transcription factor-type (HSF) DNA-binding domain, it does not bind to heat shock elements and no HSFY-targeted promoters have been identified during spermatogenesis [98, 99]. HSFY has homologues at Xq28, HSFX1 and HSFX2 [27] but HSFY is expressed exclusively in the testis and principal cells of the epididymis.

That HSFY is functional and required for spermatogenesis is evident from the observations in four infertile males with a large 768 kb deletion around the P4 palindrome at the proximal end of the AZFb interval that resulted in loss of both HSFY1 and HSFY2, and six non-coding transcription units FAM41AY2, NCRNA00230B, TTTY9A, TTTY9B, NCRNA00185 and TTTY14 [98]. Another study [99] also reported a partial AZFb deletion that removed the HSFY genes and three other genes [KDM5D, CYorf15A and CYorf15B] ostensibly only affecting the functional copies of HSFY in an azoospermic man.

The observations from genetic deletions are also backed by functional studies. HSFY is expressed in nuclei of germ cells with predominant expression in round spermatids. HSFY protein levels are decreased in the testis of men with maturation arrest, linking this gene to the regulation of spermatogenesis [100,101,102,103,104].

PTPN13-like Y linked, [PTPBL]-Related gene on Y [PRY]

PRY is a testis specific gene that encodes a protein similar to protein tyrosine phosphatase, non-receptor type 13 [105]. Two nearly identical copies of this gene, PRY and PRY2, map to the blue amplicons of the AZFb region (Fig. 2), with the two functional units being restricted to b1 and b2 [106].

The expression of PRY in germ cells is heterogeneous, with the protein being detected only in a few sperm and spermatids. Furthermore, PRY levels are increased in ejaculated sperm obtained from males with abnormal semen parameters, suggesting a link between its expression and defective spermatogenesis [107]. The PRY genes are thought to be involved in the regulation of apoptosis implicated in the removal of abnormal sperm [107]. Deletions that include the PRY1 and PRY2 genes have also been reported to cause meiotic arrest [108]. Studies reveal that when all the genes in the AZFb region except RBMY and PRY are deleted, there is hypospermatogenesis; however, if both RBMY and PRY are deleted, spermatogenesis is arrested completely [109]. This indicates that these two genes are the major genes involved in fertility [108, 110].

Ribonucleic Acid [RNA]-Binding Motif, Y linked [RBMY]

RBMY is one of the most important genes of the AZFb region, with approximately six copies of this gene dispersed within the Y chromosome [111]. The RBMY1A1 gene family was identified [112] as a multicopy gene family designated YRRM (Y chromosome RNA recognition motif) and was the first candidate azoospermia factor (Fig. 2). The proteins of the RBM gene family are characterised by the presence of an N-terminal RNA recognition motif [RRM] responsible for their interaction with target RNA molecules. In contrast to the other RBM genes, RBMY1A1 contains a C-terminal protein interaction repeat domain enriched in serine, arginine, glycine, and tyrosine [SRGY]. This serves as a probable regulatory region for the modulation of RBMY1A1 function [113].

RBMY1A1 is involved in several aspects of meiotic and pre-meiotic regulation via the establishment of multiple protein-protein and protein-RNA complexes. RBMY1 encodes a testis-specific RNA binding protein that is expressed in the nuclei of spermatogonia, spermatocytes and round spermatids [114,115,116,117,118], and the expression of the protein is reduced in the testis of men with AZFb deletions [117]. Interestingly, the subcellular distribution of RBMY differs in spermatogenic cells as they progress through meiosis [115] (Fig. 3). In the spermatogonia, RBMY is localized as two foci, one in the nucleolus and the other in the subnuclear region. In the spermatocytes, however, RBMY is distributed in a punctate manner in the nucleus, but the subnuclear focus is retained. In the pachytene cells, RBMY is spread along the length of the condensing chromosomes. In the round and elongating spermatids, RBMY is excluded from the nucleus and restricted to the cytoplasm, while it is retained in the mid-piece of ejaculated sperm (Fig. 3). These observations suggest that RBMY must have diverse functional roles during different stages of spermatogenesis. Indeed, analysis of the human testicular RBMY-bound transcriptome has led to the identification of 20 target genes, some of which are testis specific and have diverse cellular functions, and RBMY is proposed to regulate alternative splicing during the course of spermatogenesis [119].

Fig. 3

Expression of RBMY during human spermatogenesis. Human testicular cells were separated by mild collagenase digestion, smeared on slides and fixed in acetone. The cells were probed using an antibody against human RBMY (Santa Cruz Biotechnology Inc., sc-14572, USA) and detected using a FITC-labeled secondary antibody. The cells were imaged under a fluorescence microscope and different stages were identified based on the cell and nuclear size. Green staining represents RBMY, red is nuclei. Bar represents 20 μm. For details of the methods see Abid et al. [115]

Molecular analysis of infertile men has indicated a positive correlation between the number of RBMY1 copies and sperm count and motility [120]. Deletion of RBMY1 copies leads to a decrease in sperm count; however, the persistence of the two proximal RBMY1 copies is sufficient to avoid spermatogenic failure despite the total absence of the PRY gene [121]. These reports emphasized that the presence of RBMY1 was sufficient for spermatogenesis to remain qualitatively complete although quantitatively reduced. Interestingly, unlike humans, Rbmy knockout mice undergo normal spermatogenesis but have abnormal spermatozoa [122], indicating that at least in rodents Rbmy is dispensable for spermatogenesis but might be required in spermiogenesis. Indeed, even in humans, RBMY protein is detected in elongating spermatids and also in ejaculated sperm [115] (Fig. 3). Sperm with high motility carry more RBMY1 protein than those with relatively low motility [120], and inhibition of RBMY using a functionally neutralizing antibody inhibits sperm motility in vitro [115].

Beyond the testis, RBMY has been shown to play a role in liver cancers: RBMY, otherwise expressed exclusively in the testis, was detected in 36% of cases of hepatocellular carcinoma (HCC), in 67% of cases of hepatoblastoma and also in a liver cancer cell line [123, 124]. Gain of function of RBMY in mouse fibroblasts or in vivo in the liver results in tumor formation [123, 124], whereas downregulating RBMY in liver cancer cells reduces their tumorigenic potential [124]. RBMY is reportedly increased mainly in the hepatic stem cells of patients with HCC, where it aberrantly activates the GSK3β-WNT-β catenin signaling complex, resulting in cell proliferation [124]. These observations imply that beyond the testis RBMY may act as an oncogene, and this might explain the male predominance of various cancers including HCC.

AZFc locus, its genes and its deletions

The AZFc is located at the distal part of deletion interval 6 (subintervals 6C–6E) on the Y chromosome (Figs. 1, 2). While the AZFa and AZFb regions are essential in initiating spermatogenesis, the AZFc region is essential to complete the process of spermatogenesis. The AZFc is the most commonly deleted region of the AZF locus in infertile men [27, 125].

The AZFc region spans 4.5 Mb and codes for 21 candidate genes and 11 families of transcription units (Fig. 2) that are exclusively expressed in the testis [100, 126]. This region contains six distinct families of amplicons: turquoise, grey and yellow, which occur twice in the AZFc locus; green, which occurs thrice; and blue and red, which occur four times each (Fig. 2). The members of each amplicon family are nearly identical and are arranged to form six large inverted repeats and three large direct repeats (Fig. 2). Three of the six inverted repeats are palindromes, or near palindromes, with large, inverted duplications containing much shorter intervening sequences. These three palindromes are believed to have formed as a result of duplications and tandem inversions during evolution [62]. Together, palindromes P1, P2 and P3 encompass 4.0 of the 4.5 Mb of sequence (Fig. 2). Each arm of palindrome P1 is 1.5 Mb long, giving a span of 3 Mb, with an arm-to-arm identity of 99.97%. Within the arms of P1 lie two smaller palindromes, P1.1 and P1.2, each spanning 24 kb. The three uncoloured segments u1, u2 and u3 occur once each in this region (Fig. 2). u1 shows 70–85% identity to a locus on Yp, u2 shows 70–90% identity to an interspersed Y-specific repeated locus and u3 falls within a 65-kb block of 99.7% identity to Yp. The uncoloured 2 kb segments at the centres of the P1 and P2 palindromes are identical to each other (Fig. 2). The repetitive and palindromic nature of this locus, and the fact that it is bordered by the highly repetitive heterochromatic region of Yq12, make this region highly susceptible to intra-chromosomal rearrangements during meiotic recombination, thus making the AZFc locus prone to deletions, duplications and copy number variations of the eight gene families that are harboured within it [62].
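The amplicon arithmetic in this paragraph can be checked with a short tally. The following Python sketch is purely illustrative: the variable names are ours, and the values are copied from the text above rather than derived from sequence data.

```python
# Amplicon copy numbers in AZFc as stated in the text (illustrative tally only).
amplicon_copies = {
    "turquoise": 2,
    "grey": 2,
    "yellow": 2,
    "green": 3,
    "blue": 4,
    "red": 4,
}

# Total number of amplicon units across the six colour-coded families.
total_amplicons = sum(amplicon_copies.values())

# Palindromes P1, P2 and P3 together cover 4.0 Mb of the 4.5 Mb AZFc sequence.
azfc_mb = 4.5
palindrome_mb = 4.0
palindrome_fraction = palindrome_mb / azfc_mb

print(f"Amplicon families: {len(amplicon_copies)}")
print(f"Total amplicon copies: {total_amplicons}")
print(f"Fraction of AZFc within palindromes: {palindrome_fraction:.1%}")
```

On these figures, the six families contribute 17 amplicon units in total and close to nine-tenths of the locus lies within palindromes, which is consistent with the susceptibility to rearrangement described above.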

There are no single-copy sequences in the AZFc (Fig. 2). The AZFc region includes 12 genes and transcription units, each present in a variable number of copies making a total of 32 copies. Amongst the various transcriptional units, only active copies of four protein-coding gene families map to the AZFc interval. These include the PRY2, BPY2, DAZ and CDY1. These genes locate to the blue, green, red and yellow-coded amplicons, respectively, with one transcription unit per amplicon copy (Fig. 2).

Deleted in Azoospermia [DAZ]

DAZ originated on the Y chromosome as a transposition of a DNA segment containing the autosomal germ-cell specific gene DAZL almost 35 million years ago [87, 127]. Two autosomal DAZ homologs, BOULE and DAZL, have also been identified in humans with BOULE being considered a founding member of the DAZ gene family due to its conservation across the metazoan lineages [128, 129]. A loss of function mutation of the Drosophila homologue boule results in azoospermia thus emphasizing the role of DAZ in spermatogenesis [129].

DAZ was the first candidate gene to be isolated from the AZFc locus and was originally identified as a frequently deleted gene on the Y chromosome of infertile males [60, 130]. It was later found that the AZFc region contains four DAZ genes arranged as palindromic duplications in two clusters, DAZ1/DAZ2 and DAZ3/DAZ4 [131,132,133]. The four DAZ copies are expressed in spermatogonia and encode an RNA-binding protein important for spermatogenesis [106]; these genes are expressed in all stages of germ cell development. Using human embryonic stem cells, it has been shown that DAZ family genes function in germ cell formation and meiotic progression [134]. Therefore, all DAZ family genes are regarded as critical for germ cell development.

Like the RBMY genes, the DAZ genes are known to contain an RNA recognition motif [RRM] and are expressed in the cytoplasm of pre-meiotic testicular germ cells, making these genes candidates for the maintenance of germ stem cell populations. The DAZ proteins are characterized by the presence of one or more DAZ repeats [24 amino acids rich in Asn, Tyr, and Gln residues], and in some cases the DAZ motif is thought to mediate interaction with other proteins [135]. The human DAZ proteins carry out transportation, translational activation of developmentally regulated transcripts and their storage [136].

Deletion of DAZ accounts for 10% of cases of males with spermatogenic defects [136, 137]. Infertile males showing a loss of copies of DAZ genes are highly predisposed to azoospermia or severe oligozoospermia [138,139,140]. However due to the presence of the functional homologue (DAZLA) on human chromosome 3, a direct association between deletions in DAZ and azoospermia is difficult to conclude. Studies have also identified a single nucleotide polymorphism of DAZL that confers susceptibility to defects in spermatogenesis [141] and the deletion of the DAZ1/DAZ2 doublet has been considered responsible for severe oligozoospermia, incomplete maturation arrest and for the testicular phenotype of residual spermiogenesis [140, 142].

While DAZ gene copies are deleted in men with complete and partial AZFc deletions, we and others have reported that individual DAZ copies can be deleted or duplicated even in the absence of complete deletions [138, 140, 143,144,145]. Although DAZ gene copy deletions/duplications are reported even in fertile men, deletion of DAZ and CDY1 copies results in reduced sperm count and motility [140, 143]. These observations underscore the importance of DAZ genes in spermatogenesis. The mechanism by which DAZ regulates spermatogenesis remains unexplored, although it is suspected to do so by regulation of RNA translation [146]. The targets of the human DAZ gene have not been identified; human DAZ is known to interact with several other proteins that do not necessarily have a role in protein translation [146], suggesting the possibility of alternate mechanisms.

Chromodomain Protein Y linked [CDY]

The human Y chromosome has two identical copies of this gene within the AZFc region [CDY1A and CDY1B] and a pair of closely related genes in the palindrome P5 [CDY2A and CDY2B, Fig. 2]. The CDY1 gene encodes a histone acetyltransferase containing an N-terminal chromatin-binding domain [chromo domain], which aids in the regulation of gene expression and chromatin remodelling. This protein has been reported to concentrate in the round spermatid nucleus, where histone hyperacetylation occurs and causes the replacement of histones by the sperm-specific DNA packaging proteins, TNPs and PRMs [114, 147]. Although found only in primates, CDY1 is believed to have been retroposed directly into MSY from a transcript of the autosomal gene CDYL more than 150 million years ago, making it one of the oldest genes on the Y chromosome [106].

The CDY1 and DAZ families display autosomal homologues and hence some degree of functional redundancy between the Y-borne and the autosomal copies may partially account for the production of mature sperm in AZFc deleted males. One of the two copies of the CDY1 gene is deleted in men with partial deletion of the AZFc locus [88, 105]. Based on analyses of gene copy deletions in the AZFc locus of infertile men, Machev et al. [148] discriminated four types of DAZ-CDY1 partial deletions and found that only one deletion type, DAZ3/4-CDY1a, was associated with male infertility. In similar studies, concomitant deletion of both DAZ and CDY1 copies predisposed males to azoospermia or severe oligozoospermia [140, 149], and deletion of the CDY1b copy alone is also reported to be associated with oligo/azoospermia [150]. However, the fact that some men with CDY1 deletions (alone or in combination with DAZ) are fertile/normozoospermic suggests that these genes are not indispensable for spermatogenesis [140, 151]. Beyond its function as a histone acetyltransferase, little is known about the molecular functions of CDY1.

Basic Protein Y linked, 2 [BPY2]

The BPY2 gene is expressed specifically in the testis and its protein product is involved in male germ cell development. Three nearly identical copies of this gene exist on the Y chromosome, BPY2A, BPY2B and BPY2C, of which two copies are located at the boundaries of the gr/gr deletion, flanking the DAZ gene clusters (Fig. 2). BPY2 is localized in the nuclei of spermatocytes, round spermatids and spermatogonia [152]. The BPY2 gene encodes a small positively charged protein which is thought to be involved in cytoskeletal regulation in spermatogenesis. Due to its small size and high charge, it is thought that BPY proteins may functionally interact with DNA in a manner that resembles chromatin-associated proteins such as histones and high mobility group proteins, which are known to play a role in the regulation of processes such as transcription, replication, recombination and DNA repair [153]. The frequency of BPY2 copy number alterations in infertile males is reported to be significantly high in the Chinese population [138] and also in the Indian population (Modi et al. unpublished data). Some genetic variants in the BPY2 gene are associated with SCOS [154].

Golgi autoantigen, golgi subfamily a, 2-like, Y-linked [GOLGA2LY]

GOLGA2LY exists as two copies, viz. GOLGA2P2Y and GOLGA2P3Y, on the AZFc locus, arranged in opposite orientation in palindrome P1 (Fig. 2). This gene is reported to be transcribed and expressed only in the testis, where it encodes a 108 amino acid protein [27, 106, 155]. However, we have failed to amplify specific GOLGA2LY transcripts in the human testis; sequence analysis showed that some residual bands in RT-PCR experiments arise from non-specific amplification of its homologs (Modi et al. unpublished data). GOLGA2LY protein has not been reported in any of the testicular proteomes [156]. Thus it is as yet unclear whether GOLGA2LY is a pseudogene or is transcriptionally active.

Partial AZFc deletion involves deletion of one of the two copies of the GOLGA2LY gene (Fig. 2). Although no function has been attributed to GOLGA2LY, it is interesting to note that males harbouring a GOLGA2P3Y deletion have low sperm concentration and motility compared with males without deletion or with deletion of GOLGA2P2Y, suggesting differential roles of the two copies in spermatogenesis [138, 143]. The frequency of GOLGA2P3Y deletion is significantly higher in oligozoospermic men compared with normozoospermic men (odds ratio of 9), whereas the frequency of GOLGA2P2Y deletion was comparable between oligozoospermic and normozoospermic men [157]. Furthermore, men with GOLGA2P3Y deletion have reduced sperm concentration and motility compared with men without deletion or with deletion of GOLGA2P2Y [157], suggesting that loss of GOLGA2P3Y is an independent risk factor for oligozoospermia. However, assuming that GOLGA2LY is a pseudogene and is not transcribed or translated in the testis, how these deletions result in infertility is difficult to ascertain. Positional shifts in the AZFc locus due to such deletions might offer a possible explanation.
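For readers less familiar with the statistic, the odds ratio quoted above is computed from a 2 × 2 table of deletion status against semen phenotype. The Python sketch below shows the calculation together with a Woolf (log-based) 95% confidence interval. The counts are hypothetical, chosen only so that the odds ratio equals 9; they are not the data of reference [157], and the function name and arguments are our own.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Return the odds ratio and its Woolf (log-based) 95% confidence interval."""
    or_value = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(or_value) - 1.96 * se_log_or)
    upper = math.exp(math.log(or_value) + 1.96 * se_log_or)
    return or_value, lower, upper

# Hypothetical counts (NOT taken from reference [157]).
or_value, lower, upper = odds_ratio(
    exposed_cases=18,       # oligozoospermic men carrying the deletion
    unexposed_cases=82,     # oligozoospermic men without the deletion
    exposed_controls=2,     # normozoospermic men carrying the deletion
    unexposed_controls=82,  # normozoospermic men without the deletion
)
print(f"Odds ratio = {or_value:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

With small numbers of deletion carriers among controls, the interval around such an estimate is typically wide, which is worth bearing in mind when interpreting a single-study odds ratio.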

Chondroitin Sulfate Proteoglycan 4 Pseudogene 1, Y-linked [CSPG4P1Y]

Like GOLGA2LY, the CSPG4P1Y family of transcription units exists in two copies (Fig. 2) and is regarded as a pseudogene [32]. One of the two copies of the CSPG4LY gene is deleted in the b2/b3 deletion within the AZFc region [105]. Not much is known about the functions of this gene or its copies in the maintenance of spermatogenesis.

Testis-Specific Transcript, Y-Linked 4 [TTY4]

The TTY4 gene has three copies (Fig. 2), TTY4A, TTY4B and TTY4C. This gene has not been studied in detail and is considered to be an RNA that does not encode any protein [88, 105]. TTY4 copy deletions are rare in the Indian population, but loss of one or more copies of TTY4C has been found to be associated with male infertility (Modi et al. unpublished data).

In summary, it appears that the Y chromosome harbours genes that play integral roles not only in spermatogenesis but also in several aspects of human health. Several Y-encoded genes are expressed not only in the testis but also in tissues involved in immune functions [Refer Additional file 1: Table S2]. However, the putative functions of several Y-borne genes still remain unknown and future studies may provide answers towards this end.

Y chromosome microdeletions and male infertility

The human Y chromosome is genetically dynamic and is prone to significant variation owing to the high proportion of segmental duplications, which form the basis of the wide variety of deletions and duplications seen in various loci of this chromosome. Since the Yq locus contains a large number of genes that are transcribed in the testis and have well-defined roles in spermatogenesis, loss of these regions would cause infertility. Clinically, the Y chromosome alterations can be classified as (1) AZF deletions (complete loss of one or more of the AZF loci), (2) partial AZFc deletions and duplications and (3) gene copy number variations (CNVs). Described below are the prevalence and association of these deletions with male infertility.

Yq microdeletions

Y chromosome microdeletions are small submicroscopic segmental deletions in the proximal Yq that remove all or part of the AZF region (complete deletions). Based on extensive deletion mapping studies in infertile men, five different deletion patterns on Yq have been reported [88]. However, in clinical practice, these are clubbed and termed as AZFa, AZFb, and AZFc deletions (Fig. 2). These deletions are one of the leading causes of spermatogenic failure and hence screening for AZF deletions has become part of the routine diagnostic work-up of infertile men [158,159,160]. With the availability of specific markers, simple PCR strategies to study the AZF locus are now in use in various parts of the world and have established the prevalence of Yq microdeletions. With the exception of a few early reports, and with refinements in the technologies, Yq microdeletions are detected exclusively in men with abnormal spermiograms and have not been observed in large series of fertile men [158,159,160], suggesting that these deletions are the cause of failure of spermatogenesis and hence infertility. However, there are case reports where fathers of infertile men have been shown to carry these Yq deletions [161, 162], and this casts doubt on whether these deletions themselves are sufficient to cause infertility or whether additional defects in the genome are required to manifest the phenotype.

Based on global data, Yq microdeletions are estimated to occur in about 1:4000 men in the general population, but their frequency in infertile men is about 1:12. Of the > 30,000 Y chromosomes analysed for AZF microdeletions (by STS-PCR method), the global prevalence of AZF microdeletions in infertile men is estimated to be 7% (95% CL 6.74–6.79) (Fig. 4). There is wide variation in the frequency of Yq microdeletions in different parts of the world (Fig. 4); this could reflect underlying differences in sample size, methodology used and the population screened. The country-wise published data of Yq deletions in infertile men are provided in Additional file 1: Table S1. However, pooled estimates based on geographic location suggest that the lowest prevalence of Yq microdeletions is in Europe (3%, 95% CL 2.9–3.0) and Australia (5.3%, 95% CL 5.9–7.8), while the rest of the world has an average of 8–9% (Fig. 5). What makes European and Australian infertile men less susceptible to Yq microdeletions warrants investigation. Amongst Asians, we observed that the prevalence of Yq microdeletions is highest amongst East and South East Asians and lowest amongst South Asians (Fig. 5). This indicates that the susceptibility of the Yq to undergo microdeletions is perhaps race or ethnicity dependent. The Y chromosome genetic background (e.g. haplogroups) has been suggested to influence the susceptibility to partial AZFc deletions (see below). Whether these haplogroups also contribute to the occurrence of Yq microdeletions needs to be investigated. Since the global data are collated from diverse groups and not all studies have reported the ethnicity of the population investigated, we are unable to determine the influence of different genetic backgrounds on the prevalence of Yq microdeletions.
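To make the pooling step concrete, the sketch below shows one way in which per-study counts could be combined into a single prevalence with a 95% interval. It is a minimal illustration under stated assumptions: the study counts are hypothetical, the function name is ours, and the Wilson score interval is our choice, since the method used to compute the confidence limits quoted above is not described here. The last lines compare the two frequencies given in the text (1:12 versus 1:4000).

```python
import math

def pooled_prevalence(studies, z=1.96):
    """Pool (n_deleted, n_screened) pairs; return prevalence and a Wilson 95% interval."""
    deleted = sum(d for d, _ in studies)
    screened = sum(n for _, n in studies)
    p = deleted / screened
    # Wilson score interval for a binomial proportion.
    denom = 1 + z**2 / screened
    centre = (p + z**2 / (2 * screened)) / denom
    half_width = z * math.sqrt(p * (1 - p) / screened + z**2 / (4 * screened**2)) / denom
    return p, centre - half_width, centre + half_width

# Hypothetical per-study counts: (men with a Yq microdeletion, infertile men screened).
studies = [(30, 400), (12, 250), (45, 500)]
prevalence, ci_low, ci_high = pooled_prevalence(studies)
print(f"Pooled prevalence: {prevalence:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")

# Frequencies stated in the text: about 1:12 in infertile men, 1:4000 in the general population.
fold_enrichment = (1 / 12) / (1 / 4000)
print(f"Enrichment in infertile men: about {fold_enrichment:.0f}-fold")
```

Simple pooling of this kind weights each study by its sample size and ignores between-study heterogeneity, so differences in methodology and population would still need to be considered separately.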

Fig. 4

World map depicting the prevalence of Yq microdeletions in infertile males in different countries. The prevalence of Yq microdeletions in different countries of the world was estimated from published data of 40,127 Y chromosomes from infertile men (oligozoospermic or azoospermic men). Only those articles published in English were considered, and the total number of infertile men studied and those having deletions were recorded along with the country. For each country, data from different studies were pooled and the average estimated

Fig. 5

Prevalence of Yq microdeletions in infertile men. The average prevalence of Yq microdeletions in different continents of the world was estimated from published data of 40,127 Y chromosomes from infertile men. Infertile men could be oligozoospermic or azoospermic. The pie chart gives the distribution of Yq microdeletions in the Asian region. The numbers were estimated from the data of Asian men based on geographical location. In both cases, only those articles published in English were considered, and the total number of infertile men studied and those having deletions were recorded along with the country. Data from different studies from the same continent were pooled and the average estimated (for details see Additional file 1: Table S1)

Irrespective of the population under study, the AZFc is the most frequently deleted locus in infertile men [60–70%] followed by AZFa [0.5–4%], AZFb [1–5%] and AZFb+c [1–3%] deletion [11]. While these are global estimates, it is intriguing that the frequencies of the various AZF loci that are deleted in infertile men differ amongst populations. While the frequency of AZFc deletion is lower in the Indian population as compared to western counterparts (45% versus 60%), the frequency of AZFa deletion is almost double (11% versus 5%). Further, the frequency of double deletions (AZFa+b, AZFb+c) is also higher in the Indian population than in the world literature [159]. A similar difference in the frequency of deletion of different AZF loci has been observed in the Iranian population [163]. There are also unusual combinations of deletions, such as the AZFa+c deletion, observed in some populations but not others [159, 164,165,166]. It has been suggested that since the unusual combinations of deletions are detected only by isolated markers and not confirmed by additional analyses, these are perhaps methodological artefacts [158]. Nevertheless, it is important to note that these deletions recurred in various populations and were detected using a variety of STS markers exclusively in infertile men; hence these deletion patterns should not be disregarded. Additional data will be needed to understand the molecular basis of such deletions.

Phenotypic manifestations of Yq microdeletions

AZF deletions are specific to infertile men and hence it is appropriate to consider Y deletions as a cause of oligo/azoospermia rather than a cause of ‘infertility’. Overall, 25–55% of males with extreme testicular pathologies such as hypospermatogenesis, sperm maturation arrest and SCOS and 5–25% males with severe oligozoospermia or azoospermia harbour Y chromosome microdeletions [167]. However, depending on the AZF locus deleted the phenotypic manifestations reportedly vary.

AZFa deletions

In general, deletions of the entire AZFa region invariably result in Sertoli cell only syndrome (SCOS) and azoospermia [58, 168,169,170,171,172]. Since genes within the AZFa locus are expressed in the germ cells prenatally, it is possible that the loss of these genes leads to germ cell death during development, resulting in SCOS [72]. Partial AZFa deletions are, however, associated with phenotypes ranging from azoospermia to normozoospermia [173], indicating that the amount of genetic content lost is a critical determinant of azoospermia due to AZFa deletions. Hence the diagnosis of a complete deletion of the AZFa region implies that it is virtually impossible to retrieve testicular spermatozoa for intracytoplasmic sperm injection (ICSI).

AZFb deletions

The genes in the AZFb locus support the growth and maturation of sperm and are considered critical for efficient progression of sperm through meiosis into spermiogenesis. Not unexpectedly, patients with deletions of the AZFb region have a testicular phenotype of maturation arrest, frequently at the spermatocyte stage with an absence of post-meiotic germ cells in the majority of the tubules [6, 86, 174]. A clear correlation between the extent of the AZFb deletion and testicular histological findings is observed. Hypospermatogenesis is frequently observed among partial AZFb/b+c deleted specimens, but more severe findings are observed in complete AZFb and AZFb+c deleted specimens, indicating that the AZFb deletions do not confer a severe phenotype but cause a block in spermatogenesis. The maturation arrest phenotype associated with AZFb deletions most probably stems from a combination of genetic disruption with structural defects in the chromosome [175].

It is generally accepted that the chance of retrieving mature sperm cells either from the ejaculate or via testicular sperm extraction is negligible in cases with AZFb deletion [175,176,177], although sperm have been retrieved from the testis of men harbouring AZFb deletions [178,179,180,181,182]. While these differences could reflect heterogeneity in the extent of deletion in different patients, the likelihood of finding sperm cells is significantly lower in males with complete than with partial AZFb deletions [175].

AZFc deletions

Men with AZFc deletions have by far the most variable phenotype, ranging from complete azoospermia to mild oligozoospermia [126]. In general, the testis of men with AZFc deletions will have elongating spermatids, and sperm can be retrieved in a reasonable number of AZFc deleted infertile males [166, 170]. In most men with AZFc deletion, spermatogenesis is completed, but on a reduced scale resulting in oligozoospermia [171].

By semen analysis, complete AZFc deletions are associated with a drastic reduction in sperm count and most AZFc deleted males are severely oligozoospermic [158, 159, 171]; however, some may even be azoospermic [158, 159]. There also exist rare AZFc deleted males who have conceived multiple children naturally, but all of the sons of these males have been found to be infertile [151, 161, 162].

While the AZFc deletions by themselves are less pathogenic, prospective follow-up case studies have shown that in a subset of men with AZFc microdeletions there is a progressive decline in sperm count, and the patient progresses from oligozoospermia to its severe form or even to azoospermia [183,184,185]. This implies a temporally deteriorating effect of AZFc deletions on spermatogenesis, leading to worsening of sperm numbers and quality. Thus, although men with AZFc deletion may have sperm in the ejaculate, they must be offered semen cryopreservation to avoid the need for invasive techniques like TESA for sperm retrieval in later stages of life.

Partial AZFc deletions

The AZFc region is composed of repeated sequences and palindromes, making it the region most vulnerable to deletions. A complete deletion of AZFc involves the b2/b4 region, which contains 12 genes and transcriptional units in multiple copy numbers (Fig. 2). In addition to b2/b4, the AZFc locus has many partial deletions that include b1/b3 (1.6 Mb), b2/b3 (1.8 Mb) and gr/gr (1.6 Mb) [186]. These deletions remove different portions of the AZFc; however, the genes removed are largely the same (Fig. 2). Three types of gr/gr deletions are reported that are caused by homologous recombination flanking the g1/g2, r1/r3 and r2/r4 amplicons in the P1 and P2 palindromes in the AZFc (Fig. 2). However, all three are referred to as gr/gr and are not classified further during analysis.

None of the partial deletions completely eliminates any gene family; they only reduce the copy number of the gene families. The exception is the b1/b3 deletion, which results in the loss of the six copies of RBMY1 and both functional copies of PRY (Fig. 2). The gr/gr deletion results in the loss of two of the four DAZ gene copies, one of the two CDY1 copies and one of the three BPY2 copies. The b2/b3 deletion is derived from a gr/gr or b2/b3 inversion and removes almost the same gene content as gr/gr (Fig. 2).
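The copy-number bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference copy numbers and the copies lost are limited to the figures quoted in the text for the gr/gr and b1/b3 deletions, the b2/b3 deletion is omitted because its exact losses are not listed here, and the names `REFERENCE_COPIES`, `COPIES_LOST`, `remaining_copies` and `eliminated_families` are hypothetical.

```python
# Illustrative sketch only; not a validated annotation of the AZFc region.
# Reference copy numbers are the ones quoted in the text above.
REFERENCE_COPIES = {"DAZ": 4, "CDY1": 2, "BPY2": 3, "RBMY1": 6, "PRY": 2}

# Copies lost per partial deletion, restricted to the losses stated in the text.
# A real deletion may remove further sequence that is not encoded here.
COPIES_LOST = {
    "gr/gr": {"DAZ": 2, "CDY1": 1, "BPY2": 1},
    "b1/b3": {"RBMY1": 6, "PRY": 2},
}


def remaining_copies(deletion):
    """Return the copy number left for each gene family after a deletion."""
    lost = COPIES_LOST[deletion]
    return {gene: n - lost.get(gene, 0) for gene, n in REFERENCE_COPIES.items()}


def eliminated_families(deletion):
    """Return the gene families whose copy number drops to zero."""
    return sorted(g for g, n in remaining_copies(deletion).items() if n == 0)


if __name__ == "__main__":
    for name in COPIES_LOST:
        print(name, remaining_copies(name), eliminated_families(name))
```

Under these assumptions, the gr/gr entry reduces the dosage of three gene families without eliminating any of them, whereas the b1/b3 entry is the only one that removes whole families.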

Prevalence of partial AZFc deletions and male infertility

The AZFc is of particular interest to both geneticists and reproductive biologists because it harbours multicopy genes and provides an opportunity to study gene dosage effects in the regulation of spermatogenesis. It is believed that the partial deletions lead to extensive copy number variations in the AZFc, which might compromise the amount of protein produced and thereby affect spermatogenesis. This has formed the basis for the analysis of partial AZFc deletions in the clinical evaluation of infertile males. The ease of a well-designed PCR strategy has further fuelled this practice in various clinics and laboratories globally.

Since their first description, a wealth of information regarding partial AZFc deletions has accumulated over the past 15 years, and it is clear that the association of partial AZFc deletions with male infertility is not as clear-cut as in the case of complete AZF deletions. This is because 1) there is wide heterogeneity in the types of partial AZFc deletions and 2) fertile and normozoospermic men also harbour partial AZFc deletions. Thus, doubts have been cast on the involvement of partial AZFc deletions in male infertility. Three well-designed meta-analyses and one study of > 20,000 Y chromosomes published in recent times have analysed the association of partial AZFc deletions and male infertility [126, 167, 187, 188]. Described below is the picture that has emerged from these studies with large sample sizes.

Several controversies exist about the association of gr/gr deletion with male infertility. Early on, it was reported that the gr/gr deletion was present in 3.8% of infertile males and 2.2% of fertile males (of unknown sperm counts) but absent in normozoospermic males [186], suggesting that the gr/gr deletion was associated with decreased sperm production resulting from the reduced dose of AZFc genes. However, subsequent studies in many populations have yielded conflicting findings, with gr/gr deletions identified even in fertile males with normal sperm counts (see [167, 187]). While some studies have suggested that the frequency of gr/gr deletion is significantly higher in infertile men than in fertile/normozoospermic controls, others have reported no such association. In two meta-analyses [167, 187], however, a significant association of gr/gr deletions with male infertility has been observed. We analysed the available data [167] and observed that, of the 10,978 Y chromosomes analysed from infertile men and 6704 from fertile/normozoospermic controls, twice the number of infertile men had gr/gr deletions as compared to controls (Fig. 6), with an OR of 1.8. Interestingly, gr/gr deletions are common in African and Asian men (10–15%), whereas they occur at a frequency of less than 5% in other populations (Fig. 6). Furthermore, the association of gr/gr deletion with male infertility is observed in Asian, European and Australian men; the association is weak or negligible in American and African men. The association of gr/gr deletion and male infertility is also ethnicity dependent [167, 187]. Amongst the different races, the association is strongest in Caucasian men and weaker in Mongolian men, but no association is observed in Dravidian and Negro-Caucasian lineages (Fig. 6).

Why the prevalence of gr/gr deletions and its association with male infertility differ between populations (by geography and ethnicity) is a matter of debate. It is suggested that such differences in the susceptibility to infertility conferred by gr/gr deletions depend on the Y chromosome background, mainly the haplogroups [6]. A second school of thought points to the high heterogeneity in the gene copies deleted in men with gr/gr deletions. It has been shown that not all men with gr/gr deletions have lost identical amounts of DNA. Based on copy number estimates of the DAZ and CDY1 genes, it is observed that only those men with two DAZ copies and one CDY1 copy deleted will have azoospermia or oligozoospermia; retention of any one of these gene copies will almost always be associated with normozoospermia [139, 140, 143, 145, 149]. Along with DAZ and CDY1, studies have also identified that GOLGA2LY and BPY2 copy numbers have a protective effect on the fertility of men with gr/gr deletions [138, 143, 157]. Thus, it appears that the gene content and the gene dosage effect in the AZFc locus are determinants of fertility in men harbouring gr/gr deletions.
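For readers unfamiliar with how an odds ratio such as the one quoted above is obtained from a 2 × 2 table of deletion carriers and non-carriers, a minimal Python sketch follows. It assumes the standard cross-product estimate with a Woolf (logit) 95% confidence interval, which is not necessarily the method used in the cited meta-analyses. The function name `odds_ratio` and the counts in the usage example are hypothetical and are not taken from any study.

```python
import math


def odds_ratio(del_cases, nodel_cases, del_controls, nodel_controls, z=1.96):
    """Return (OR, lower, upper) for a 2 x 2 table.

    Uses the cross-product odds ratio and Woolf's logit interval.
    All four cell counts must be non-zero for this simple form to work.
    """
    or_ = (del_cases * nodel_controls) / (nodel_cases * del_controls)
    # Standard error of log(OR) under Woolf's approximation.
    se = math.sqrt(
        1 / del_cases + 1 / nodel_cases + 1 / del_controls + 1 / nodel_controls
    )
    log_or = math.log(or_)
    return or_, math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Hypothetical toy counts, chosen only to illustrate the arithmetic:
    # 20 of 100 infertile men and 10 of 100 controls carry the deletion.
    estimate, lower, upper = odds_ratio(20, 80, 10, 90)
    print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that includes 1 means the difference between the groups is not significant at the chosen level, which is one reason individual small studies can disagree while pooled estimates reach significance.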

Fig. 6

Association of gr/gr and b2/b3 deletions with male infertility. Data were obtained from previous studies [105, 126, 167, 188]. Data for gr/gr are derived from 10,978 and 6704 Y chromosomes from infertile and fertile men respectively. For b2/b3 the data are derived from 9981 and 5990 Y chromosomes from fertile and infertile men respectively. Infertile men could be oligozoospermic or azoospermic. Fertile men would be normozoospermic men or proven fertile men with unknown sperm counts. Data were divided by continent or by race. * indicates a value significantly different from the fertile counterpart

Nevertheless, irrespective of the population, in individual studies as well as in pooled estimates, men with gr/gr deletions have significantly lower sperm counts and motility than non-deleted controls [126, 167, 187], indicating that the gr/gr deletion might not be a direct cause of infertility but does compromise the process of spermatogenesis, leading to lower sperm counts.

b2/b3 deletion

Compared with gr/gr, the b2/b3 deletion is slightly larger (1.8 Mb versus 1.6 Mb) and is derived from an inversion (Fig. 2). Unlike gr/gr, the correlation of b2/b3 partial deletions with spermatogenic failure is uncertain. Analysis of 17,784 Y chromosomes from infertile men and 11,684 Y chromosomes from fertile/normozoospermic men [188] showed a poor (but statistically significant) correlation of b2/b3 deletion with male infertility (Fig. 6). As in the case of gr/gr, the prevalence of b2/b3 deletion and its association with male infertility differ geographically and with the ethnicity of the study population. Irrespective of fertility status, the prevalence of b2/b3 deletion is highest amongst Asians (~ 3%) and lowest in Europeans (~ 0.5%), and in both these cases there is no significant difference in its frequency between fertile and infertile men; no association of b2/b3 deletions with male infertility has been noted in Australian men (Fig. 6). Intriguingly, while gr/gr deletions are not significantly associated with male infertility in African men, the prevalence of b2/b3 deletion is higher in infertile men of African origin than in their normozoospermic/fertile counterparts (1.17 vs. 0.25), and this difference is significant (Fig. 6). A similar association is observed in Chilean men, where b2/b3 deletions are exclusively reported in infertile men. However, the b2/b3 deletion does not seem to be associated with male infertility in American men [126].

Like gr/gr, the association of b2/b3 deletions with male infertility also seems to be ethnicity dependent. The b2/b3 deletion is strongly associated with male infertility in Dravidian and Negro-Caucasian men, weakly in Mongolian men, but not in Caucasian men (Fig. 6). This pattern seems to be complementary to that of gr/gr deletions. The gr/gr but not the b2/b3 deletion is strongly associated with male infertility in Caucasian men, whereas the b2/b3 but not the gr/gr deletion is associated with male infertility in Dravidian and Negro-Caucasian men. In the Mongolian population both deletions carry a risk of infertility, albeit weakly (Fig. 6). These observations are interesting and suggest that, irrespective of the type of deletion and the population, the loss of gene copies in AZFc increases the susceptibility of an individual to a reduction in sperm count. However, why one kind of deletion is pathogenic in one population but not in another needs further investigation.

b1/b3 deletion

The b1/b3 deletion differs from the gr/gr and b2/b3 deletions because it encompasses part of AZFb and results in the loss of the six copies of RBMY1 and both copies of PRY (Fig. 6). Very few studies have reported an association of b1/b3 deletion with male infertility. In an analysis of 20,000 Y chromosomes [126], the b1/b3 deletion was estimated to occur in one of every 994 men and to increase the risk of spermatogenic failure (OR 2.5). However, this deletion is very infrequent in most populations, and hence its association with male infertility is not known in most of them.

Copy Number Variation in Y chromosome and male infertility

A Copy Number Variation [CNV] is defined as a DNA segment longer than 1 kb that is present at a variable copy number in comparison with a reference genome [189]. CNVs produce phenotypes via diverse mechanisms, including gene dosage, gene interruption, creation of a fusion gene, unmasking of a recessive coding region mutation or other functional SNPs, and position effects. All the different AZF microdeletions can be considered large CNVs because they modify the copy number of genes on the Y chromosome. Further, the unstable AZFc region is predisposed to undergo several types of rearrangements, resulting in changes in the cumulative copy number of the gene families contained within it. These complete or partial AZF deletions, which lead to CNVs of the genes within them, compromise spermatogenesis [138, 140, 143, 157, 190]. In addition to the AZF loci, CNVs across the entire length of the Y chromosome have been investigated in male infertility [125, 191,192,193]. In a CNV analysis of the MSY in healthy males, large duplications encompassing the AMELY and TBL1Y genes, partial deletions of the TSPY cluster, a variation in the number of RBMY genes and duplication of two segments overlapping the AZFa region have been reported [194], indicating that the Y chromosome is highly polymorphic in its genetic content in the general population. To understand the association of gene CNVs and male infertility, 68 well-defined, histopathologically confirmed cases of males with idiopathic testicular maturation arrest were analysed by SNP arrays [103]. The results revealed PAR 1 and PAR 2 CNVs in 5 cases, PAR 3 CNVs in 19 cases and TSPY2 gene gain in 16 cases. Because men with normal spermatogenesis were not investigated in this study, it is difficult to ascertain the significance of these CNVs.

However, the discovery of PAR CNVs is biologically significant because X and Y chromosome pairing is made possible by the PARs, and aberrations in chromosome structure at the PARs due to these CNVs would logically disrupt meiotic synapsis, resulting in gametogenic failure; this could be one of the aetiological factors underlying maturation arrest. In addition, the PAR regions are rich in genes that escape X inactivation to maintain dosage compensation (Fig. 1); it is therefore plausible that CNVs (deletions or duplications) of the PARs result in under- or over-expression of these genes, producing a pathological phenotype. However, with the exception of genes in AZFc, no direct studies have compared CNVs with gene expression and spermatogenic arrest.

Clinical significance of genetic testing of Y chromosome and male infertility

Although there is universal agreement on the need for chromosome analysis by karyotyping in the workup of male infertility, there is still a lack of consensus regarding the clinical utility of testing for Y chromosome microdeletions in azoospermic and oligozoospermic males. While the American Society of Reproductive Medicine recommends the use of both karyotyping and Y chromosome microdeletion studies in males preparing to undergo Intracytoplasmic Sperm Injection [ICSI], the National Institute for Health and Care Excellence recommends only karyotyping for this group of patients [4]. In which patients molecular screening of the Y chromosome should be performed remains a dilemma. Based on data from several thousand patients, Yq microdeletions are found in a high proportion of patients with azoospermia or severe oligozoospermia; the occurrence of deletions in infertile patients with a sperm concentration > 5 × 10⁶/mL is usually lower [195]. However, there are population-specific differences [159], and in this scenario country-specific guidelines are necessary. While analysis of Yq microdeletions is not indicated in patients with chromosomal abnormalities, obstructive azoospermia or hypogonadotropic hypogonadism, there are a number of examples of deletion carriers among non-idiopathic infertile men, including those with Klinefelter syndrome and varicocele [159]. Therefore, the presence of any diagnosis accompanied by azoospermia or severe oligozoospermia should be an indication for AZF testing. We recommend Yq microdeletion testing in routine clinical practice for the following reasons:

  1. Identifying the cause of infertility: The Y chromosome contains several genes required for spermatogenesis, and loss of one or more of these genes can impair this process. By investigating the presence of Y chromosome microdeletions it is possible to determine the underlying genetic aetiology of male factor infertility and implement appropriate screening strategies for abnormal phenotypes. Knowledge of these deletions will also help clinicians to provide more effective solutions to problems faced by infertile couples. For example, low sperm count and motility can be treated with hormones, anti-oxidants and lifestyle changes to improve the seminogram. However, these treatment strategies will fail if the cause of infertility is genetic. AZF screening is also important before varicocelectomy because deletion carriers will most likely not benefit from the surgical procedure. Therefore, if a deletion is detected in the male partner, the couple can directly be offered assisted reproductive techniques [ART] and need not be subjected to medical treatments to improve sperm count and motility.

  2. Predicting the prognosis of infertile males: While there exist numerous case reports of AZFc-deleted men fathering children, it is clinically observed that individuals harbouring Y chromosome microdeletions have a progressive decline in sperm counts and can progress to azoospermia over time (discussed above). Thus, males with mild or moderate oligozoospermia and Yq microdeletions would require multiple follow-ups to monitor their possible progression to azoospermia. Knowledge of the Yq status would therefore aid in counselling these men and in offering them the option of sperm cryopreservation for future biological parenthood.

  3. Predicting outcome of testicular sperm aspiration [TESA]: Most males with Y chromosome microdeletions would be infertile and have no sperm or very few sperm in the ejaculate. To achieve pregnancy, sperm can be retrieved directly from the testes using techniques such as testicular sperm extraction [TESE] or testicular sperm aspiration [TESA]. The occurrence and type of Yq microdeletion has been found to correlate with the testicular phenotype and the chance of sperm retrieval. Hence, screening for Y chromosome microdeletions can help predict the chance of achieving biological parenthood before invasive procedures are undertaken.

  4. Predicting the success of ART: Most men with Yq microdeletions would require ART for biological parenthood. Although still controversial, studies have reported a slower fertilization rate, poor embryo quality, an impaired blastocyst rate and lower overall success of ART in men with AZF deletions [140, 157, 166, 195]. Hence, Yq microdeletion screening would also help in counselling couples regarding the probable success rates of ART and aid them in rational decision making.

  5. Prevention of vertical transmission of the genetic defects: With the advent of ICSI, several live births have been reported in couples where the male partner has a Yq microdeletion. However, in all these cases the Yq microdeletion is passed down to the male offspring, who are also infertile, resulting in 100% transmission of the genetic defect and infertility from the fathers. In some instances a partial AZFc deletion in the father has resulted in a full deletion in the offspring [196], which is a definite cause of spermatogenic failure. Thus, Y chromosome microdeletion testing is highly recommended for all infertile males who opt for ICSI, so that the couple can make an informed choice about biological parenthood at the risk of perpetuating infertility in the family.

  6. Risk of testicular cancers: The Y chromosome has long been a suspected candidate in gonadal cancers. Individuals with gonadal dysgenesis who bear a full Y chromosome or even fragments of it (even in a few of the cells) have a high risk of developing gonadal tumours, specifically gonadoblastoma [197,198,199,200]. In addition, Yq microdeletions and/or Y chromosome CNVs are suggested to be a risk factor for the development of testicular cancer [201]. These are limited observational studies, and results of large case-control studies are still awaited. Nevertheless, it is becoming increasingly apparent that, beyond infertility, knowledge of Yq microdeletions could also aid in predicting the occurrence of cancers in men.

  7. Neuropsychiatric disorders and Yq microdeletions: While Yq microdeletions are strongly associated with infertility, reports are now emerging that at least a subset of men with Y chromosome defects have a higher prevalence of mental disorders. In a recent analysis of 42 Chilean patients diagnosed with Yq microdeletions, a significantly higher proportion of men with abnormal height was seen amongst patients with terminal AZFb+c deletions compared with infertile cases without microdeletions. Intriguingly, 5/42 men (12%) had some form of neuropsychiatric disorder, including two with bipolar disorder and three with severe clinical depression. Clinical histories also documented language delay, attention-deficit hyperactivity disorder (ADHD) and emotional and behavioural problems, including anxiety and social disabilities, in this cohort [202]. However, none of the infertile patients with non-terminal Yq microdeletions had any medical history of neuropsychiatric abnormalities [202]. While further long-term clinical follow-up data on men with Yq microdeletions are required, these preliminary observations do indicate the occurrence of health risks beyond infertility in such men, and this warrants testing of Y chromosome genetics in the clinical setting.

Conclusion and future directions

Male infertility is a complex multifactorial condition that presents with highly heterogeneous phenotypes. The Y chromosome plays a central role in the regulation of spermatogenesis because it harbours Y-linked genes that are expressed in the testis and are involved in various processes during spermatogenesis. The importance of these genes is evident from the observation that their removal causes distinct pathological testis phenotypes. Twenty years after the first molecular definition of the AZF, Yq deletion screening has become a routine test for infertile males in many countries to identify the cause of male infertility. Given the clear-cut cause–effect relationship with severely impaired spermatogenesis, this test now even helps in determining the success rates of sperm retrieval and in predicting the success of assisted reproduction. Further, as the deletion has a 100% transmission rate to male offspring, the couple needs to be aware that the males in the next generation will also be infertile. Moreover, as gr/gr deletions are known to undergo expansion leading to full deletions [138, 196], the couple should be aware that, besides the obligatory transmission of a genetic risk factor for impaired sperm production (gr/gr deletion) to the male offspring, there is a higher risk of transmitting a complete AZFc deletion, which is a causative factor for spermatogenic failure.

Beyond these immediate applications, there are some clinically relevant issues associated with Yq deletions that need urgent attention. Presently, there are no long-term follow-up data on men harbouring Yq microdeletions, and there is an urgent need for data on the health status of children born to AZF deletion carriers. This is especially relevant for two conditions: 1) an increased risk of testicular cancers and 2) the possible occurrence of neurological dysfunctions. Owing to the unstable nature of the deleted Y chromosome, there is an increased risk of expansion of deletions, which might be viewed as a form of "genomic instability". Such progressive loss of genetic material during spermatogenesis might lead to sex chromosome mosaicism in germ cells, which might predispose them to testicular/germ cell tumours. However, little data exist on the incidence of testicular tumours in men with Yq microdeletions, especially in second-generation males born to fathers carrying the deletion. Secondly, recent reports have demonstrated a significantly higher deletion load not only on the sex chromosomes but also on the autosomes of infertile men [203, 204], indicating more widespread effects of such deletions on genomic stability. Although many Y-linked genes are also expressed in multiple tissues, how genomic instability and disturbances in gene expression owing to Yq deletions affect general physiological functions has not been investigated, and the long-term implications of such effects are obscure. With the recent data on a higher prevalence of neurological problems in infertile men with Yq deletions, it is imperative that we carry out detailed analyses of men with Yq deletions with an outlook beyond infertility. We hope that careful clinical observations coupled with detailed genetic information will provide important insights into these unanswered basic questions and give a different perspective to the field of androgenetics.