Introduction

Tetraploid upland cotton (Gossypium hirsutum) is the most widely cultivated cotton species and the most important source of natural fiber for the textile industry (Zhu 2016). The genomes of tetraploid cotton (G. hirsutum, AADD, 2n = 4x = 52) and its diploid ancestors (Gossypium arboreum, AA, 2n = 2x = 26; Gossypium raimondii, DD, 2n = 2x = 26) have been sequenced in recent years (Paterson et al. 2012; Wang et al. 2012; Li et al. 2014, 2015; Zhang et al. 2015). Benefiting from the publication of the upland cotton genome (70,478 predicted protein-coding genes), a large number of functional genes and gene family members involved in cotton fiber development have been reported (Huang et al. 2015; Guo et al. 2016; Zhang et al. 2016; Wu et al. 2017). However, deeper mining of the genome data is still needed.

Reactive oxygen species (ROS) are continuously produced in all aerobic organisms and act as important regulators of cellular responses to environmental factors in plants (Alscher et al. 1997; Pandey et al. 2017). H2O2, a signaling molecule known as the major active ROS type, has been shown to regulate plant cell development and stress resistance, including root hair initiation and elongation, drought and salinity resistance, and temperature stress responses (Pei et al. 2000; Panchuk et al. 2005; Pinheiro and Chaves 2011; Qu et al. 2013). ROS can synergize with or antagonize many cellular regulatory circuits through active interaction with other signals and plant hormones during growth, development, and stress responses (Petrov and Breusegem 2012; Tao et al. 2016). Despite their vital role in cell development, ROS accumulation can also cause severe damage in plant cells. Accordingly, plants have developed a complex antioxidant system to prevent ROS-induced cellular damage, including ascorbic acid (AsA), glutathione (GSH), and carotenoids (Smirnoff 2000).

Ascorbate peroxidase (APX, EC 1.11.1.11) is a family of type I heme-containing peroxidases that catalyze the reduction of H2O2 to water using ascorbate as the specific electron donor, thereby maintaining cellular reduction/oxidation (redox) homeostasis by scavenging ROS (Foyer and Halliwell 1976; Noctor and Foyer 1998; Sharp et al. 2003; Suzuki et al. 2012). APXs are encoded by small multigene families in higher plants and are classified into different groups according to their subcellular localization (Teixeira et al. 2004). In Arabidopsis, eight APXs have been identified: three in the cytosol, three in the peroxisome, and two in the chloroplast (Shigeoka et al. 2002; Chew et al. 2003). The APX family has been fully characterized in rice and tomato based on their genomes (Teixeira et al. 2006; Najami et al. 2008).

Cotton fiber, composed of numerous non-branched single cells, is an ideal model for investigating cell growth (Li et al. 2017). ROS promote cell expansion by participating in plant cell wall loosening (Cosgrove 2000; Liszkay et al. 2004), and have been shown to play a vital role in cotton fiber cell elongation (Li et al. 2007; Qin et al. 2008; Mei et al. 2009). Previously, we reported that GhAPX1 plays a significant role in cotton fiber elongation through its involvement in the ethylene signaling pathway (Li et al. 2007). Over-expression of cotton GhAPX1A/D increased fiber resistance to H2O2 stress (Guo et al. 2016). However, the composition of the whole APX family and the expression patterns of APX genes in G. hirsutum remain unclear.

In this work, we performed a genome-wide investigation and expression profiling of the APX family in G. hirsutum. A total of 26 GhAPX genes were identified. Phylogenetic and gene structure analyses classified these APX members into five clades, and syntenic analysis suggested two duplication events. Expression profiling of the 26 APXs revealed that ten members are expressed in cotton fibers. Interestingly, GhAPX10A, GhAPX10D, GhAPX12A, and GhAPX12D showed high expression levels in 30-day fibers, while GhAPX1A/D, GhAPX3A/D, and GhAPX6A/D showed very low expression levels at the same developmental stage. Enzyme activity and H2O2 content assays revealed that 30-day fibers maintained high APX activity and had the lowest H2O2 level, indicating that the newly reported APX members, rather than GhAPX1, are responsible for ROS homeostasis during the secondary cell wall biosynthesis and maturation stages of cotton fiber development. This work provides evolutionary and functional information on GhAPX gene family members and reveals that different GhAPX family members are responsible for redox homeostasis at different stages of cotton fiber development.

Materials and methods

Plant growth and different treatments

Cotton plants (G. hirsutum L. cv. Xuzhou 142) and the fuzzless-lintless mutant (fl) were grown in an experimental field at the Institute of Tropical Biosciences and Biotechnology in Haikou, China. Cotton bolls were labeled on the day of anthesis (defined as 0d) and then detached at different developmental stages (5d, 10d, 15d, 20d, 25d, and 30d). The 10d fibers were treated with 200 µM ethephon (ETH), 100 µM H2O2, 100 µM gibberellin (GA), 100 µM methyl jasmonate (MeJA), 100 µM brassinolide (BR), or 1 mg/L indole-3-acetic acid (IAA) for 1, 3, 6, and 12 h (Xin et al. 2016). Roots, stems, leaves, petals, anthers, ovules, and fibers were separated from each other, immediately frozen in liquid nitrogen, and then stored at −80 °C.

Identification and multiple alignment of GhAPXs

Genome data of G. hirsutum were downloaded from the Cotton Genome Project (http://cgp.genomics.org.cn/page/species/index.jsp) and the CottonGen database (https://www.cottongen.org/) as described (Li et al. 2015; Zhang et al. 2015). The APX sequences of Arabidopsis, Oryza sativa, and Theobroma cacao were used as seed (query) sequences to identify cotton APXs by local BLASTP searches with an e-value cutoff of 1e−10. The 26 GhAPXs obtained were submitted to InterProScan (http://www.ebi.ac.uk/interpro/) to confirm the presence of the APX domain (IPR002016). Multiple sequence alignment was performed using ClustalX (2.0) with default parameters (Larkin et al. 2007).
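The e-value filtering step can be illustrated with a minimal sketch. It assumes the BLASTP results were saved in the standard 12-column tabular format (subject ID in column 2, e-value in column 11) and that hits at or below the cutoff are retained; the function name and the sequence identifiers in the example are hypothetical and do not come from the original analysis.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # cutoff stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return the set of subject IDs with at least one hit at or below the cutoff.

    Assumes 12-column BLAST tabular output: column 2 is the subject ID
    and column 11 is the e-value.
    """
    candidates = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id, e_value = row[1], float(row[10])
        if e_value <= cutoff:
            candidates.add(subject_id)
    return candidates


# Hypothetical example rows (identifiers are placeholders, not real gene IDs).
example = (
    "AtAPX1\tGh_hypothetical_01\t85.0\t250\t37\t0\t1\t250\t1\t250\t1e-150\t520\n"
    "AtAPX1\tGh_hypothetical_02\t30.0\t90\t60\t3\t5\t94\t10\t99\t2e-05\t45\n"
)
print(sorted(filter_hits(example)))
```

Candidates retained by such a filter would still need the domain check (IPR002016) described above before being accepted as APX family members.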

Chromosomal location analysis and phylogenetic tree construction

MapInspect software was used to visualize the distribution of the 26 APX genes on G. hirsutum chromosomes. A phylogenetic tree was constructed using MEGA 5.1 software with the neighbor-joining method and 1000 bootstrap replicates as described (Tamura et al. 2007).

Intron–exon and motif structure analysis of APX family

The Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/index.php) was used to analyze the intron–exon structure by comparing the CDS of APX genes with their corresponding genomic sequences (Hu et al. 2015). Deduced protein sequences of GhAPXs were submitted to the Multiple EM for Motif Elicitation (MEME) program to identify conserved motifs (Bailey et al. 2006).

Syntenic and evolutionary analyses

Paralogous GhAPX gene pairs were identified based on nucleotide identities > 90%. A tandem duplication event was inferred when two closely related GhAPX genes were located within the same chromosomal region. Segmental duplication was defined on the basis of paralogous genes. The syntenic relationships of paralogues and orthologues between cotton and the closely related species cacao were analyzed using the Circos program (Krzywinski et al. 2009) based on sequence identity calculations and the phylogenetic tree.
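The identity criterion for paralogous pairs can be sketched as follows. The sketch assumes the two nucleotide sequences have already been aligned to equal length and that identity is computed over columns where neither sequence has a gap; this is one common convention and not necessarily the one used in the original analysis. Function names and example sequences are hypothetical.

```python
IDENTITY_THRESHOLD = 90.0  # percent, as stated in the text


def percent_identity(aln_a, aln_b):
    """Percent identity over aligned columns that contain no gap in either sequence."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = compared = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the calculation
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_paralogous_pair(aln_a, aln_b, threshold=IDENTITY_THRESHOLD):
    """True when identity is strictly greater than the threshold (> 90%)."""
    return percent_identity(aln_a, aln_b) > threshold


# Toy aligned sequences: 19 of 20 positions match (95% identity).
print(is_paralogous_pair("ATGGCTAAGTGCTACCCAAC", "ATGGCTAAGTGCTACCCAAT"))
```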

Evolutionary analyses were performed as previously reported (Jin et al. 2017). Briefly, Ka (nonsynonymous substitution rate) and Ks (synonymous substitution rate) were calculated with DnaSP 5.0 software. The Ka/Ks ratios of GhAPX genes were used to assess the selection pressure on duplicated genes: a Ka/Ks ratio > 1, < 1, or = 1 indicates positive selection, negative (purifying) selection, or neutral evolution, respectively. Furthermore, Tajima relative rate tests were performed to test whether GhAPX paralogues and orthologues evolved at equal rates.
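The interpretation rule for the Ka/Ks ratio can be written as a small helper. This is only a sketch of the decision rule stated above: the Ka and Ks values themselves would come from DnaSP, the numbers in the example are made up for illustration, and the tolerance used to call a ratio "equal to 1" is an assumption.

```python
import math


def classify_selection(ka, ks, tol=1e-9):
    """Return (Ka/Ks ratio, label) following the >1 / <1 / =1 rule.

    ka, ks: nonsynonymous and synonymous substitution rates.
    tol: assumed numerical tolerance for treating the ratio as exactly 1.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, abs_tol=tol):
        return ratio, "neutral"
    if ratio > 1.0:
        return ratio, "positive"
    return ratio, "negative (purifying)"


# Hypothetical values for one duplicated gene pair.
ratio, label = classify_selection(ka=0.02, ks=0.10)
print(round(ratio, 2), label)
```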

RNA extraction and qRT-PCR

Total RNA was extracted from different cotton tissues by a modified hot borate method as described (Shi et al. 2006). Five micrograms of total RNA from each tissue were used to synthesize first-strand cDNA using the SuperScript® III first-strand synthesis system for RT-PCR (Invitrogen, Carlsbad, CA, USA). Reverse transcription PCR (RT-PCR) and quantitative real-time PCR (qRT-PCR) were performed using SYBR green real-time PCR master mixes (Applied Biosystems, Foster City, CA, USA) with the specific primers listed in Table S1. To facilitate gene-specific primer design, the 5′- and 3′-UTRs of GhAPXs were obtained by genome-referenced assembly of expressed sequence tags (ESTs) as described (Jin et al. 2013). The UBQ gene was used as the internal control to adjust the amount of template cDNA for quantitative analysis (Jin et al. 2016). The relative expression level of each APX gene was used to generate a heat map using MultiExperiment Viewer (MeV, version 4.9) software.

Cis-regulatory elements analysis

The promoter sequences of GhAPXs were retrieved from a local database, and the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) was used to analyze the cis-regulatory elements in the GhAPX promoters (Lescot et al. 2002).

Determination of APX enzyme activity and H2O2 content

Fibers at different stages (5d, 10d, 15d, 20d, 25d, and 30d) were used to measure APX enzyme activity and H2O2 content as described (Li et al. 2007).

Results

Identification of the APX gene family in G. hirsutum

Several reports have indicated that GhAPX1 plays important roles during the cotton fiber initiation and elongation stages (Fig. S1) (Shi et al. 2006; Li et al. 2007; Yang et al. 2008; Zheng et al. 2014; Guo et al. 2016). To further understand the functions of the APX gene family, a genome-wide investigation of GhAPXs was performed. A total of 26 non-redundant GhAPX genes (Table S2) were identified by searching the cotton genome database and were named GhAPX1A to GhAPX13A and GhAPX1D to GhAPX13D according to their order on the chromosomes of the A and D sub-genomes, except for the previously reported GhAPX1A and GhAPX1D. Detailed information on these genes is listed in Table 1, including chromosome location, ORF length, protein length, molecular weight, and theoretical isoelectric point.

Table 1 List of APX genes in G. hirsutum

Chromosome distribution analysis showed that GhAPX1A–GhAPX5A, GhAPX7A, GhAPX8A, and GhAPX10A–GhAPX13A were anchored on eight chromosomes of the A sub-genome, while GhAPX6A and GhAPX9A were anchored on two un-assembled scaffolds, A-Chr5 scaffold1211 and A-Chr6 scaffold1353, respectively (Fig. 1a, b). In the D sub-genome, GhAPX2D–GhAPX13D were anchored on eight chromosomes, and GhAPX1D was anchored on D-Chr5 scaffold4074 (Fig. 1c, d). The 26 GhAPXs were predicted to be located in different cellular compartments, including 18 (9 ortholog pairs) in the cytoplasm, four in the periplasm, and four in the outer membrane (Table S3).

Fig. 1

Chromosome distribution of the GhAPX gene family. The 26 GhAPX genes were mapped to the chromosomes (a) and un-assembled scaffolds (b) of the A sub-genome, and to the chromosomes (c) and un-assembled scaffold (d) of the D sub-genome. Genes were named GhAPX1A–13A and GhAPX1D–13D according to their order on the chromosomes, except for the previously reported GhAPX1A/D. Different scales were used for chromosomes and un-assembled scaffolds. Note that the chromosome to which each scaffold belongs is known, although the scaffolds could not be assembled into the corresponding chromosomes

Phylogenetic and intron–exon distribution analyses of GhAPX gene family

To better understand the evolutionary and phylogenetic relationships of GhAPXs, a neighbor-joining (NJ) phylogenetic tree was constructed using protein sequences of APXs from Arabidopsis thaliana, G. hirsutum, O. sativa, Vitis vinifera, and T. cacao (Fig. 2a). The 26 GhAPXs could be classified into five well-supported clades, labeled with different colours. Based on the A. thaliana APXs (Panchuk et al. 2005), clades I and II consist of cytoplasmic APXs, clade III contains all known chloroplast APXs, and clades IV and V include peroxisomal APXs. These phylogenetic results are consistent with a previously published work (Guo et al. 2016).

Fig. 2

Phylogenetic and intron–exon structure analyses of APX family genes. a Protein sequences of APX gene families from Arabidopsis thaliana (AtAPX), G. hirsutum (GhAPX), Oryza sativa (OsAPX), Theobroma cacao (TcAPX), and Vitis vinifera (VvAPX) were used to construct an NJ phylogenetic tree. Bootstrap was set to 1000 replicates. Five clades were named as sub-family I to V, shadowed with different colours. b Intron–exon organization structure analysis of the 26 GhAPX genes was shown. Boxes and lines represented the exons and introns, and the genomic length was indicated at the bottom. (Color figure online)

The gene structures of all 26 GhAPXs were investigated to further validate the evolutionary and phylogenetic relationships of GhAPX family members. GhAPX genes belonging to the same clade in the phylogenetic tree shared similar intron–exon organization (Fig. 2b).

Syntenic and evolutionary analyses of GhAPX gene family

To investigate the expansion of the APX gene family, syntenic analysis of G. hirsutum and T. cacao APXs was performed using Circos software. Two tandem duplication events were detected in each of the A- and D-sub-genomes (GhAPX10A/11A/12A and GhAPX10D/11D/12D); however, no segmental duplication events were detected because of the high conservation of GhAPXs between the A- and D-sub-genomes (Fig. 3).

Fig. 3

Tandem duplication and segmental duplication of GhAPXs and syntenic analysis between G. hirsutum and T. cacao APXs. Chromosomes and scaffolds from cotton (Gh) and cacao (Tc) are shown as yellow and blue segments, respectively. The positions of the APX genes are marked by black lines in the circle. Duplicated GhAPXs are linked by red lines, and syntenic relationships between cotton and cacao are linked by purple lines. (Color figure online)

Evolutionary selection patterns between paralogous and/or orthologous gene pairs can be estimated by the Ka/Ks ratio (Yadav et al. 2015). A Ka/Ks ratio > 1 indicates positive selection, a Ka/Ks ratio < 1 indicates purifying selection, and a Ka/Ks ratio = 1 indicates neutral evolution. The Ka/Ks ratios of the duplicated GhAPXs indicated that they were all subject to purifying selection (Table 2). In addition, Tajima relative rate tests were performed to determine whether the GhAPX duplicates evolved at an accelerated rate following the duplication events. Notably, a statistically significant increase in evolutionary rate was found among the GhAPX10A/11A/12A duplicated paralogues, whereas no significant difference was found among GhAPX10D/11D/12D (Table 3), indicating potential functional divergence of these duplicated paralogues.
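For readers unfamiliar with the relative rate test, the following sketch implements the standard one-degree-of-freedom form of Tajima's test: for two ingroup sequences and an outgroup, m1 and m2 count the sites at which only the first or only the second ingroup sequence differs from the other two, and the statistic (m1 − m2)² / (m1 + m2) is compared with a chi-square distribution with one degree of freedom. This is a generic illustration, not the software or settings used for Table 3; the sequences in the example are artificial.

```python
import math


def tajima_relative_rate(seq_a, seq_b, outgroup):
    """Tajima's relative rate test (1 degree of freedom) on three aligned sequences.

    Returns (m1, m2, chi_square, p_value), where m1 and m2 are the numbers of
    sites with a state unique to seq_a and to seq_b, respectively.
    """
    if not (len(seq_a) == len(seq_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to equal length")
    m1 = m2 = 0
    for a, b, o in zip(seq_a, seq_b, outgroup):
        if "-" in (a, b, o):
            continue  # sites with gaps are ignored
        if a != b and b == o:
            m1 += 1  # change unique to the lineage of seq_a
        elif a != b and a == o:
            m2 += 1  # change unique to the lineage of seq_b
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi_square = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return m1, m2, chi_square, p_value


# Artificial example: seq_a carries four unique changes, seq_b none.
out = "ACGTACGTACGT"
m1, m2, chi2, p = tajima_relative_rate("TCGAACGAACGA", out, out)
print(m1, m2, chi2, round(p, 4))
```

A small p-value rejects the hypothesis that the two ingroup sequences evolved at equal rates since their divergence.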

Table 2 Ka/Ks ratios for duplicate APX genes in G. hirsutum
Table 3 Tajima relative rate tests of APX gene pairs in cotton

Conserved motif analysis of GhAPX family

Multiple sequence alignment showed that all GhAPXs have three conserved domains and a variable N-terminus (Fig. S2). Domain I contains two active sites, Domain II contains the most conserved stretch of 12 amino acids, predicted to be the heme-binding site, and Domain III has three proximal cation-binding sites. These data suggest that GhAPXs may function in different organelles, using ascorbate as a substrate to detoxify H2O2.

Conserved motifs in GhAPXs were searched with the MEME program to gain more insight into the diversity of motif composition and evolutionary relationships, and a total of ten conserved motifs were discovered. APXs belonging to the same clade share very similar motif composition and order (Fig. 4a). Motifs 1, 2, and 7 are present in all GhAPXs, indicating that they are conserved regions of GhAPXs. Most cytoplasmic GhAPXs have motifs 1–8, except for the orthologs GhAPX4A and GhAPX4D. Chloroplast GhAPXs have all motifs except motif 6, while the peroxisomal APXs have the fewest conserved motifs (1, 2, 7, and 9). Motif 10 is found only in the chloroplast-located APXs, with a most conserved amino acid sequence of GWGKPETKYTKDGPG (Fig. 4a, b). Motif 9 is located at the C-terminus of cytoplasmic GhAPX3A/D and GhAPX5A, but at the N-terminus of the peroxisome- and chloroplast-located GhAPXs.

Fig. 4

Motif analysis of GhAPXs. Conserved motifs of GhAPX protein sequences were analyzed (a). Ten different motifs were recognized and indicated with different colours. The organization order of motifs for each member of GhAPXs was highlighted. The conservation of the sequences for each conserved domain was also presented (b). (Color figure online)

Tissue- and development-specific expression profiling of GhAPX genes

To understand the expression and functional diversity of the 26 GhAPX genes, their tissue- and development-specific expression profiles were determined using qRT-PCR. Relative expression levels of the 26 GhAPX genes in eight different tissues were used to construct a heat map (Fig. 5a). All expression data were normalized using cotton UBQ as the internal control, and a gene was considered detected when its relative expression level exceeded 0.05-fold that of UBQ. Members of cytoplasmic GhAPX sub-family I were universally expressed; among them, GhAPX1A/D, GhAPX10A, GhAPX10D, GhAPX12A, and GhAPX12D were the predominantly expressed GhAPXs, with high levels in leaf, petal, and anther. Members of cytoplasmic GhAPX sub-family II were expressed in far fewer tissues: GhAPX5A/D was expressed only in leaf, and GhAPX3A/D was expressed in leaf, petal, and anther, indicating functional diversity between the two clusters of GhAPXs. None of the chloroplast GhAPXs were detected except for GhAPX6A/D. For peroxisomal GhAPX sub-families IV and V, GhAPX7A was expressed in leaf and anther, while GhAPX7D was expressed only in leaf. Semi-quantitative RT-PCR results visually confirmed these patterns (Fig. 5b).
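The normalization and detection rule can be sketched as below. The text states only that expression was normalized to UBQ and that values above 0.05-fold of UBQ were treated as detected; the 2^−ΔCt formula used here, which assumes near-100% amplification efficiency for both primer pairs, is an assumption for illustration, and the Ct values are invented.

```python
DETECTION_THRESHOLD = 0.05  # fold relative to UBQ, as stated in the text


def relative_expression(ct_target, ct_ubq):
    """Expression of a target gene relative to UBQ.

    Assumption: 2^-(Ct_target - Ct_UBQ), i.e. both amplicons double every cycle.
    """
    return 2.0 ** (ct_ubq - ct_target)


def is_detected(ct_target, ct_ubq, threshold=DETECTION_THRESHOLD):
    """True when relative expression exceeds the detection threshold."""
    return relative_expression(ct_target, ct_ubq) > threshold


# Invented Ct values: the target crosses threshold 2 cycles after UBQ.
print(relative_expression(22.0, 20.0), is_detected(22.0, 20.0))
```

With this rule, a target amplifying about 4.3 cycles or more after UBQ would fall below the 0.05-fold threshold.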

Fig. 5

Expression profiling of GhAPX genes in different cotton tissues and at different developmental stages of cotton fibers. a Heat map of the qRT-PCR data for the 26 GhAPXs in eight different cotton tissues. Orthologs with very high identities that could not be distinguished by gene-specific primers were examined together using the same primers and are indicated by slashes. b Semi-quantitative RT-PCR results of GhAPX genes corresponding to those in a. c Heat map of the qRT-PCR data for ten fiber-preferentially accumulated GhAPXs at eight developmental stages of cotton fibers. Orthologs GhAPX1A/D, GhAPX3A/D, and GhAPX6A/D were each detected using the same primers. d Semi-quantitative RT-PCR results of GhAPX genes corresponding to those in c. Relative expression levels were normalized to the internal control UBQ gene

Interestingly, ten GhAPX genes were detected in 10-day fibers (Fiber-10) and were selected for further examination of their expression patterns during cotton fiber development. Eight samples were used for qRT-PCR and RT-PCR analyses: 5-day (Fiber-5), 10-day (Fiber-10), 15-day (Fiber-15), 20-day (Fiber-20), 25-day (Fiber-25), and 30-day (Fiber-30) fibers, 10-day WT ovules (Ovule-10), and 10-day fl ovules (fl-Ovule-10). The results showed that these ten GhAPX genes displayed three distinct expression patterns according to their temporal expression during fiber development. GhAPX1A/D showed predominant, steady expression from 5 to 25 days, spanning fiber elongation and secondary cell wall biosynthesis. Notably, GhAPX6A/D and GhAPX3A/D were mainly expressed at the late stage of secondary cell wall biosynthesis (Fig. 5c, d). Meanwhile, GhAPX10A, GhAPX10D, GhAPX12A, and GhAPX12D had their highest expression levels at 30 days, the cell apoptosis stage of fiber development. The expression patterns of the 26 GhAPX genes suggest that they have multiple functions in cotton plant development, particularly potentially diverse roles in controlling H2O2 concentration at different stages of fiber development.

Determination of APX activity and H2O2 content during fiber development

To understand the relationship between APX expression and H2O2 homeostasis during fiber development, 5d, 10d, 15d, 20d, 25d, and 30d fibers were collected to measure APX activity and H2O2 content. APX activity peaked in 5d fibers and remained high, with a declining tendency, during the fast fiber elongation stages (5–15 dpa), followed by a slight increase at 20 dpa. This pattern matches the expression levels of GhAPXs well, indicating that diverse GhAPXs may function at different developmental stages (Fig. 6a). Meanwhile, H2O2 content showed an ascending trend during the fast fiber elongation stages (5–15 dpa), reached its highest concentration in 20d fibers, and then decreased gradually during the secondary cell wall synthesis stages (20–30 dpa). These results suggest that some GhAPXs that accumulate during the secondary cell wall biosynthesis and maturation stages of fiber development may be involved in H2O2 scavenging (Fig. 6b).

Fig. 6

Dynamic changes of the APX enzyme activity and H2O2 content in different stages of cotton fibers. a Dynamic changes of APX enzyme activity in different stages of cotton fibers (from 5 to 30 dpa). b Dynamic changes of H2O2 content in different stages of cotton fibers (from 5 to 30 dpa)

Analyses of cis-regulatory element and gene expression profiling of GhAPXs in response to stimulations of H2O2 and phytohormone

To further investigate the regulatory mechanism of the GhAPX gene family members, especially the duplicated paralogues, cis-elements were scanned in the promoter regions of GhAPXs (Fig. 7a–e). The 1500-bp sequence upstream of the translational start site was taken as the putative promoter region and used to analyze the distribution of cis-regulatory elements. The cis-elements identified, indicated with capital letters in different colours, included two core cis-elements, nine stress response elements, and six phytohormone response elements (Fig. 7, Table S4), implying that GhAPX gene expression is regulated in response to stress and phytohormones. Notably, similar cis-element distribution patterns were found in the promoter regions of duplicated paralogues of commonly or highly expressed GhAPX genes, while the duplicated paralogues of weakly expressed or non-expressed GhAPX genes displayed different, irregular distributions.
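Extracting the 1500-bp putative promoter can be sketched as follows. The coordinate convention (0-based position of the first base of the start codon, with minus-strand genes returned as the reverse complement so that the promoter reads 5′ to 3′ toward the ATG) and all names are assumptions for illustration; the toy sequences are not cotton sequences.

```python
PROMOTER_LENGTH = 1500  # bp upstream of the translational start site, as in the text

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_region(chrom_seq, atg_pos, strand, length=PROMOTER_LENGTH):
    """Return up to `length` bp upstream of the start codon.

    atg_pos: 0-based plus-strand index of the base corresponding to the 'A'
             of the ATG (for '-' genes this is the highest coordinate of the codon).
    strand:  '+' or '-'.
    The region is truncated if the gene lies near the end of the sequence.
    """
    if strand == "+":
        start = max(0, atg_pos - length)
        return chrom_seq[start:atg_pos]
    if strand == "-":
        end = min(len(chrom_seq), atg_pos + 1 + length)
        return reverse_complement(chrom_seq[atg_pos + 1:end])
    raise ValueError("strand must be '+' or '-'")


# Toy example: plus-strand gene whose ATG starts at index 5.
print(promoter_region("GGGGGATGCCCC", 5, "+", length=3))
```

The extracted sequences would then be submitted to a cis-element database such as PlantCARE, as described in the methods.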

Fig. 7

Cis-regulatory element analysis and expression profiling of GhAPXs in response to plant hormones. a–e The putative cis-regulatory elements of GhAPXs were labeled with capital letters in the figure according to their relative position. The green letters stand for stress response cis-elements, the blue letters stand for plant hormone response cis-elements, and the red letters stand for transcription initiation cis-elements. More details about cis-regulatory elements are shown in Table S4. f Heat map of the qRT-PCR data for ten fiber-preferentially accumulated GhAPXs in 10-day fibers treated with different plant hormones for different numbers of hours. Orthologs GhAPX1A/D, GhAPX3A/D, and GhAPX6A/D were each detected using the same primers. Relative expression levels were normalized to the internal control UBQ gene. (Color figure online)

H2O2 and phytohormones have important functions in cotton fiber development: PKc enzyme activity is positively related to H2O2 content and negatively correlated with fast fiber elongation in cotton, and phytohormones such as auxins, ethylene, and brassinosteroids are involved in the regulation of fiber development (Shi et al. 2006; Pang et al. 2010; Chen and Guan 2011; Zhang et al. 2011, 2016). The expression patterns of GhAPXs in cotton fibers under oxidative stress and phytohormone stimulation were determined by treating 10-day WT fibers with H2O2, ETH, IAA, BR, JA, and GA for 1, 3, 6, and 12 h. The results indicated that GhAPX1A/D, GhAPX6A/D, GhAPX12A, and GhAPX12D responded positively to ethylene stimulation. GhAPX6A/D, GhAPX10A, and GhAPX12A displayed induced expression after GA treatment, while expression of GhAPX10A, GhAPX10D, and GhAPX12A increased significantly after JA stimulation. No changes were detected after IAA and BR treatments. Remarkably, all the fiber-expressed GhAPX members except GhAPX3A/D showed significantly induced expression after H2O2 treatment (Fig. 7f). These results suggest that GhAPX genes may perform multiple functions in H2O2- and phytohormone-regulated cotton fiber development. In addition, duplicated paralogues displayed different response characteristics, implying functional and regulatory diversity (Fig. 7f).

Discussion

DNA sequencing data of the cotton genome provide valuable information on gene families in Gossypium for further understanding gene function and regulatory mechanisms (Yao et al. 2012). Ascorbate peroxidase is known as the key enzyme detoxifying H2O2 and plays vital roles in plant growth, development, and stress responses (Fryer et al. 2003; Davletova et al. 2005). However, the diverse functions of GhAPX members remain unclear, especially in cotton fiber development. Thus, to comprehensively understand the various roles and regulatory mechanisms of GhAPXs, and building on our previous study of GhAPX1, we present here a complete overview of the GhAPX family in G. hirsutum together with its expression profiles. In total, 26 APX genes were identified in the complete genome of G. hirsutum, located on 8 chromosomes of the A- or D-sub-genome (Fig. 1; Table 1), whereas only 8, 8, 7, and 7 APX genes were characterized in Arabidopsis, O. sativa, T. cacao, and V. vinifera, respectively (Panchuk et al. 2005; Teixeira et al. 2006).

Phylogenetic analysis of 74 reported APX members from different plant species showed that APXs can be classified into four clades with different putative subcellular locations (Teixeira et al. 2004). However, the 26 cotton GhAPXs reported here were divided into five sub-families with different putative subcellular distributions according to the orthologous APX genes from A. thaliana (Fig. 2a). Intron–exon structure analysis was highly consistent with the phylogenetic classification (Fig. 2b). Different APX sub-families shared lower sequence identities, indicating significant diversity in the origin and function of the GhAPX gene family.

The alignment of 74 reported plant APX sequences revealed two signatures in plant chloroplast isoforms: 7 residues next to the active site (K-[ND]-I-[ETK]-E-W-P) and 16 residues near the heme-binding site (E-T-K-Y-T-[KE]-[DNTE]-G-PG-[ANEK]-[PA]-G-G-Q-S) (Teixeira et al. 2004). We found that all GhAPX proteins contain three conserved domains (Fig. S2). A study of the APX gene family in O. sativa showed that a sequence of gene duplications led to the current diversity of isoforms (Teixeira et al. 2004), suggesting that these unique motifs may be responsible for the diverse functions of different isoforms. The specific exons I and II of the chloroplast APX genes SlApx7 and SlApx6 encode the organellar targeting sequences of the proteins (Najami et al. 2008). We therefore infer that the particular conserved motifs in chloroplast and cytoplasmic GhAPXs, as well as the different intron–exon structures (Fig. 2 and Table S3), may play specific roles in organelle targeting.

Gene expression patterns are usually closely related to gene function, and analyses of differential expression profiles can provide important information about gene families (Guo et al. 2008). Eight APX members, comprising three cytosolic, two chloroplastic, and three microsomal isoforms, were characterized in A. thaliana; APX1 and APX3 appeared to be highly expressed, while APX2 and APX5 were expressed at low levels in leaves of different ages (Panchuk et al. 2005). A KO-APX1 experiment in A. thaliana showed that cytosolic APX1 plays an important role in protecting the chloroplast from H2O2 damage, and that stromal/mitochondrial APX can be the first line of chloroplast defense against the diffusion of H2O2 from the cytosol into the chloroplast (Davletova et al. 2005). In Solanum lycopersicum, dominant expression of SlAPX6 in leaves and SlAPX7 in stems was observed (Najami et al. 2008). Members of the GhAPX family showed different expression profiles in root, leaf, anther, and fiber, and ten GhAPX genes showed fiber-preferential expression with distinct patterns according to their accumulation at different stages of fiber development (Fig. 5). GhAPX1 has been shown to be highly up-regulated during the fast fiber elongation stages (Li et al. 2007; Yang et al. 2008; Zheng et al. 2014; Guo et al. 2016), which is consistent with the current result that GhAPX1A/D are mainly expressed during the elongation and secondary cell wall biosynthesis stages of fiber development. Interestingly, during the secondary cell wall biosynthesis and maturation stages, the four genes GhAPX10A/D and GhAPX12A/D were the most enriched, while H2O2 content tended to decline (Fig. 6), suggesting that these four GhAPXs may be the major members controlling intracellular H2O2 levels during the maturation stages of fiber development.

H2O2 and phytohormones are key factors in regulating fiber development (Li et al. 2007; Triplett et al. 2007). Many genes have been shown to perform essential functions in fiber development by responding to phytohormones: cotton CesA genes were reported to increase fiber number per seed after auxin and gibberellin treatments (Triplett et al. 2007), gibberellin induced significant expression of the cotton KCS gene in cotton fibers (Xiao et al. 2016), and transcription of the cotton AOCs increased after JA treatment (Wang et al. 2015). Our previous work showed that GhAPX1 is involved in the response to ethylene and H2O2 stimulation (Li et al. 2007). The current investigation of APX expression profiles in response to phytohormone and H2O2 treatments indicated that the expression of fiber-expressed GhAPXs is controlled by ethylene, GA, and JA. Notably, the fiber-expressed GhAPXs except GhAPX3A/D responded to H2O2 stimulation, suggesting that these GhAPXs may participate in redox homeostasis. The distribution of cis-elements in the promoter regions of GhAPX genes suggests a potential mechanism by which APXs respond to H2O2 and phytohormones (Fig. 7). In summary, through a genome-wide survey and expression profiling of the GhAPX gene family, we provide new insights into the control of H2O2 homeostasis during fiber development, which is determined by the ten fiber-preferentially accumulated GhAPXs.

In conclusion, we performed a thorough investigation of the upland cotton GhAPX gene family. The evolutionary analyses suggested a significant increase in evolutionary rate among the A-sub-genome duplicated paralogues GhAPX10A/11A/12A, but no significant change among GhAPX10D/11D/12D. Tissue- and development-specific expression profiling of GhAPX genes revealed that 10 members were expressed in cotton fiber; GhAPX10A, GhAPX10D, GhAPX12A, and GhAPX12D showed high expression levels in 30-day fibers, while GhAPX1A/D, GhAPX3A/D, and GhAPX6A/D showed relatively low expression levels. Together with the APX enzyme activity and H2O2 content assays, we demonstrated that different GhAPX family members are responsible for redox homeostasis at different stages of cotton fiber development.