Abstract
Identification of the sites of post-translational modifications (PTMs) in protein, RNA, and DNA sequences is currently a very active research topic, because the information thus obtained is very useful both for an in-depth understanding of biological processes at the cellular level and for developing effective drugs against major diseases, including cancers. Although PTM sites can be determined by various experimental techniques, doing so purely by experiment is both time-consuming and costly. With the avalanche of biological sequences generated in the post-genomic age, it is highly desirable to develop bioinformatics tools for rapidly and effectively identifying PTM sites. In the last few years, many efforts have been made in this regard, and considerable progress has been achieved. This review focuses on prediction methods that have the following two features. (1) They have been developed by strictly observing the 5-steps rule, so that each has a user-friendly web-server through which the majority of experimental scientists can easily get their desired data without needing to go through the detailed mathematics involved. (2) Their cornerstones are based on Pseudo Amino Acid Composition (PseAAC) or Pseudo K-tuple Nucleotide Composition (PseKNC), and hence the prediction quality is generally higher than that of most other PTM prediction methods.
Introduction
Post-translational modification, or PTM, means the covalent and generally enzymatic modification of proteins right after they are biosynthesized. After being synthesized by ribosomes, proteins may undergo PTM to form the mature protein products. PTMs can occur on the amino acid side chains of a protein or at its C- or N-terminus. They can covalently modify the existing functional group of an amino acid and convert it into a different functional group. Therefore, the chemical repertoire of the 20 standard amino acids can be considerably extended via the process of PTMs.
According to their occurrence in three different types of biological sequences, PTMs can be classified into the following three categories: (1) PTLM (post-translational modification) in proteins, (2) PTCM (post-transcriptional modification) in RNA, and (3) PTRM (post-replication modification) in DNA. PTMs play a key role in providing bio-macromolecules with structural and functional diversity, as well as in regulating cellular plasticity and dynamics. Meanwhile, PTMs are also closely associated with many major diseases including cancer, Alzheimer’s, and Parkinson’s. Therefore, identifying the PTM sites in biological sequences is very important for both basic research and drug development.
Historical Reflection
Before going on, it is illuminating to make a historical reflection. For quite a long period of time, the information derived by computational approaches was not trusted very much by most experimental scientists because of the notorious local minimum problem (Chou and Zhang 1995). They trusted only the results determined by experiments, and thought that computational results were not reliable unless they had been confirmed by experiments. This situation has changed during the last two decades or so, owing to the rapid development of structural bioinformatics and sequential bioinformatics. For the 3D structures of proteins, what experimental scientists trusted most were those determined by X-ray crystallography. Unfortunately, crystallography is time-consuming and expensive, and not all proteins can be successfully crystallized. Membrane proteins are difficult to crystallize, and most of them will not dissolve in normal solvents. Accordingly, very few membrane protein structures have been determined so far. NMR is indeed a very powerful tool for determining the 3D structures of membrane proteins (see, e.g., Chou et al. 1998, 1999, 2001; Oxenoid et al. 2016; Dev et al. 2016; Schnell and Chou 2008; Berardi et al. 2011; OuYang et al. 2013; Wang et al. 2009a; Fu et al. 2016; Oxenoid and Chou 2005; Call et al. 2006, 2010; Gagnon et al. 2010; Bruschweiler et al. 2015; Cao et al. 2017; Piai et al. 2017; Pan et al. 2019a), but it is also time-consuming and costly. In order to acquire structural information in a timely manner, a series of 3D protein structures have been developed by means of structural bioinformatics tools (see, e.g., Chou et al. 1997, 2000; Chou 2004a, b, c, 2005a, b, c; Chou and Howe 2002; Wang and Du 2007; Wang et al. 2009b; Li et al. 2011; Ma et al. 2012), and they have been found very useful in conducting mutagenesis studies (Chou 2004d) for rational drug design.
Meanwhile, facing the explosive growth of biological sequences discovered in the post-genomic age, and in order to use them for drug development in a timely manner, a lot of important sequence-based information has been deduced by various sequential bioinformatics tools such as the PseAAC approach (Chou 2001, 2005d, e) and the PseKNC approach (Chen et al. 2014a, 2015; Guo et al. 2014). Indeed, the rapid development of sequential bioinformatics and structural bioinformatics has driven medicinal chemistry to undergo an unprecedented revolution (Chou 2015), in which computational biology has played an increasingly important role in stimulating the discovery of novel drugs (Zhong and Zhou 2014, 2016, 2017).
In the last few years, many bioinformatics tools have been developed for predicting the PTM sites in biological sequences (Chou 2015; Xie et al. 2013; Xu and Ding 2013; Xu et al. 2013, 2014a, b; Jia et al. 2014, 2016a, b, c, d; Qiu et al. 2014, 2015, 2016a, b, c, 2017a, b, c, 2018a; Zhang et al. 2014a; Chen et al. 2015b, 2016, 2018a, b; Ju et al. 2016; Liu et al. 2016; Xu 2016; Feng et al. 2017; Ju and He 2017a; Liu and Xu 2017; Xu and Li 2017; Akbar and Hayat 2018; Chandra et al. 2018; Ghauri et al. 2018; Ju and Wang 2018; Khan et al. 2018a, b; Sabooh et al. 2018; Hussain et al. 2019a; Li et al. 2019; Wang et al. 2019; Ning et al. 2019; Ehsan et al. 2019) in compliance with Chou’s 5-steps rule (Chou 2011), which goes through the following five procedures: (1) how to select or construct a valid benchmark dataset to train and test the predictor; (2) how to represent the samples with an effective formulation that can truly reflect their intrinsic correlation with the target to be predicted; (3) how to introduce or develop a powerful algorithm to conduct the prediction; (4) how to properly perform cross-validation tests to objectively evaluate the anticipated prediction accuracy; (5) how to establish a user-friendly web-server for the predictor that is accessible to the public.
Many prediction methods reported in refs. (Xu and Ding 2013; Xu et al. 2013, 2014a, b; Qiu et al. 2014, 2015, 2016b, c, 2017a, b, c, 2018a; Chen et al. 2012, 2013, 2014b, c, 2015b, 2016a, b, c, 2017, 2018a, b, c, d; Jia et al. 2015, 2016e, 2016; Liu 2015a, 2016a, b, c, d; Feng et al. 2017; Liu and Xu 2017; Xu and Li 2017; Ghauri et al. 2018; Khan et al. 2018a, b; Hussain et al. 2019a, b; Ning et al. 2019; Ehsan et al. 2019; Min and Xiao 2013; Liu et al. 2014a, b, 2015b, c, 2016b, c, 2017, 2018a, b, c; Xiao et al. 2013a, 2015; Ding et al. 2014; Fan et al. 2014; Lin et al. 2014; Qiu and Xiao 2014; Xu et al. 2015; Liu and Long 2016; Xiao et al. 2016, 2017, 2018a, b, c; Zhang et al. 2016, 2018, 2019; Cheng and Xiao 2017a, b, 2018a, b, c, d, e; Cheng et al. 2017a, b, c; Liu and Yang 2017; Chou et al. 2018; Ehsan et al. 2018; Li et al. 2018a, b, c; Song et al. 2018a, b, c; Su et al. 2018; Wang et al. 2018a, b; Yang et al. 2018; Jia et al. 2019; Khan et al. 2019a, b; Lu et al. 2019a, b; Chou 2019; Awais et al. 2019; Niu et al. 2019) have been presented by strictly observing Chou’s 5-steps rule (Chou 2011), and share the following notable merits: (1) crystal clear in logic development, (2) completely transparent in operation, (3) quite easy for others to repeat the reported results, (4) holding high potential for stimulating other sequence-analyzing methods, and (5) very convenient for use by a broad range of experimental scientists.
Therefore, the current review focuses only on those PTM prediction methods that were developed through Chou’s 5-steps rule (Chou 2011). As for the importance of the 5-steps rule and how to use it in developing new predictors for proteome and genome analyses, see the Wikipedia article at https://en.wikipedia.org/wiki/5-step_rules.
Besides, with the avalanche of biological sequences in the post-genomic era, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector while still retaining much of its sequence-order information or key pattern characteristics. This is because all the existing machine-learning algorithms [such as the “Optimization” algorithm (Zhang 1992), the “Covariance Discriminant” or “CD” algorithm (Chou and Elrod 2002; Chou and Cai 2003), the “Nearest Neighbor” or “NN” algorithm (Hu et al. 2011), and the “Support Vector Machine” or “SVM” algorithm (Hu et al. 2011; Cai et al. 2006)] can only handle vectors, as elaborated in a comprehensive review (Chou 2015).
However, a vector defined in a discrete model may completely lose all the sequence-pattern information. To avoid completely losing the sequence-pattern information for proteins, the pseudo amino acid composition (Chou 2001) or PseAAC (Chou 2005d) was proposed. Ever since the concept of Chou’s PseAAC was proposed, it has been widely used in nearly all the areas of computational proteomics (see, e.g., Xie et al. 2013; Jia et al. 2014; Zhang et al. 2014a; Ju et al. 2016; Ju and He 2017a; Akbar and Hayat 2018; Ghauri et al. 2018; Ju and Wang 2018; Sabooh et al. 2018; Hussain et al. 2019a, b; Wang et al. 2019; Ning et al. 2019; Ehsan et al. 2019; Zhou et al. 2007; Ding and Zhang 2008; Fang et al. 2008; Jiang et al. 2008a, b; Li and Li 2008; Lin 2008; Lin et al. 2008; Nanni and Lumini 2008; Zhang and Fang 2008; Zhang et al. 2008; Zhang et al. 2008; Zhang et al. 2008; Chen et al. 2009, 2012; Ding et al. 2009; Georgiou et al. 2009; Li et al. 2009, 2012, 2014; Lin et al. 2009; Qiu et al. 2009, 2010, 2011, 2017d; Zeng et al. 2009; Esmaeili et al. 2010; Gu et al. 2010; Mohabatkar 2010; Sahu and Panda 2010; Yu et al. 2010; Guo et al. 2011; Lin and Wang 2011; Mohabatkar et al. 2011; Mohammad et al. 2011; Zou et al. 2011; Cao et al. 2012, 2013; Du et al. 2012; Fan and Li 2012a, b; Hayat and Khan 2012; Liao et al. 2012; Liu et al. 2012, 2013, 2015d, e; Mei 2012a, b; Nanni et al. 2012a, b, 2014; Niu et al. 2012; Qin et al. 2012; Ren et al. 2012; Sun et al. 2012; Zhao et al. 2012a, b; Zia-ur-Rehman 2012; Chang et al. 2013; Chen and Li 2013; Fan et al. 2013; Fan and Li 2013; Georgiou et al. 2013; Gupta et al. 2013; Huang and Yuan 2013a, b, c; Khosravian et al. 2013; Lin et al. 2013; Mohabatkar et al. 2013; Pacharawongsakda and Theeramunkong 2013; Qin et al. 2013; Sarangi et al. 2013; Wan et al. 2013; Wang et al. 2013; Xiaohui et al. 2013; Du et al. 2014; Hajisharifi et al. 2014; Han et al. 2014; Hayat and Iqbal 2014; Kong et al. 2014; Mondal and Pai 2014; Zhang et al. 2014b, c, 2015; Zuo et al. 2014; Ahmad et al. 2015, 2016; Ali and Hayat 2015; Dehzangi et al. 2015; Fan et al. 2015; Huang and Yuan 2015; Khan et al. 2015; Kumar et al. 2015; Mandal et al. 2015; Sanchez et al. 2015; Sharma et al. 2015; Wang et al. 2015; Zhang 2015; Behbahani et al. 2016; Fan et al. 2016; Jiao and Du 2016; Kabir and Hayat 2016; Tahir and Hayat 2016; Tang et al. 2016; Tiwari 2016; Xu et al. 2016; Zou and Xiao 2016a, b; Meher et al. 2017; Huo et al. 2017; Jiao and Du 2017; Ju and He 2017b; Khan et al. 2017; Liang and Zhang 2017, 2018; Rahimi et al. 2017; Tahir et al. 2017; Tripathi and Pandey 2017; Xu et al. 2017; Yu et al. 2017a, b; Ahmad and Hayat 2018; Al Maruf and Shatabda 2018; Arif et al. 2018; Butt et al. 2018, 2019; Contreras-Torres 2018; Cui et al. 2018; Fu et al. 2018; Javed and Hayat 2018; Krishnan 2018; Mei et al. 2018; Mei and Zhao 2018a, b; Mousavizadegan and Mohabatkar 2018; Qiu et al. 2018b; Rahman et al. 2018; Sankari and Manimegalai 2018; Srivastava et al. 2018; Tahir et al. 2019a, b; Zhang and Kong 2018, 2019; Zhang and Duan 2018; Zhang and Liang 2018; Zhang et al. 2018; Zhao et al. 2018; Adilina et al. 2019; Ahmad and Hayat 2019; Chen et al. 2019; Kabir et al. 2019; Le et al. 2019; Pan et al. 2019b; Shen et al. 2019; Tian et al. 2019) as well as a long list of references cited in Chou (2017).
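To make the point about lost sequence-order information concrete, the following is a minimal Python sketch and not the implementation used by any of the cited tools. It compares the plain 20-dimensional amino acid composition with a PseAAC-style vector of 20 + λ components, in which λ sequence-order correlation factors are appended. Several choices here are assumptions made only for illustration: a single physicochemical property (the Kyte-Doolittle hydropathy scale, used as a stand-in), the function names, and the default values `lam=2` and `weight=0.05`.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values; an illustrative stand-in property scale,
# not necessarily the properties used by the cited PseAAC tools.
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}

# Standardize the scale to zero mean and unit standard deviation
# over the 20 standard amino acids.
_values = list(HYDROPATHY.values())
_mu, _sd = mean(_values), pstdev(_values)
STANDARDIZED = {aa: (v - _mu) / _sd for aa, v in HYDROPATHY.items()}


def amino_acid_composition(seq):
    """Return the 20 occurrence frequencies; sequence order is ignored."""
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def order_correlation(seq, lag):
    """Average squared property difference between residues `lag` apart."""
    diffs = [(STANDARDIZED[b] - STANDARDIZED[a]) ** 2
             for a, b in zip(seq, seq[lag:])]
    return sum(diffs) / len(diffs)


def pse_aac(seq, lam=2, weight=0.05):
    """Return a (20 + lam)-dimensional PseAAC-style feature vector."""
    if lam >= len(seq):
        raise ValueError("lam must be smaller than the sequence length")
    freqs = amino_acid_composition(seq)
    thetas = [order_correlation(seq, j) for j in range(1, lam + 1)]
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

For example, the hypothetical peptides `"AKAKAK"` and `"AAAKKK"` have identical amino acid compositions, so the plain composition cannot tell them apart, whereas their `pse_aac` vectors differ in the appended correlation components. The sketch assumes sequences made up only of the 20 standard one-letter codes.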
Because it has been widely and increasingly used, four powerful open-access software packages, called ‘PseAAC’ (Shen 2008), ‘PseAAC-Builder’ (Du et al. 2012), ‘propy’ (Cao et al. 2013), and ‘PseAAC-General’ (Du et al. 2014), were established: the former three are for generating various modes of Chou’s special PseAAC (Chou 2009), while the fourth one is for those of Chou’s general PseAAC (Chou 2011), including not only all the special modes of feature vectors for proteins but also the higher level feature vectors such as the “Functional Domain” mode (see Eqs. 9–10 of Chou 2011), the “Gene Ontology” mode (see Eqs. 11–12 of Chou 2011), and the “Sequential Evolution” or “PSSM” mode (see Eqs. 13–14 of Chou 2011).
Meanwhile, the idea of PseAAC was extended to generate various modes of feature vectors for DNA and RNA sequences (Chen et al. 2014a, 2015a; Guo et al. 2014; Chen and Lin 2015; Liu et al. 2015f, g, 2016d; Liu and Wu 2017), and it has proved very useful as well.
Predictors for Identifying PTM or PTLM Sites in Protein Sequences
Listed in Table 1 are 16 predictors for identifying the PTM sites in protein sequences (Xu and Ding 2013; Xu et al. 2013, 2014a, b; Qiu et al. 2014, 2015, 2016a, b, c, 2017; Jia et al. 2016a, b, c, d; Liu and Xu 2017; Xu and Li 2017). Of the 16 predictors, iPTM-mLys (Qiu et al. 2016b) has the capacity to identify multiple lysine PTM sites and their different types. Therefore, the following two sets of metrics are needed to measure its performance or accuracy (Chou 2019).
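One is for its global accuracy, as given by
\[ \left\{ \begin{aligned} {\text{Aiming}} \uparrow &= \frac{1}{{{\text{N}}^{\text{q}} }}\sum\limits_{k = 1}^{{{\text{N}}^{\text{q}} }} {\left( {\frac{{\left\| {{\mathbb{L}}_{k} \cap {\mathbb{L}}_{k}^{*} } \right\|}}{{\left\| {{\mathbb{L}}_{k}^{*} } \right\|}}} \right)} \\ {\text{Coverage}} \uparrow &= \frac{1}{{{\text{N}}^{\text{q}} }}\sum\limits_{k = 1}^{{{\text{N}}^{\text{q}} }} {\left( {\frac{{\left\| {{\mathbb{L}}_{k} \cap {\mathbb{L}}_{k}^{*} } \right\|}}{{\left\| {{\mathbb{L}}_{k} } \right\|}}} \right)} \\ {\text{Accuracy}} \uparrow &= \frac{1}{{{\text{N}}^{\text{q}} }}\sum\limits_{k = 1}^{{{\text{N}}^{\text{q}} }} {\left( {\frac{{\left\| {{\mathbb{L}}_{k} \cap {\mathbb{L}}_{k}^{*} } \right\|}}{{\left\| {{\mathbb{L}}_{k} \cup {\mathbb{L}}_{k}^{*} } \right\|}}} \right)} \\ {\text{Absolute true}} \uparrow &= \frac{1}{{{\text{N}}^{\text{q}} }}\sum\limits_{k = 1}^{{{\text{N}}^{\text{q}} }} {\Delta \left( {{\mathbb{L}}_{k} ,{\mathbb{L}}_{k}^{*} } \right)} \\ {\text{Absolute false}} \downarrow &= \frac{1}{{{\text{N}}^{\text{q}} }}\sum\limits_{k = 1}^{{{\text{N}}^{\text{q}} }} {\left( {\frac{{\left\| {{\mathbb{L}}_{k} \cup {\mathbb{L}}_{k}^{*} } \right\| - \left\| {{\mathbb{L}}_{k} \cap {\mathbb{L}}_{k}^{*} } \right\|}}{M}} \right)} \end{aligned} \right. \tag{1} \]
where \( {\text{N}}^{\text{q}} \) is the total number of query or tested samples, M is the total number of different labels for the investigated system, || || means the operator acting on the set therein to count the number of its elements, \( \cup \) means the symbol for the “union” in the set theory, \( \cap \) denotes the symbol for the “intersection”, \( {\mathbb{L}}_{k} \) denotes the subset that contains all the labels observed by experiments for the k-th tested sample, \( {\mathbb{L}}_{k}^{*} \) represents the subset that contains all the labels predicted for the k-th sample, and
\[ \Delta \left( {{\mathbb{L}}_{k} ,{\mathbb{L}}_{k}^{*} } \right) = \begin{cases} 1, & {\text{if all the labels in }}{\mathbb{L}}_{k}^{*} {\text{ are identical to those in }}{\mathbb{L}}_{k} \\ 0, & {\text{otherwise}} \end{cases} \tag{2} \]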
In Eq. 1, the first four metrics with an upper arrow \( \uparrow \) are called positive metrics, meaning that the larger the rate is, the better the prediction quality will be; the fifth metric with a down arrow \( \downarrow \) is called a negative metric, implying just the opposite meaning. As we can see from Eq. 1: (1) the “Aiming” defined by the 1st sub-equation is for checking the rate or percentage of the correctly predicted labels over the practically predicted labels; (2) the “Coverage” defined in the 2nd sub-equation is for checking the rate of the correctly predicted labels over the actual labels in the system concerned; (3) the “Accuracy” in the 3rd sub-equation is for checking the average ratio of correctly predicted labels over the total labels, including the correctly and incorrectly predicted labels as well as those real labels that are missed in the prediction; (4) the “Absolute true” in the 4th sub-equation is for checking the ratio of the perfectly or completely correct prediction events over the total prediction events; (5) the “Absolute false” in the 5th sub-equation is for checking the ratio of the completely wrong predictions over the total prediction events.
The five metrics in Eq. 1 reflect the quality of a multi-label predictor from five different angles at the global level. It is instructive to point out, however, that among the five global metrics the most important one, and also the one whose success rate is most difficult to improve, is the “Absolute true” or “perfectly correct” rate (Chou 2013). Why? Because the scoring standard for the absolute true rate is very harsh. Consider a statistical sample that actually has the three states (“A”, “B”, “C”) simultaneously. If the predicted result is not exactly these three states but (“A”, “B”) or (“A”, “B”, “C”, “D”), no score whatsoever will be given. In other words, when and only when the predicted outcome for the statistical sample is perfectly identical to its actual status can we add one point for the absolute true rate; otherwise, zero. That is why many investigators even chose not to mention the metric of absolute true rate; otherwise they would face the embarrassment of reporting a very low success rate for their prediction methods.
The set of metrics in Eq. 1 is used to evaluate the prediction quality of a multi-label predictor for all the samples in the entire system concerned (Chou 2019), and hence is called the “set of metrics for the global accuracy” or the “set of global metrics”.
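The five global metrics described above can be sketched in a few lines of Python. This is a hedged illustration rather than code from any of the reviewed predictors: the function name `global_metrics`, the use of Python sets for the observed and predicted label subsets, and the convention of scoring a ratio as zero when its denominator set is empty are all assumptions. The “Absolute false” term is computed here as the number of labels in the union but not in the intersection, divided by M, following the definition in Chou (2013).

```python
def global_metrics(true_sets, pred_sets, num_labels):
    """Compute the five global multi-label metrics.

    true_sets  -- one collection of experimentally observed labels per sample
    pred_sets  -- one collection of predicted labels per sample
    num_labels -- M, the total number of different labels in the system
    """
    n = len(true_sets)
    if n == 0 or n != len(pred_sets):
        raise ValueError("need the same non-zero number of samples in both lists")

    aiming = coverage = accuracy = abs_true = abs_false = 0.0
    for actual, predicted in zip(true_sets, pred_sets):
        actual, predicted = set(actual), set(predicted)
        hit = len(actual & predicted)      # correctly predicted labels
        union = len(actual | predicted)    # correct + wrong + missed labels
        # Assumed convention: a ratio with an empty denominator counts as 0.
        aiming += hit / len(predicted) if predicted else 0.0
        coverage += hit / len(actual) if actual else 0.0
        accuracy += hit / union if union else 0.0
        # One point only when prediction and observation match exactly.
        abs_true += 1.0 if actual == predicted else 0.0
        abs_false += (union - hit) / num_labels

    return {
        "aiming": aiming / n,
        "coverage": coverage / n,
        "accuracy": accuracy / n,
        "absolute_true": abs_true / n,
        "absolute_false": abs_false / n,
    }
```

As a usage example, take three test samples that all truly carry the states {"A", "B", "C"} and are predicted as {"A", "B"}, {"A", "B", "C", "D"}, and {"A", "B", "C"}, with M = 4. Only the third prediction earns a point toward the absolute true rate, which therefore equals 1/3, even though the aiming and coverage rates are both well above 0.8.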
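The other one is for its local accuracy, as given by
\[ \left\{ \begin{aligned} {\text{Sn}}\left( i \right) &= 1 - \frac{{N_{ - }^{ + } \left( i \right)}}{{N^{ + } \left( i \right)}} \\ {\text{Sp}}\left( i \right) &= 1 - \frac{{N_{ + }^{ - } \left( i \right)}}{{N^{ - } \left( i \right)}} \\ {\text{Acc}}\left( i \right) &= 1 - \frac{{N_{ - }^{ + } \left( i \right) + N_{ + }^{ - } \left( i \right)}}{{N^{ + } \left( i \right) + N^{ - } \left( i \right)}} \\ {\text{MCC}}\left( i \right) &= \frac{{1 - \left( {\frac{{N_{ - }^{ + } \left( i \right)}}{{N^{ + } \left( i \right)}} + \frac{{N_{ + }^{ - } \left( i \right)}}{{N^{ - } \left( i \right)}}} \right)}}{{\sqrt {\left( {1 + \frac{{N_{ + }^{ - } \left( i \right) - N_{ - }^{ + } \left( i \right)}}{{N^{ + } \left( i \right)}}} \right)\left( {1 + \frac{{N_{ - }^{ + } \left( i \right) - N_{ + }^{ - } \left( i \right)}}{{N^{ - } \left( i \right)}}} \right)} }} \end{aligned} \right. \quad \left( {i = 1,2, \ldots ,M} \right) \tag{3} \]
where Sn, Sp, Acc, and MCC represent the sensitivity, specificity, accuracy, and Matthews correlation coefficient, respectively (Chen et al. 2007), i denotes the i-th subset or subcellular location (Chou 2019) in the benchmark dataset, and M has exactly the same meaning as in Eq. 1. \( N^{ + } \left( i \right) \) is the total number of the samples investigated in the i-th subset, whereas \( N_{ - }^{ + } \left( i \right) \) is the number of the samples in \( N^{ + } \left( i \right) \) that are incorrectly predicted to be of other subsets or locations; \( N^{ - } \left( i \right) \) is the total number of samples in any subset except for the i-th subset, whereas \( N_{ + }^{ - } \left( i \right) \) is the number of the samples in \( N^{ - } \left( i \right) \) that are incorrectly predicted to be in the i-th subset.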
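The four local metrics can be computed directly from the four counts defined above. The following Python sketch is illustrative only; the function and argument names are hypothetical, and returning an MCC of 0 when the denominator vanishes is an assumed convention rather than part of the original definition. The argument `pos_missed` stands for \( N_{ - }^{ + } \left( i \right) \) and `neg_wrong` for \( N_{ + }^{ - } \left( i \right) \).

```python
import math


def local_metrics(n_pos, n_neg, pos_missed, neg_wrong):
    """Sn, Sp, Acc and MCC for one subset i, in the intuitive count notation.

    n_pos      -- N+(i): samples that belong to the i-th subset
    n_neg      -- N-(i): samples that belong to any other subset
    pos_missed -- samples of N+(i) wrongly predicted as other subsets
    neg_wrong  -- samples of N-(i) wrongly predicted as the i-th subset
    """
    if n_pos <= 0 or n_neg <= 0:
        raise ValueError("both the positive and negative subsets must be non-empty")

    miss_rate = pos_missed / n_pos
    false_rate = neg_wrong / n_neg

    sn = 1.0 - miss_rate
    sp = 1.0 - false_rate
    acc = 1.0 - (pos_missed + neg_wrong) / (n_pos + n_neg)

    denom = math.sqrt(
        (1.0 + (neg_wrong - pos_missed) / n_pos)
        * (1.0 + (pos_missed - neg_wrong) / n_neg)
    )
    # Assumed convention: MCC is reported as 0 when the denominator is zero.
    mcc = (1.0 - (miss_rate + false_rate)) / denom if denom > 0 else 0.0

    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}
```

With `pos_missed = 0` and `neg_wrong = 0` every metric equals 1, which matches the intuitive reading of the notation: no positive sample is missed and no negative sample is wrongly accepted. The MCC expression is algebraically the same as the conventional form written with true and false positives and negatives.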
The set of metrics in Eq. 3 was derived (Xu and Ding 2013; Chen et al. 2013) based on the symbols originally introduced by Chou (2001b, c, d) for studying the cleavage sites of signal peptides. Owing to its intuitiveness, Eq. 3 has been widely accepted and adopted by many scientists (Chen et al. 2014; Guo et al. 2014; Chen et al. 2007, 2013, 2014b, c, 2015a, b, 2016a, b, c, 2017, 2018a, b; Chou 2013, 2015, 2017, 2019; Xu and Ding 2013; Xu et al. 2013, 2014a; Qiu et al. 2014, 2016a, b, c, 2017a, b, c; Jia et al. 2015, 2016a, b, c, d, f, 2019; Ju et al. 2016; Liu et al. 2014a, 2015a, b, c, f, g, h, 2016a, b, c, d, 2017a, b; Xu 2016; Feng et al. 2017; Liu and Xu 2017; Xu and Li 2017; Hussain et al. 2019a, b; Li et al. 2019; Min and Xiao 2013; Xiao et al. 2013a, 2016; Ding et al. 2014; Fan et al. 2014; Lin et al. 2014; 2014; Qiu and Xiao 2014; Jia et al. 2016e; Liu and Long 2016; Cheng and Xiao 2017; Cheng et al. 2017a; Liu and Yang 2017; Ehsan et al. 2018; Li et al. 2018a, b; Song et al. 2018c; Wang et al. 2018; Khan et al. 2019a; Zhang et al. 2017, 2019; Fan and Li 2013; Ali and Hayat 2015; Meher et al. 2017; Huo et al. 2017; Khan et al. 2017; Arif et al. 2018; Krishnan 2018; Zhang and Kong 2018; Chen and Lin 2015; Ju et al. 2015; Wang 2013; Xiao and Lin 2013; Xiao et al. 2013b, c, d; Xiao and Wang 2013; Xia et al. 2014; Yu et al. 2014; Cai et al. 2015; Su et al. 2017; Cheng et al. 2019; Feng et al. 2019), and used to examine the prediction quality of most PTM predictors (Xu and Ding 2013; Xu et al. 2013; Qiu et al. 2014; Xu et al. 2014; Xu et al. 2014; Qiu et al. 2015; Jia et al. 2016a, b, c; Chandra et al. 2018; Ghauri et al. 2018; Khan et al. 2018a, b; Qiu et al. 2018a). Meanwhile, it has also been widely used in proteome and genome analyses (see, e.g., Xu et al. 2014; Chen et al. 2014, 2015b, 2016a, b, c, 2017; Jia et al. 2015, 2016b, d; Ding et al. 2014; Lin et al. 2014; Qiu and Xiao 2014; Liu et al. 2015c; Liu and Yang 2017; Li et al. 2018a; Song et al. 2018c) to evaluate the prediction quality of a multi-label predictor for the proteins in each of the subcellular locations concerned (see, e.g., Cheng and Xiao 2017a, b; Cheng et al. 2017a; Xiao et al. 2017, 2018a, b; Cheng and Xiao 2018a, b, c, d, e; Chou et al. 2018; Chou 2019; Cheng et al. 2019). But it is instructive to point out that both the set of conventional metrics copied from math books and the intuitive metrics derived from Chou’s symbols (Chou 2001b, c, d) are valid only for single-label systems (where each sample belongs to one and only one class). For multi-label systems (where a sample may simultaneously belong to several classes), whose existence has become more frequent in system biology (Cheng and Xiao 2017a, b, 2018a, b; Cheng et al. 2017; Xiao et al. 2017, 2018a; Chou 2019), system medicine (Cheng et al. 2017; Cheng et al. 2017), and biomedicine (Qiu et al. 2016b, 2019; Cheng et al. 2019), a completely different set of metrics as defined in Chou (2013) is absolutely needed.
Predictors for Identifying PTM or PTCM Sites in RNA Sequences
Concluding Remarks and Perspectives
The PTM predictors introduced in this review paper have all been established by following the 5-steps rule (Chou 2011), and hence they each have a user-friendly web server through which the majority of experimental scientists can easily get their desired data. Also, their cornerstones are based on PseAAC (Chou 2001, 2005d, e, 2009, 2011) or PseKNC (Chen et al. 2013, 2014a; Lin et al. 2014; Chen and Lin 2015; Liu et al. 2015g, 2016), and hence their prediction quality is usually higher than that of other PTM prediction methods.
As we can see from the “Predictors for Identifying PTM or PTLM Sites in Protein Sequences”, “Predictors for Identifying PTM or PTCM Sites in RNA Sequences”, and “Predictors for Identifying PTM or PTRM Sites in DNA Sequences” sections, most of the available web-servers are for identifying the PTM sites in protein sequences, followed by those for DNA sequences, with the fewest for RNA sequences. It is anticipated, however, that with more experimental data available in the future, the benchmark datasets for the PTM sites in RNA and DNA sequences will be enriched as well. The existing web-servers can then not only be easily extended to cover more RNA and DNA sequences, but also further improved in prediction quality for all kinds of biological sequences.
Finally, it has not escaped our notice that using graphic approaches to study biological and medical systems can provide an intuitive vision and useful insights for analyzing the complicated relations therein, as shown in the systems of enzyme fast reaction (Chou and Forsen 1980; Li and Forsen 1980a, b), graphical rules in molecular biology (Chou and Forsen 1980, 1981; Forsen and Zhou 1980; Carter and Forsen 1981), and low-frequency internal motion in biomacromolecules (such as protein and DNA) (Chen and Forsen 1981). In particular, such insights have also been demonstrated in Chou et al. (1979) and many follow-up publications (Zhou and Deng 1984; Chou 1989, 1990; Althaus et al. 1993a, b, c, 1994a, b, 1996; Chou et al. 1994; Andraos 2008; Chou and Shen 2009; Shen and Song 2009; Chou 2010, 2011; Zhou 2011).
References
Adilina S, Farid DM, Shatabda S (2019) Effective DNA binding protein prediction by using key features via Chou’s general PseAAC. J Theor Biol 460:64–78
Ahmad J, Hayat M (2018) MFSC: multi-voting based feature selection for classification of Golgi proteins by adopting the general form of Chou’s PseAAC components. J Theor Biol 463:99–109
Ahmad J, Hayat M (2019) MFSC: multi-voting based feature selection for classification of Golgi proteins by adopting the general form of Chou’s PseAAC components. J Theor Biol 463:99–109
Ahmad S, Kabir M, Hayat M (2015) Identification of heat shock protein families and J-protein types by incorporating dipeptide composition into Chou’s general PseAAC. Comput Methods Programs Biomed 122:165–174
Ahmad K, Waris M, Hayat M (2016) Prediction of protein submitochondrial locations by incorporating dipeptide composition into Chou’s general pseudo amino acid composition. J Membr Biol 249:293–304
Akbar S, Hayat M (2018) iMethyl-STTNC: identification of N(6)-methyladenosine sites by extending the Idea of SAAC into Chou’s PseAAC to formulate RNA sequences. J Theor Biol 455:205–211
Al Maruf MA, Shatabda S (2018) iRSpot-SF: prediction of recombination hotspots by incorporating sequence based features into Chou’s Pseudo components. Genomics. https://doi.org/10.1016/j.ygeno.2018.06.003
Ali F, Hayat M (2015) Classification of membrane protein types using voting feature interval in combination with Chou’s pseudo amino acid composition. J Theor Biol 384:78–83
Althaus IW, Chou JJ, Gonzales AJ, Diebel MR, Kezdy FJ, Romero DL, Aristoff PA, Tarpley WG, Reusser F (1993a) Steady-state kinetic studies with the non-nucleoside HIV-1 reverse transcriptase inhibitor U-87201E. J Biol Chem 268:6119–6124
Althaus IW, Gonzales AJ, Chou JJ, Diebel MR, Kezdy FJ, Romero DL, Aristoff PA, Tarpley WG, Reusser F (1993b) The quinoline U-78036 is a potent inhibitor of HIV-1 reverse transcriptase. J Biol Chem 268:14875–14880
Althaus IW, Chou JJ, Gonzales AJ, Diebel MR, Kezdy FJ, Romero DL, Aristoff PA, Tarpley WG, Reusser F (1993c) Kinetic studies with the nonnucleoside HIV-1 reverse transcriptase inhibitor U-88204E. Biochemistry 32:6548–6554
Althaus IW, Chou JJ, Gonzales AJ, Diebel MR, Kezdy FJ, Romero DL, Aristoff PA, Tarpley WG, Reusser F (1994a) Steady-state kinetic studies with the polysulfonate U-9843, an HIV reverse transcriptase inhibitor. Cell Mol Life Sci (Experientia) 50:23–28
Althaus IW, Chou JJ, Gonzales AJ, Diebel MR, Kezdy FJ, Romero DL, Thomas RC, Aristoff PA, Tarpley WG, Reusser F (1994b) Kinetic studies with the non-nucleoside human immunodeficiency virus type-1 reverse transcriptase inhibitor U-90152e. Biochem Pharmacol 47:2017–2028
Althaus IW, Franks KM, Diebel MR, Kezdy FJ, Romero DL, Thomas RC, Aristoff PA, Tarpley WG, Reusser F (1996) The benzylthio-pyrididine U-31,355, a potent inhibitor of HIV-1 reverse transcriptase. Biochem Pharmacol 51:743–750
Andraos J (2008) Kinetic plasticity and the determination of product ratios for kinetic schemes leading to multiple products without rate laws: new methods based on directed graphs. Can J Chem 86:342–357
Arif M, Hayat M, Jan Z (2018) iMem-2LSAAC: a two-level model for discrimination of membrane proteins and their types by extending the notion of SAAC into Chou’s pseudo amino acid composition. J Theor Biol 442:11–21
Awais M, Hussain W, Khan YD, Rasool N, Khan SA (2019) iPhosH-PseAAC: Identify phosphohistidine sites in proteins by blending statistical moments and position relative features according to the Chou’s 5-step rule and general pseudo amino acid composition. IEEE/ACM Trans Comput Biol Bioinf. https://doi.org/10.1109/tcbb.2019.2919025
Behbahani M, Mohabatkar H, Nosrati M (2016) Analysis and comparison of lignin peroxidases between fungi and bacteria using three different modes of Chou’s general pseudo amino acid composition. J Theor Biol 411:1–5
Berardi MJ, Shih WM, Harrison SC, Chou JJ (2011) Mitochondrial uncoupling protein 2 structure determined by NMR molecular fragment searching. Nature 476:109–113
Bruschweiler S, Yang Q, Run C, Chou JJ (2015) Substrate-modulated ADP/ATP-transporter dynamics revealed by NMR relaxation dispersion. Nat Struct Mol Biol 22:636–641
Butt AH, Rasool N, Khan YD (2018) Predicting membrane proteins and their types by extracting various sequence features into Chou’s general PseAAC. Mol Biol Rep. https://doi.org/10.1007/s11033-018-4391-5
Butt AH, Rasool N, Khan YD (2019) Prediction of antioxidant proteins by incorporating statistical moments based features into Chou’s PseAAC. J Theor Biol 473:1–8
Cai YD, Feng KY, Lu WC (2006) Using LogitBoost classifier to predict protein structural classes. J Theor Biol 238:172–176
Cai L, Wan CL, He L, Jong S (2015) Gestational influenza increases the risk of psychosis in adults. Med Chem 11:676–682
Call ME, Schnell JR, Xu C, Lutz RA, Chou JJ, Wucherpfennig KW (2006) The structure of the zetazeta transmembrane dimer reveals features essential for its assembly with the T cell receptor. Cell 127:355–368
Call ME, Wucherpfennig KW, Chou JJ (2010) The structural basis for intramembrane assembly of an activating immunoreceptor complex. Nat Immunol 11:1023–1029
Cao JZ, Liu WQ, Gu H (2012) Predicting viral protein subcellular localization with Chou’s Pseudo amino acid composition and imbalance-weighted multi-label K-nearest neighbor algorithm. Protein Pept Lett 19:1163–1169
Cao DS, Xu QS, Liang YZ (2013) Propy: a tool to generate various modes of Chou’s PseAAC. Bioinformatics 29:960–962
Cao C, Wang S, Cui T, Su XC, Chou JJ (2017) Ion and inhibitor binding of the double-ring ion selectivity filter of the mitochondrial calcium uniporter. Proc Natl Acad Sci USA 114:E2846–E2851
Carter RE, Forsen S (1981) A new graphical method for deriving rate equations for complicated mechanisms. Chem Scr 18:82–86
Chandra A, Sharma A, Dehzangi A, Ranganathan S, Jokhan A, Tsunoda T (2018) PhoglyStruct: prediction of phosphoglycerylated lysine residues using structural properties of amino acids. Sci Rep 8:17923
Chang TH, Wu LC, Lee TY, Chen SP, Huang HD, Horng JT (2013) EuLoc: a web-server for accurately predict protein subcellular localization in eukaryotes by incorporating various features of sequence segments into the general form of Chou’s PseAAC. J Comput Aided Mol Des 27:91–103
Chen NY, Forsen S (1981) The biological functions of low-frequency phonons: 2. Cooperative effects. Chem Scr 18:126–132
Chen YK, Li KB (2013) Predicting membrane protein types by incorporating protein topology, domains, signal peptides, and physicochemical properties into the general form of Chou’s pseudo amino acid composition. J Theor Biol 318:1–12
Chen W, Lin H (2015) Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing genomic sequences. Mol BioSyst 11:2620–2634
Chen J, Liu H, Yang J (2007) Prediction of linear B-cell epitopes using amino acid pair antigenicity scale. Amino Acids 33:423–428
Chen C, Chen L, Zou X, Cai P (2009) Prediction of protein secondary structure content by using the concept of Chou’s pseudo amino acid composition and support vector machine. Protein Pept Lett 16:27–31
Chen W, Lin H, Feng PM, Ding C, Zuo YC (2012a) iNuc-PhysChem: a sequence-based predictor for identifying nucleosomes via physicochemical properties. PLoS ONE 7:e47843
Chen C, Shen ZB, Zou XY (2012b) Dual-layer wavelet SVM for predicting protein structural class via the general form of Chou’s pseudo amino acid composition. Protein Pept Lett 19:422–429
Chen W, Feng PM, Lin H (2013) iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition. Nucleic Acids Res 41:e68
Chen W, Lei TY, Jin DC, Lin H (2014a) PseKNC: a flexible web-server for generating pseudo K-tuple nucleotide composition. Anal Biochem 456:53–60
Chen W, Feng PM, Deng EZ, Lin H (2014b) iTIS-PseTNC: a sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition. Anal Biochem 462:76–83
Chen W, Feng PM, Lin H (2014c) iSS-PseDNC: identifying splicing sites using pseudo dinucleotide composition. Biomed Res Int (BMRI) 2014:623149
Chen W, Zhang X, Brooker J, Lin H, Zhang L (2015a) PseKNC-General: a cross-platform package for generating various modes of pseudo nucleotide compositions. Bioinformatics 31:119–120
Chen W, Feng P, Ding H, Lin H (2015b) iRNA-Methyl: identifying N6-methyladenosine sites using pseudo nucleotide composition. Anal Biochem 490:26–33
Chen W, Tang H, Ye J, Lin H (2016a) iRNA-PseU: identifying RNA pseudouridine sites. Mol Ther Nucleic Acids 5:e332
Chen W, Ding H, Feng P, Lin H (2016b) iACP: a sequence-based tool for identifying anticancer peptides. Oncotarget 7:16895–16909
Chen W, Feng P, Ding H, Lin H (2016c) Using deformation energy to analyze nucleosome positioning in genomes. Genomics 107:69–75
Chen W, Feng P, Yang H, Ding H, Lin H (2017) iRNA-AI: identifying the adenosine to inosine editing sites in RNA sequences. Oncotarget 8:4208–4217
Chen W, Ding H, Zhou X, Lin H (2018a) iRNA(m6A)-PseDNC: identifying N6-methyladenosine sites using pseudo dinucleotide composition. Anal Biochem 561–562:59–65
Chen W, Feng P, Yang H, Ding H, Lin H (2018b) iRNA-3typeA: identifying 3-types of modification at RNA’s adenosine sites. Mol Ther Nucleic Acids 11:468–474
Chen Z, Liu X, Li F, Li C, Marquez-Lago T, Leier A, Akutsu T, Webb GI, Xu D, Smith AI, Li L, Song J (2018c) Large-scale comparative assessment of computational predictors for lysine post-translational modification sites. Brief Bioinform. https://doi.org/10.1093/bib/bby089
Chen Z, Zhao PY, Li F, Leier A, Marquez-Lago TT, Wang Y, Webb GI, Smith AI, Daly RJ, Song J (2018d) iFeature: a python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics 34:2499–2502
Chen G, Cao M, Yu J, Guo X, Shi S (2019) Prediction and functional analysis of prokaryote lysine acetylation site by incorporating six types of features into Chou’s general PseAAC. J Theor Biol 461:92–101
Cheng X, Xiao X (2017a) pLoc-mPlant: predict subcellular localization of multi-location plant proteins via incorporating the optimal GO information into general PseAAC. Mol BioSyst 13:1722–1727
Cheng X, Xiao X (2017b) pLoc-mVirus: predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC. Gene (Erratum: ibid., 2018, Vol. 644, 156–156) 628:315–321
Cheng X, Xiao X (2018a) pLoc-mEuk: predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics 110:50–58
Cheng X, Xiao X (2018b) pLoc-mGneg: predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general PseAAC. Genomics 110:231–239
Cheng X, Xiao X (2018c) pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information. Bioinformatics 34:1448–1456
Cheng X, Xiao X (2018d) pLoc_bal-mGneg: predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC. J Theor Biol 458:92–102
Cheng X, Xiao X (2018e) pLoc_bal-mPlant: predict subcellular localization of plant proteins by general PseAAC and balancing training dataset. Curr Pharm Des 24:4013–4022
Cheng X, Zhao SG, Lin WZ, Xiao X (2017a) pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites. Bioinformatics 33:3524–3531
Cheng X, Zhao SG, Xiao X (2017b) iATC-mISF: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals. Bioinformatics (Corrigendum, ibid., 2017, Vol. 33, 2610) 33:341–346
Cheng X, Zhao SG, Xiao X (2017c) iATC-mHyb: a hybrid multi-label classifier for predicting the classification of anatomical therapeutic chemicals. Oncotarget 8:58494–58503
Cheng X, Lin WZ, Xiao X (2019) pLoc_bal-mAnimal: predict subcellular localization of animal proteins by balancing training dataset and PseAAC. Bioinformatics 35:398–406
Chou KC (1989) Graphic rules in steady and non-steady enzyme kinetics. J Biol Chem 264:12074–12079
Chou KC (1990) Review: applications of graph theory to enzyme kinetics and protein folding kinetics. Steady and non-steady state systems. Biophys Chem 35:1–24
Chou KC (2001a) Prediction of protein cellular attributes using pseudo amino acid composition. Proteins (Erratum: ibid., 2001, Vol. 44, 60) 43:246–255
Chou KC (2001b) Prediction of protein signal sequences and their cleavage sites. Proteins 42:136–139
Chou KC (2001c) Using subsite coupling to predict signal peptides. Protein Eng 14:75–79
Chou KC (2001d) Prediction of signal peptides using scaled window. Peptides 22:1973–1979
Chou KC (2004a) Insights from modelling the 3D structure of the extracellular domain of alpha7 nicotinic acetylcholine receptor. Biochem Biophys Res Commun (BBRC) 319:433–438
Chou KC (2004b) Insights from modelling the tertiary structure of BACE2. J Proteome Res 3:1069–1072
Chou KC (2004c) Insights from modelling three-dimensional structures of the human potassium and sodium channels. J Proteome Res 3:856–861
Chou KC (2004d) Review: structural bioinformatics and its impact to biomedical science. Curr Med Chem 11:2105–2134
Chou KC (2005a) Coupling interaction between thromboxane A2 receptor and alpha-13 subunit of guanine nucleotide-binding protein. J Proteome Res 4:1681–1686
Chou KC (2005b) Modeling the tertiary structure of human cathepsin-E. Biochem Biophys Res Commun (BBRC) 331:56–60
Chou KC (2005c) Insights from modeling the 3D structure of DNA-CBF3b complex. J Proteome Res 4:1657–1660
Chou KC (2005d) Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics 21:10–19
Chou KC (2005e) Review: progress in protein structural class prediction and its impact to bioinformatics and proteomics. Curr Protein Pept Sci 6:423–436
Chou KC (2009) Pseudo amino acid composition and its applications in bioinformatics, proteomics and system biology. Curr Proteom 6:262–274
Chou KC (2010) Graphic rule for drug metabolism systems. Curr Drug Metab 11:369–378
Chou KC (2011) Some remarks on protein attribute prediction and pseudo amino acid composition (50th Anniversary Year Review, 5-steps rule). J Theor Biol 273:236–247
Chou KC (2013) Some remarks on predicting multi-label attributes in molecular biosystems. Mol BioSyst 9:1092–1100
Chou KC (2015) Impacts of bioinformatics to medicinal chemistry. Med Chem 11:218–234
Chou KC (2017) An unprecedented revolution in medicinal chemistry driven by the progress of biological science. Curr Top Med Chem 17:2337–2358
Chou KC (2019) Advance in predicting subcellular localization of multi-label proteins and its implication for developing multi-target drugs. Curr Med Chem. https://doi.org/10.2174/0929867326666190507082559
Chou KC, Cai YD (2003) Prediction and classification of protein subcellular location: sequence-order effect and pseudo amino acid composition. J Cell Biochem (Addendum, ibid. 2004, 91, 1085) 90:1250–1260
Chou KC, Elrod DW (2002) Bioinformatical analysis of G-protein-coupled receptors. J Proteome Res 1:429–433
Chou KC, Forsen S (1980a) Diffusion-controlled effects in reversible enzymatic fast reaction system: critical spherical shell and proximity rate constants. Biophys Chem 12:255–263
Chou KC, Forsen S (1980b) Graphical rules for enzyme-catalyzed rate laws. Biochem J 187:829–835
Chou KC, Forsen S (1981) Graphical rules of steady-state reaction systems. Can J Chem 59:737–755
Chou KC, Howe WJ (2002) Prediction of the tertiary structure of the beta-secretase zymogen. Biochem Biophys Res Commun (BBRC) 292:702–708
Chou KC, Shen HB (2009) FoldRate: a web-server for predicting protein folding rates from primary sequence. Open Bioinf J 3:31–50
Chou KC, Zhang CT (1995) Review: prediction of protein structural classes. Crit Rev Biochem Mol Biol 30:275–349
Chou KC, Jiang SP, Liu WM, Fee CH (1979) Graph theory of enzyme kinetics: 1. Steady-state reaction system. Sci Sinica 22:341–358
Chou KC, Kezdy FJ, Reusser F (1994) Review: kinetics of processive nucleic acid polymerases and nucleases. Anal Biochem 221:217–230
Chou KC, Jones D, Heinrikson RL (1997) Prediction of the tertiary structure and substrate binding site of caspase-8. FEBS Lett 419:49–54
Chou JJ, Matsuo H, Duan H, Wagner G (1998) Solution structure of the RAIDD CARD and model for CARD/CARD interaction in caspase-2 and caspase-9 recruitment. Cell 94:171–180
Chou JJ, Li H, Salvesen GS, Yuan J, Wagner G (1999) Solution structure of BID, an intracellular amplifier of apoptotic signalling. Cell 96:615–624
Chou KC, Tomasselli AG, Heinrikson RL (2000) Prediction of the tertiary structure of a caspase-9/inhibitor complex. FEBS Lett 470:249–256
Chou JJ, Li S, Klee CB, Bax A (2001) Solution structure of Ca2+-calmodulin reveals flexible hand-like properties of its domains. Nat Struct Biol 8:990–997
Chou KC, Lin WZ, Xiao X (2011) Wenxiang: a web-server for drawing wenxiang diagrams. Nat Sci 3:862–865
Chou KC, Cheng X, Xiao X (2018) pLoc_bal-mHum: predict subcellular localization of human proteins by PseAAC and quasi-balancing training dataset. Genomics. https://doi.org/10.1016/j.ygeno.2018.08.007
Contreras-Torres E (2018) Predicting structural classes of proteins by incorporating their global and local physicochemical and conformational properties into general Chou’s PseAAC. J Theor Biol 454:139–145
Cui X, Yu Z, Yu B, Wang M, Tian B, Ma Q (2018) UbiSitePred: a novel method for improving the accuracy of ubiquitination sites prediction by using LASSO to select the optimal Chou’s pseudo components. Chemom Intell Lab Syst (CHEMOLAB). https://doi.org/10.1016/j.chemolab.2018.11.012
Dehzangi A, Heffernan R, Sharma A, Lyons J, Paliwal K, Sattar A (2015) Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou’s general PseAAC. J Theor Biol 364:284–294
Dev J, Park D, Fu Q, Chen J, Ha HJ, Ghantous F, Herrmann T, Chang W, Liu Z, Frey G, Seaman MS, Chen B, Chou JJ (2016) Structural basis for membrane anchoring of HIV-1 envelope spike. Science 353:172–175
Ding YS, Zhang TL (2008) Using Chou’s pseudo amino acid composition to predict subcellular localization of apoptosis proteins: an approach with immune genetic algorithm-based ensemble classifier. Pattern Recogn Lett 29:1887–1892
Ding H, Luo L, Lin H (2009) Prediction of cell wall lytic enzymes using Chou’s amphiphilic pseudo amino acid composition. Protein Pept Lett 16:351–355
Ding H, Deng EZ, Yuan LF, Liu L, Lin H, Chen W (2014) iCTX-Type: a sequence-based predictor for identifying the types of conotoxins in targeting ion channels. Biomed Res Int (BMRI) 2014:286419
Du P, Wang X, Xu C, Gao Y (2012) PseAAC-Builder: a cross-platform stand-alone program for generating various special Chou’s pseudo amino acid compositions. Anal Biochem 425:117–119
Du P, Gu S, Jiao Y (2014) PseAAC-General: fast building various modes of general form of Chou’s pseudo amino acid composition for large-scale protein datasets. Int J Mol Sci 15:3495–3506
Ehsan A, Mahmood K, Khan YD, Khan SA (2018) A novel modeling in mathematical biology for classification of signal peptides. Sci Rep 8:1039
Ehsan A, Mahmood MK, Khan YD, Barukab OM, Khan SA (2019) iHyd-PseAAC (EPSV): identify hydroxylation sites in proteins by extracting enhanced position and sequence variant feature via Chou’s 5-step rule and general pseudo amino acid composition. Curr Genomics 20:124–133
Esmaeili M, Mohabatkar H, Mohsenzadeh S (2010) Using the concept of Chou’s pseudo amino acid composition for risk type prediction of human papillomaviruses. J Theor Biol 263:203–209
Fan GL, Li QZ (2012a) Predict mycobacterial proteins subcellular locations by incorporating pseudo-average chemical shift into the general form of Chou’s pseudo amino acid composition. J Theor Biol 304:88–95
Fan GL, Li QZ (2012b) Predicting protein submitochondria locations by combining different descriptors into the general form of Chou’s pseudo amino acid composition. Amino Acids 43:545–555
Fan GL, Li QZ (2013) Discriminating bioluminescent proteins by incorporating average chemical shift and evolutionary information into the general form of Chou’s pseudo amino acid composition. J Theor Biol 334:45–51
Fan G-L, Li Q-Z, Zuo Y-C (2013) Predicting acidic and alkaline enzymes by incorporating the average chemical shift and gene ontology informations into the general form of Chou’s PseAAC. Process Biochem 48:1048–1053
Fan YN, Xiao X, Min JL (2014) iNR-Drug: predicting the interaction of drugs with nuclear receptors in cellular networking. Int J Mol Sci (IJMS) 15:4915–4937
Fan GL, Zhang XY, Liu YL, Nang Y, Wang H (2015) DSPMP: discriminating secretory proteins of malaria parasite by hybridizing different descriptors of Chou’s pseudo amino acid patterns. J Comput Chem 36:2317–2327
Fan GL, Liu YL, Wang H (2016) Identification of thermophilic proteins by incorporating evolutionary and acid dissociation information into Chou’s general pseudo amino acid composition. J Theor Biol 407:138–142
Fang Y, Guo Y, Feng Y, Li M (2008) Predicting DNA-binding proteins: approached from Chou’s pseudo amino acid composition and other specific sequence features. Amino Acids 34:103–109
Feng P, Ding H, Yang H, Chen W, Lin H (2017) iRNA-PseColl: identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into PseKNC. Mol Ther Nucleic Acids 7:155–163
Feng P, Yang H, Ding H, Lin H, Chen W (2019) iDNA6mA-PseKNC: identifying DNA N(6)-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC. Genomics 111:96–102
Forsen S, Zhou GQ (1980) Three schematic rules for deriving apparent rate constants. Chem Scr 16:109–113
Fu Q, Fu TM, Cruz AC, Sengupta P, Thomas SK, Wang S, Siegel RM, Wu H, Chou JJ (2016) Structural basis and functional role of intramembrane trimerization of the Fas/CD95 death receptor. Mol Cell 61:602–613
Fu X, Zhu W, Liao B, Cai L, Peng L, Yang J (2018) Improved DNA-binding protein identification by incorporating evolutionary information into the Chou’s PseAAC. IEEE Access 20 https://doi.org/10.1109/access.2018.2876656
Gagnon E, Xu C, Yang W, Chu HH, Call ME, Chou JJ, Wucherpfennig KW (2010) Response: multilayered control of T cell receptor phosphorylation. Cell 142:669–671
Georgiou DN, Karakasidis TE, Nieto JJ, Torres A (2009) Use of fuzzy clustering technique and matrices to classify amino acids and its impact to Chou’s pseudo amino acid composition. J Theor Biol 257:17–26
Georgiou DN, Karakasidis TE, Megaritis AC (2013) A short survey on genetic sequences, Chou’s pseudo amino acid composition and its combination with fuzzy set theory. Open Bioinf J 7:41–48
Ghauri AW, Khan YD, Rasool N, Khan SA (2018) pNitro-Tyr-PseAAC: predict nitrotyrosine sites in proteins by incorporating five features into Chou’s general PseAAC. Curr Pharm Des 24:4034–4043
Gu Q, Ding YS, Zhang TL (2010) Prediction of G-protein-coupled receptor classes in low homology using Chou’s pseudo amino acid composition with approximate entropy and hydrophobicity patterns. Protein Pept Lett 17:559–567
Guo J, Rao N, Liu G, Yang Y, Wang G (2011) Predicting protein folding rates using the concept of Chou’s pseudo amino acid composition. J Comput Chem 32:1612–1617
Guo SH, Deng EZ, Xu LQ, Ding H, Lin H, Chen W (2014) iNuc-PseKNC: a sequence-based predictor for predicting nucleosome positioning in genomes with pseudo k-tuple nucleotide composition. Bioinformatics 30:1522–1529
Gupta MK, Niyogi R, Misra M (2013) An alignment-free method to find similarity among protein sequences via the general form of Chou’s pseudo amino acid composition. SAR QSAR Environ Res 24:597–609
Hajisharifi Z, Piryaiee M, Mohammad Beigi M, Behbahani M, Mohabatkar H (2014) Predicting anticancer peptides with Chou’s pseudo amino acid composition and investigating their mutagenicity via Ames test. J Theor Biol 341:34–40
Han GS, Yu ZG, Anh V (2014) A two-stage SVM method to predict membrane protein types by incorporating amino acid classifications and physicochemical properties into a general form of Chou’s PseAAC. J Theor Biol 344:31–39
Hayat M, Iqbal N (2014) Discriminating protein structure classes by incorporating Pseudo average chemical shift to Chou’s general PseAAC and support vector machine. Comput Methods Programs Biomed 116:184–192
Hayat M, Khan A (2012) Discriminating outer membrane proteins with Fuzzy K-nearest neighbor algorithms based on the general form of Chou’s PseAAC. Protein Pept Lett 19:411–421
Hu L, Huang T, Shi X, Lu WC, Cai YD (2011) Predicting functions of proteins in mouse based on weighted protein-protein interaction network and protein hybrid properties. PLoS ONE 6:e14556
Huang C, Yuan J (2013a) Using radial basis function on the general form of Chou’s pseudo amino acid composition and PSSM to predict subcellular locations of proteins with both single and multiple sites. Biosystems 113:50–57
Huang C, Yuan JQ (2013b) A multilabel model based on Chou’s pseudo amino acid composition for identifying membrane proteins with both single and multiple functional types. J Membr Biol 246:327–334
Huang C, Yuan JQ (2013c) Predicting protein subchloroplast locations with both single and multiple sites via three different modes of Chou’s pseudo amino acid compositions. J Theor Biol 335:205–212
Huang C, Yuan JQ (2015) Simultaneously identify three different attributes of proteins by fusing their three different modes of Chou’s pseudo amino acid compositions. Protein Pept Lett 22:547–556
Huo H, Li T, Wang S, Lv Y, Zuo Y, Yang L (2017) Prediction of presynaptic and postsynaptic neurotoxins by combining various Chou’s pseudo components. Sci Rep 7:5827
Hussain W, Khan YD, Rasool N, Khan SA (2019a) SPalmitoylC-PseAAC: a sequence-based model developed via Chou’s 5-steps rule and general PseAAC for identifying S-palmitoylation sites in proteins. Anal Biochem 568:14–23
Hussain W, Khan YD, Rasool N, Khan SA (2019b) SPrenylC-PseAAC: a sequence-based model developed via Chou’s 5-steps rule and general PseAAC for identifying S-prenylation sites in proteins. J Theor Biol 468:1–11
Javed F, Hayat M (2018) Predicting subcellular localizations of multi-label proteins by incorporating the sequence features into Chou’s PseAAC. Genomics. https://doi.org/10.1016/j.ygeno.2018.09.004
Jia C, Lin X, Wang Z (2014) Prediction of protein S-nitrosylation sites based on adapted normal distribution Bi-profile bayes and Chou’s pseudo amino acid composition. Int J Mol Sci 15:10410–10423
Jia J, Liu Z, Xiao X (2015) iPPI-Esml: an ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into PseAAC. J Theor Biol 377:47–56
Jia J, Liu Z, Xiao X, Liu B (2016a) iSuc-PseOpt: identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset. Anal Biochem 497:48–56
Jia J, Liu Z, Xiao X, Liu B (2016b) pSuc-Lys: predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach. J Theor Biol 394:223–230
Jia J, Liu Z, Xiao X, Liu B (2016c) iCar-PseCp: identify carbonylation sites in proteins by Monto Carlo sampling and incorporating sequence coupled effects into general PseAAC. Oncotarget 7:34558–34570
Jia J, Zhang L, Liu Z, Xiao X (2016d) pSumo-CD: predicting sumoylation sites in proteins with covariance discriminant algorithm by incorporating sequence-coupled effects into general PseAAC. Bioinformatics 32:3133–3141
Jia J, Liu Z, Xiao X, Liu B (2016e) Identification of protein-protein binding sites by incorporating the physicochemical properties and stationary wavelet transforms into pseudo amino acid composition (iPPBS-PseAAC). J Biomol Struct Dyn (JBSD) 34:1946–1961
Jia J, Liu Z, Xiao X, Liu B (2016f) iPPBS-Opt: a sequence-based ensemble classifier for identifying protein-protein binding sites by optimizing imbalanced training datasets. Molecules 21:E95
Jia J, Li X, Qiu W, Xiao X (2019) iPPI-PseAAC(CGR): identify protein-protein interactions by incorporating chaos game representation into PseAAC. J Theor Biol 460:195–203
Jiang X, Wei R, Zhang TL, Gu Q (2008a) Using the concept of Chou’s pseudo amino acid composition to predict apoptosis proteins subcellular location: an approach by approximate entropy. Protein Pept Lett 15:392–396
Jiang X, Wei R, Zhao Y, Zhang T (2008b) Using Chou’s pseudo amino acid composition based on approximate entropy and an ensemble of AdaBoost classifiers to predict protein subnuclear location. Amino Acids 34:669–675
Jiao YS, Du PF (2016) Prediction of Golgi-resident protein types using general form of Chou’s pseudo amino acid compositions: approaches with minimal redundancy maximal relevance feature selection. J Theor Biol 402:38–44
Jiao YS, Du PF (2017) Predicting protein submitochondrial locations by incorporating the positional-specific physicochemical properties into Chou’s general pseudo-amino acid compositions. J Theor Biol 416:81–87
Ju Z, He JJ (2017a) Prediction of lysine crotonylation sites by incorporating the composition of k-spaced amino acid pairs into Chou’s general PseAAC. J Mol Graph Model 77:200–204
Ju Z, He JJ (2017b) Prediction of lysine propionylation sites using biased SVM and incorporating four different sequence features into Chou’s PseAAC. J Mol Graph Model 76:356–363
Ju Z, Wang SY (2018) Prediction of citrullination sites by incorporating k-spaced amino acid pairs into Chou’s general pseudo amino acid composition. Gene 664:78–83
Ju Z, Cao JZ, Gu H (2015) iLM-2L: a two-level predictor for identifying protein lysine methylation sites and their methylation degrees by incorporating K-gap amino acid pairs into Chou’s general PseAAC. J Theor Biol 385:50–57
Ju Z, Cao JZ, Gu H (2016) Predicting lysine phosphoglycerylation with fuzzy SVM by incorporating k-spaced amino acid pairs into Chou’s general PseAAC. J Theor Biol 397:145–150
Kabir M, Hayat M (2016) iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou’s PseAAC to formulate DNA samples. Mol Genet Genomics 291:285–296
Kabir M, Ahmad S, Iqbal M, Hayat M (2019) iNR-2L: a two-level sequence-based predictor developed via Chou’s 5-steps rule and general PseAAC for identifying nuclear receptors and their families. Genomics. https://doi.org/10.1016/j.ygeno.2019.02.006
Khan ZU, Hayat M, Khan MA (2015) Discrimination of acidic and alkaline enzyme using Chou’s pseudo amino acid composition in conjunction with probabilistic neural network model. J Theor Biol 365:197–203
Khan M, Hayat M, Khan SA, Iqbal N (2017) Unb-DPC: identify mycobacterial membrane protein types by incorporating un-biased dipeptide composition into Chou’s general PseAAC. J Theor Biol 415:13–19
Khan YD, Rasool N, Hussain W, Khan SA (2018a) iPhosT-PseAAC: identify phosphothreonine sites by incorporating sequence statistical moments into PseAAC. Anal Biochem 550:109–116
Khan YD, Rasool N, Hussain W, Khan SA (2018b) iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC. Mol Biol Rep. https://doi.org/10.1007/s11033-018-4417-z
Khan YD, Jamil M, Hussain W, Rasool N, Khan SA (2019a) pSSbond-PseAAC: prediction of disulfide bonding sites by integration of PseAAC and statistical moments. J Theor Biol 463:47–55
Khan YD, Batool A, Rasool N, Khan A (2019b) Prediction of nitrosocysteine sites using position and composition variant features. Lett Org Chem 16:283–293
Khosravian M, Faramarzi FK, Beigi MM, Behbahani M, Mohabatkar H (2013) Predicting antibacterial peptides by the concept of Chou’s pseudo amino acid composition and machine learning methods. Protein Pept Lett 20:180–186
Kong L, Zhang L, Lv J (2014) Accurate prediction of protein structural classes by incorporating predicted secondary structure information into the general form of Chou’s pseudo amino acid composition. J Theor Biol 344:12–18
Krishnan MS (2018) Using Chou’s general PseAAC to analyze the evolutionary relationship of receptor associated proteins (RAP) with various folding patterns of protein domains. J Theor Biol 445:62–74
Kumar R, Srivastava A, Kumari B, Kumar M (2015) Prediction of beta-lactamase and its class by Chou’s pseudo amino acid composition and support vector machine. J Theor Biol 365:96–103
Le NQK, Yapp EKY, Ho QT, Nagasundaram N, Ou YY, Yeh HY (2019) iEnhancer-5Step: identifying enhancers using hidden information of DNA sequences via Chou’s 5-step rule and word embedding. Anal Biochem 571:53–61
Li TT, Forsen S (1980a) The critical spherical shell in enzymatic fast reaction systems. Biophys Chem 12:265–269
Li TT, Forsen S (1980b) The flow of substrate molecules in fast enzyme-catalyzed reaction systems. Chem Scr 16:192–196
Li FM, Li QZ (2008) Predicting protein subcellular location using Chou’s pseudo amino acid composition and improved hybrid approach. Protein Pept Lett 15:612–616
Li ZC, Zhou XB, Dai Z, Zou XY (2009) Prediction of protein structural classes by Chou’s pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis. Amino Acids 37:415–425
Li XB, Wang SQ, Xu WR, Wang RL (2011) Novel inhibitor design for hemagglutinin against H1N1 influenza virus by core hopping method. PLoS ONE 6:e28111
Li LQ, Zhang Y, Zou LY, Zhou Y, Zheng XQ (2012) Prediction of protein subcellular multi-localization based on the general form of Chou’s pseudo amino acid composition. Protein Pept Lett 19:375–387
Li L, Yu S, Xiao W, Li Y, Li M, Huang L, Zheng X, Zhou S, Yang H (2014) Prediction of bacterial protein subcellular localization by incorporating various features into Chou’s PseAAC and a backward feature selection approach. Biochimie 104:100–107
Li F, Li C, Marquez-Lago TT, Leier A, Akutsu T, Purcell AW, Smith AI, Lithgow T, Daly RJ, Song J (2018a) Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome. Bioinformatics 34:4223–4231
Li F, Wang Y, Li C, Marquez-Lago TT, Leier A, Rawlings ND, Haffari G, Revote J, Akutsu T, Purcell AW, Pike RN, Webb GI, Smith AI, Lithgow T, Daly RJ, Whisstock JC, Song J (2018b) Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods. Brief Bioinform. https://doi.org/10.1093/bib/bby077
Li JX, Wang SQ, Du QS, Wei H, Li XM, Meng JZ, Wang QY, Xie NZ, Huang RB (2018c) Simulated protein thermal detection (SPTD) for enzyme thermostability study and an application example for pullulanase from Bacillus deramificans. Curr Pharm Des 24:4023–4033
Li F, Zhang Y, Purcell AW, Webb GI, Lithgow T, Li C, Song J (2019) Positive-unlabelled learning of glycosylation sites in the human proteome. BMC Bioinformatics 20:112
Liang Y, Zhang S (2017) Predict protein structural class by incorporating two different modes of evolutionary information into Chou’s general pseudo amino acid composition. J Mol Graph Model 78:110–117
Liang Y, Zhang S (2018) Identify Gram-negative bacterial secreted protein types by incorporating different modes of PSSM into Chou’s general PseAAC via Kullback-Leibler divergence. J Theor Biol 454:22–29
Liao B, Xiang Q, Li D (2012) Incorporating secondary features into the general form of Chou’s PseAAC for predicting protein structural class. Protein Pept Lett 19:1133–1138
Lin H (2008) The modified Mahalanobis discriminant for predicting outer membrane proteins by using Chou’s pseudo amino acid composition. J Theor Biol 252:350–356
Lin J, Wang Y (2011) Using a novel AdaBoost algorithm and Chou’s pseudo amino acid composition for predicting protein subcellular localization. Protein Pept Lett 18:1219–1225
Lin H, Ding H, Guo FB, Zhang AY, Huang J (2008) Predicting subcellular localization of mycobacterial proteins by using Chou’s pseudo amino acid composition. Protein Pept Lett 15:739–744
Lin H, Wang H, Ding H, Chen YL, Li QZ (2009) Prediction of subcellular localization of apoptosis protein using Chou’s pseudo amino acid composition. Acta Biotheor 57:321–330
Lin H, Ding C, Yuan L-F, Chen W, Ding H, Li Z-Q, Guo F-B, Huang J, Rao N-N (2013) Predicting subchloroplast locations of proteins based on the general form of Chou’s pseudo amino acid composition: approached from optimal tripeptide composition. Int J Biomath 6:1350003
Lin H, Deng EZ, Ding H, Chen W (2014) iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition. Nucleic Acids Res 42:12961–12972
Liu B, Long R (2016) iDHS-EL: identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework. Bioinformatics 32:2411–2418
Liu B, Wu H (2017) Pse-in-One 2.0: an improved package of web servers for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nat Sci 9:67–91
Liu LM, Xu Y (2017) iPGK-PseAAC: identify lysine phosphoglycerylation sites in proteins by incorporating four different tiers of amino acid pairwise coupling information into the general PseAAC. Med Chem 13:552–559
Liu B, Yang F (2017) 2L-piRNA: a two-layer ensemble classifier for identifying piwi-interacting RNAs and their function. Mol Ther Nucleic Acids 7:267–277
Liu L, Hu XZ, Liu XX, Wang Y, Li SB (2012) Predicting protein fold types by the general form of Chou’s Pseudo amino acid composition: approached from optimal feature extractions. Protein Pept Lett 19:439–449
Liu B, Wang X, Zou Q, Dong Q, Chen Q (2013) Protein remote homology detection by combining Chou’s pseudo amino acid composition and profile-based protein representation. Mol Inf 32:775–782
Liu B, Zhang D, Xu R, Xu J, Wang X, Chen Q, Dong Q (2014a) Combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection. Bioinformatics 30:472–479
Liu B, Xu J, Lan X, Xu R, Zhou J, Wang X (2014b) iDNA-Prot|dis: identifying DNA-binding proteins by incorporating amino acid distance-pairs and reduced alphabet profile into the general pseudo amino acid composition. PLoS ONE 9:e106691
Liu B, Fang L, Liu F, Wang X, Chen J (2015a) Identification of real microRNA precursors with a pseudo structure status composition approach. PLoS ONE 10:e0121501
Liu B, Fang L, Wang S, Wang X, Li H (2015b) Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy. J Theor Biol 385:153–159
Liu Z, Xiao X, Qiu WR (2015c) iDNA-Methyl: identifying DNA methylation sites via pseudo trinucleotide composition. Anal Biochem 474:69–77
Liu B, Chen J, Wang X (2015d) Protein remote homology detection by combining Chou’s distance-pair pseudo amino acid composition and principal component analysis. Mol Genet Genomics 290:1919–1931
Liu B, Xu J, Fan S, Xu R, Zhou J, Wang X (2015e) PseDNA-Pro: DNA-binding protein identification by combining Chou’s PseAAC and physicochemical distance transformation. Mol Inf 34:8–17
Liu B, Liu F, Wang X, Chen J, Fang L (2015f) Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nucleic Acids Res 43:W65–W71
Liu B, Liu F, Fang L, Wang X (2015g) repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects. Bioinformatics 31:1307–1309
Liu Z, Xiao X, Qiu WR (2015h) Benchmark data for identifying DNA methylation sites via pseudo trinucleotide composition. Data Brief 4:87–89
Liu Z, Xiao X, Yu DJ, Jia J, Qiu WR (2016a) pRNAm-PC: predicting N-methyladenosine sites in RNA sequences via physical-chemical properties. Anal Biochem 497:60–67
Liu B, Fang L, Liu F, Wang X (2016b) iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. J Biomol Struct Dyn (JBSD) 34:223–235
Liu B, Fang L, Long R, Lan X (2016c) iEnhancer-2L: a two-layer predictor for identifying enhancers and their strength by pseudo k-tuple nucleotide composition. Bioinformatics 32:362–369
Liu B, Liu F, Fang L, Wang X (2016d) repRNA: a web server for generating various feature vectors of RNA sequences. Mol Genet Genomics 291:473–481
Liu B, Wang S, Long R (2017a) iRSpot-EL: identify recombination spots with an ensemble learning approach. Bioinformatics 33:35–41
Liu B, Wu H, Zhang D, Wang X (2017b) Pse-analysis: a python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods. Oncotarget 8:13338–13343
Liu B, Li K, Huang DS (2018a) iEnhancer-EL: identifying enhancers and their strength with ensemble learning approach. Bioinformatics 34:3835–3842
Liu B, Weng F, Huang DS (2018b) iRO-3wPseKNC: identify DNA replication origins by three-window-based PseKNC. Bioinformatics 34:3086–3093
Liu B, Yang F, Huang DS (2018c) iPromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based PseKNC. Bioinformatics 34:33–40
Lu Y, Wang S, Wang J, Zhou G, Zhang Q, Zhou X, Niu B, Chen Q (2019a) An epidemic avian influenza prediction model based on Google Trends. Lett Org Chem 16:303–310
Lu F, Zhu M, Lin Y, Zhong H, Cai L, He L (2019b) The preliminary efficacy evaluation of the CTLA-4-Ig treatment against lupus nephritis through in silico analyses. J Theor Biol 471:74–81
Ma Y, Wang SQ, Xu WR, Wang RL (2012) Design novel dual agonists for treating type-2 diabetes by targeting peroxisome proliferator-activated receptors with core hopping approach. PLoS ONE 7:e38546
Mandal M, Mukhopadhyay A, Maulik U (2015) Prediction of protein subcellular localization by incorporating multiobjective PSO-based feature subset selection into the general form of Chou’s PseAAC. Med Biol Eng Comput 53:331–344
Meher PK, Sahu TK, Saini V, Rao AR (2017) Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general PseAAC. Sci Rep 7:42362
Mei S (2012a) Multi-kernel transfer learning based on Chou’s PseAAC formulation for protein submitochondria localization. J Theor Biol 293:121–130
Mei S (2012b) Predicting plant protein subcellular multi-localization by Chou’s PseAAC formulation based multi-label homolog knowledge transfer learning. J Theor Biol 310:80–87
Mei J, Zhao J (2018a) Prediction of HIV-1 and HIV-2 proteins by using Chou’s pseudo amino acid compositions and different classifiers. Sci Rep 8:2359
Mei J, Zhao J (2018b) Analysis and prediction of presynaptic and postsynaptic neurotoxins by Chou’s general pseudo amino acid composition and motif features. J Theor Biol 427:147–153
Mei J, Fu Y, Zhao J (2018) Analysis and prediction of ion channel inhibitors by using feature selection and Chou’s general pseudo amino acid composition. J Theor Biol 456:41–48
Min JL, Xiao X (2013) iEzy-Drug: a web server for identifying the interaction between enzymes and drugs in cellular networking. BioMed Res Int (BMRI) 2013:701317
Mohabatkar H (2010) Prediction of cyclin proteins using Chou’s pseudo amino acid composition. Protein Pept Lett 17:1207–1214
Mohabatkar H, Mohammad Beigi M, Esmaeili A (2011) Prediction of GABA(A) receptor proteins using the concept of Chou’s pseudo amino acid composition and support vector machine. J Theor Biol 281:18–23
Mohabatkar H, Beigi MM, Abdolahi K, Mohsenzadeh S (2013) Prediction of allergenic proteins by means of the concept of Chou’s pseudo amino acid composition and a machine learning approach. Med Chem 9:133–137
Mohammad BM, Behjati M, Mohabatkar H (2011) Prediction of metalloproteinase family based on the concept of Chou’s pseudo amino acid composition using a machine learning approach. J Struct Funct Genomics 12:191–197
Mondal S, Pai PP (2014) Chou’s pseudo amino acid composition improves sequence-based antifreeze protein prediction. J Theor Biol 356:30–35
Mousavizadegan M, Mohabatkar H (2018) Computational prediction of antifungal peptides via Chou’s PseAAC and SVM. J Bioinform Comput Biol 16:1850016
Nanni L, Lumini A (2008) Genetic programming for creating Chou’s pseudo amino acid based features for submitochondria localization. Amino Acids 34:653–660
Nanni L, Brahnam S, Lumini A (2012a) Wavelet images and Chou’s pseudo amino acid composition for protein classification. Amino Acids 43:657–665
Nanni L, Lumini A, Gupta D, Garg A (2012b) Identifying bacterial virulent proteins by fusing a set of classifiers based on variants of Chou’s pseudo amino acid composition and on evolutionary information. IEEE-ACM Trans Comput Biol Bioinf 9:467–475
Nanni L, Brahnam S, Lumini A (2014) Prediction of protein structure classes by incorporating different protein descriptors into general Chou’s pseudo amino acid composition. J Theor Biol 360:109–116
Ning Q, Ma Z, Zhao X (2019) dForml(KNN)-PseAAC: detecting formylation sites from protein sequences using K-nearest neighbor algorithm via Chou’s 5-step rule and pseudo components. J Theor Biol 470:43–49
Niu XH, Hu XH, Shi F, Xia JB (2012) Predicting protein solubility by the general form of Chou’s pseudo amino acid composition: approached from Chaos Game representation and fractal dimension. Protein Pept Lett 19:940–948
Niu B, Liang C, Lu Y, Zhao M, Chen Q, Zhang Y, Zheng L (2019) Glioma stages prediction based on machine learning algorithm combined with protein-protein interaction networks. Genomics. https://doi.org/10.1016/j.ygeno.2019.05.024
OuYang B, Xie S, Berardi MJ, Zhao XM, Dev J, Yu W, Sun B, Chou JJ (2013) Unusual architecture of the p7 channel from hepatitis C virus. Nature 498:521–525
Oxenoid K, Chou JJ (2005) The structure of phospholamban pentamer reveals a channel-like architecture in membranes. Proc Natl Acad Sci USA 102:10870–10875
Oxenoid K, Dong YS, Cao C, Cui T, Sancak Y, Markhard AL, Grabarek Z, Kong L, Liu Z, Ouyang B, Cong Y, Mootha VK, Chou JJ (2016) Architecture of the mitochondrial calcium uniporter. Nature 533:269–273
Pacharawongsakda E, Theeramunkong T (2013) Predict subcellular locations of singleplex and multiplex proteins by semi-supervised learning and dimension-reducing general mode of Chou’s PseAAC. IEEE Trans Nanobiosci 12:311–320
Pan L, Fu TM, Zhao W, Zhao L, Chen W, Qiu C, Liu W, Liu Z, Piai A, Fu Q, Chen S, Wu H, Chou JJ (2019a) Higher-order clustering of the transmembrane anchor of DR5 drives signaling. Cell 176:1477–1489
Pan Y, Wang S, Zhang Q, Lu Q, Su D, Zuo Y, Yang L (2019b) Analysis and prediction of animal toxins by various Chou’s pseudo components and reduced amino acid compositions. J Theor Biol 462:221–229
Piai A, Dev J, Fu Q, Chou JJ (2017) Stability and water accessibility of the trimeric membrane anchors of the HIV-1 envelope spikes. J Am Chem Soc 139:18432–18435
Qin YF, Wang CH, Yu XQ, Zhu J, Liu TG, Zheng XQ (2012) Predicting protein structural class by incorporating patterns of over-represented k-mers into the general form of Chou’s PseAAC. Protein Pept Lett 19:388–397
Qin YF, Zheng L, Huang J (2013) Locating apoptosis proteins by incorporating the signal peptide cleavage sites into the general form of Chou’s pseudo amino acid composition. Int J Quantum Chem 113:1660–1667
Qiu WR, Xiao X (2014) iRSpot-TNCPseAAC: identify recombination spots with trinucleotide composition and pseudo amino acid components. Int J Mol Sci (IJMS) 15:1746–1766
Qiu JD, Huang JH, Liang RP, Lu XQ (2009) Prediction of G-protein-coupled receptor classes based on the concept of Chou’s pseudo amino acid composition: an approach from discrete wavelet transform. Anal Biochem 390:68–73
Qiu JD, Huang JH, Shi SP, Liang RP (2010) Using the concept of Chou’s pseudo amino acid composition to predict enzyme family classes: an approach with support vector machine based on discrete wavelet transform. Protein Pept Lett 17:715–722
Qiu JD, Suo SB, Sun XY, Shi SP, Liang RP (2011) OligoPred: a web-server for predicting homo-oligomeric proteins by incorporating discrete wavelet transform into Chou’s pseudo amino acid composition. J Mol Graph Model 30:129–134
Qiu WR, Xiao X, Lin WZ (2014) iMethyl-PseAAC: identification of protein methylation sites via a pseudo amino acid composition approach. BioMed Res Int (BMRI) 2014:947416
Qiu WR, Xiao X, Lin WZ (2015) iUbiq-Lys: Prediction of lysine ubiquitination sites in proteins by extracting sequence evolution information via a grey system model. J Biomol Struct Dyn (JBSD) 33:1731–1742
Qiu WR, Sun BQ, Xiao X, Xu ZC (2016a) iHyd-PseCp: identify hydroxyproline and hydroxylysine in proteins by incorporating sequence-coupled effects into general PseAAC. Oncotarget 7:44310–44321
Qiu WR, Sun BQ, Xiao X, Xu ZC (2016b) iPTM-mLys: identifying multiple lysine PTM sites and their different types. Bioinformatics 32:3116–3123
Qiu WR, Xiao X, Xu ZC (2016c) iPhos-PseEn: identifying phosphorylation sites in proteins by fusing different pseudo components into an ensemble classifier. Oncotarget 7:51270–51283
Qiu WR, Jiang SY, Sun BQ, Xiao X, Cheng X (2017a) iRNA-2methyl: identify RNA 2′-O-methylation sites by incorporating sequence-coupled effects into general PseKNC and ensemble classifier. Med Chem 13:734–743
Qiu WR, Jiang SY, Xu ZC, Xiao X (2017b) iRNAm5C-PseDNC: identifying RNA 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition. Oncotarget 8:41178–41188
Qiu WR, Sun BQ, Xiao X, Xu D (2017c) iPhos-PseEvo: identifying human phosphorylated proteins by incorporating evolutionary information into general PseAAC via grey system theory. Mol Inf 36:1600010
Qiu WR, Zheng QS, Sun BQ, Xiao X (2017d) Multi-iPPseEvo: a multi-label classifier for identifying human phosphorylated proteins by incorporating evolutionary information into Chou’s general PseAAC via grey system theory. Mol Inf 36:1600085
Qiu WR, Sun BQ, Xiao X, Xu ZC, Jia JH (2018a) iKcr-PseEns: identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier. Genomics 110:239–246
Qiu W, Li S, Cui X, Yu Z, Wang M, Du J, Peng Y, Yu B (2018b) Predicting protein submitochondrial locations by incorporating the pseudo-position specific scoring matrix into the general Chou’s pseudo-amino acid composition. J Theor Biol 450:86–103
Rahimi M, Bakhtiarizadeh MR, Mohammadi-Sangcheshmeh A (2017) OOgenesis_Pred: a sequence-based method for predicting oogenesis proteins by six different modes of Chou’s pseudo amino acid composition. J Theor Biol 414:128–136
Rahman SM, Shatabda S, Saha S, Kaykobad M, Sohel Rahman M (2018) DPP-PseAAC: a DNA-binding protein prediction model using Chou’s general PseAAC. J Theor Biol 452:22–34
Ren LY, Zhang YS, Gutman I (2012) Predicting the classification of transcription factors by incorporating their binding site properties into a novel mode of Chou’s pseudo amino acid composition. Protein Pept Lett 19:1170–1176
Sabooh MF, Iqbal N, Khan M, Khan M, Maqbool HF (2018) Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou’s PseKNC. J Theor Biol 452:1–9
Sahu SS, Panda G (2010) A novel feature representation method based on Chou’s pseudo amino acid composition for protein structural class prediction. Comput Biol Chem 34:320–327
Sanchez V, Peinado AM, Perez-Cordoba JL, Gomez AM (2015) A new signal characterization and signal-based Chou’s PseAAC representation of protein sequences. J Bioinform Comput Biol 13:1550024
Sankari ES, Manimegalai DD (2018) Predicting membrane protein types by incorporating a novel feature set into Chou’s general PseAAC. J Theor Biol 455:319–328
Sarangi AN, Lohani M, Aggarwal R (2013) Prediction of essential proteins in prokaryotes by incorporating various physico-chemical features into the general form of Chou’s pseudo amino acid composition. Protein Pept Lett 20:781–795
Schnell JR, Chou JJ (2008) Structure and mechanism of the M2 proton channel of influenza A virus. Nature 451:591–595
Sharma R, Dehzangi A, Lyons J, Paliwal K, Tsunoda T, Sharma A (2015) Predict gram-positive and gram-negative subcellular localization via incorporating evolutionary information and physicochemical features into Chou’s general PseAAC. IEEE Trans Nanobiosci 14:915–926
Shen HB (2008) PseAAC: a flexible web-server for generating various kinds of protein pseudo amino acid composition. Anal Biochem 373:386–388
Shen HB, Song JN (2009) Prediction of protein folding rates from primary sequence by fusing multiple sequential features. J Biomed Sci Eng (JBiSE) 2:136–143
Shen Y, Tang J, Guo F (2019) Identification of protein subcellular localization via integrating evolutionary and physicochemical information into Chou’s general PseAAC. J Theor Biol 462:230–239
Song J, Li F, Leier A, Marquez-Lago TT, Akutsu T, Haffari G, Webb GI, Pike RN (2018a) PROSPERous: high-throughput prediction of substrate cleavage sites for 90 proteases with improved accuracy. Bioinformatics 34:684–687
Song J, Li F, Takemoto K, Haffari G, Akutsu T, Webb GI (2018b) PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural and network features in a machine learning framework. J Theor Biol 443:125–137
Song J, Wang Y, Li F, Akutsu T, Rawlings ND, Webb GI (2018c) iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief Bioinform. https://doi.org/10.1093/bib/bby028
Srivastava A, Kumar R, Kumar M (2018) BlaPred: predicting and classifying beta-lactamase using a 3-tier prediction system via Chou’s general PseAAC. J Theor Biol 457:29–36
Su Q, Lu W, Du D, Chen F, Niu B (2017) Prediction of the aquatic toxicity of aromatic compounds to Tetrahymena pyriformis through support vector regression. Oncotarget 8:49359–49369
Su ZD, Huang Y, Zhang ZY, Zhao YW, Wang D, Chen W, Lin H (2018) iLoc-lncRNA: predict the subcellular location of lncRNAs by incorporating octamer composition into general PseKNC. Bioinformatics 34:4196–4204
Sun XY, Shi SP, Qiu JD, Suo SB, Huang SY, Liang RP (2012) Identifying protein quaternary structural attributes by incorporating physicochemical properties into the general form of Chou’s PseAAC via discrete wavelet transform. Mol BioSyst 8:3178–3184
Tahir M, Hayat M (2016) iNuc-STNC: a sequence-based predictor for identification of nucleosome positioning in genomes by extending the concept of SAAC and Chou’s PseAAC. Mol BioSyst 12:2587–2593
Tahir M, Hayat M, Kabir M (2017) Sequence based predictor for discrimination of enhancer and their types by applying general form of Chou’s trinucleotide composition. Comput Methods Programs Biomed 146:69–75
Tahir M, Hayat M, Khan SA (2019a) iNuc-ext-PseTNC: an efficient ensemble model for identification of nucleosome positioning by extending the concept of Chou’s PseAAC to pseudo-tri-nucleotide composition. Mol Genet Genomics 294:199–210
Tahir M, Tayara H, Chong KT (2019b) iRNA-PseKNC(2methyl): identify RNA 2′-O-methylation sites by convolution neural network and Chou’s pseudo components. J Theor Biol 465:1–6
Tang H, Chen W, Lin H (2016) Identification of immunoglobulins using Chou’s pseudo amino acid composition with feature selection technique. Mol BioSyst 12:1269–1275
Tian B, Wu X, Chen C, Qiu W, Ma Q, Yu B (2019) Predicting protein-protein interactions by fusing various Chou’s pseudo components and using wavelet denoising approach. J Theor Biol 462:329–346