Fast SCOP Classification of Structural Class and Fold Using Secondary Structure Mining in Distance Matrix
Understanding the structure-function relationship is an urgent need in the proteomic era. One important technique for meeting this need is to analyze and represent the spatial structure of the domain, which is the functional unit of a protein, and to classify domains quickly. In this paper, we introduce a novel method for rapid domain classification. Instead of directly analyzing the protein sequence or the 3-D tertiary structure, the method first maps the tertiary structure of a protein domain into a 2-D Cα-Cα distance matrix. Then, two distance functions, one for the alpha helix and one for the beta strand, are modeled from their respective geometrical properties. These distance functions are then applied to mine secondary structure elements in the distance matrix, in a manner similar to image processing. Furthermore, a composition feature and an arrangement feature of the secondary structure elements are proposed to characterize domain structure for classifying structural class and fold in the Structural Classification of Proteins (SCOP) database. Finally, comparison with other methods shows that the presented method classifies domains automatically, effectively, and efficiently, with the benefit of low-dimensional, meaningful features and without the need for a complicated classifier system.
Keywords: SCOP classification, protein structure, distance matrix, secondary structure mining, image processing, support vector machines
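
As an illustration of the pipeline summarized in the abstract, the following minimal sketch builds a Cα-Cα distance matrix and scans the bands next to its diagonal for helix-like or strand-like distance patterns. It is not the paper's implementation: the abstract does not give the two distance functions, so the template distances, the tolerance, and all function names (`distance_matrix`, `mine_segments`, `ideal_helix`) are assumptions chosen for illustration, using approximate idealized Cα geometry.

```python
import math

def distance_matrix(coords):
    """Return the symmetric C-alpha to C-alpha distance matrix (list of lists)."""
    n = len(coords)
    dm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dm[i][j] = dm[j][i] = math.dist(coords[i], coords[j])
    return dm

# ASSUMED templates: approximate idealized distances (angstroms) between
# residues i and i+k. These stand in for the paper's distance functions.
HELIX_TEMPLATE = {2: 5.4, 3: 5.0, 4: 6.2}
STRAND_TEMPLATE = {2: 6.6, 3: 10.0, 4: 13.2}

def matches(dm, i, template, tol):
    """True if the near-diagonal entries starting at residue i fit the template."""
    return all(abs(dm[i][i + k] - target) <= tol for k, target in template.items())

def mine_segments(dm, template, tol=0.5):
    """Scan along the diagonal; return inclusive (start, end) residue ranges."""
    span = max(template)
    segments, current = [], None
    for i in range(len(dm) - span):
        if matches(dm, i, template, tol):
            if current is None:
                current = [i, i + span]   # open a new segment
            else:
                current[1] = i + span     # extend the open segment
        elif current is not None:
            segments.append(tuple(current))
            current = None
    if current is not None:
        segments.append(tuple(current))
    return segments

def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Synthetic C-alpha trace of an idealized alpha helix (test data only)."""
    t = math.radians(turn_deg)
    return [(radius * math.cos(i * t), radius * math.sin(i * t), i * rise)
            for i in range(n)]

dm = distance_matrix(ideal_helix(12))
print(mine_segments(dm, HELIX_TEMPLATE))   # [(0, 11)]
print(mine_segments(dm, STRAND_TEMPLATE))  # []
```

In this sketch, the composition feature mentioned in the abstract could be derived from the number and length of the mined helix and strand segments, and the arrangement feature from their order along the chain; how the paper actually defines those features is not stated here.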