Abstract
Efficient and precise motif extraction is a central problem in the study of protein functions and structures. This paper presents an efficient new geometric approach to the problem, based on the Generalized Hough Transform. The approach is both an extension and a variation of the Secondary Structure Co-Occurrences algorithm by Cantoni et al. [1-2]. The goal is to provide an effective and efficient implementation suitable for high-performance computing (HPC). The most significant contribution of this paper is a heuristic greedy variant of the algorithm, which reduces computational time by two orders of magnitude. A side effect of the new version is the ability to cope with uncertainty in the geometric description of the secondary structures.
References
Cantoni, V., Ferone, A., Ozbudak, O., Petrosino, A.: Structural analysis of protein secondary structure by GHT. In: 21st International Conference on Pattern Recognition, ICPR 2012, Tsukuba, Japan, November 11-15, pp. 1767–1770. IEEE Computer Society Press (2012)
Cantoni, V., Ferone, A., Ozbudak, O., Petrosino, A.: Motif Retrieval by Exhaustive Matching and Couple Co-occurrences. In: 9th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2012, Texas, July 12-14 (2012)
Ferretti, M., Musci, M.: Entire Motifs Search of Secondary Structures in Proteins: A Parallelization Study. In: International Workshop on Parallelism in Bioinformatics EUROMPI 2013, Madrid, Spain, September 17 (in press, 2013)
Protein Data Bank, http://www.rcsb.org/pdb
Ballard, D.: Generalizing the Hough Transform to Detect Arbitrary Shapes. Pattern Recognition 13(2), 111–122 (1981)
Kabsch, W., Sander, C.: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637 (1983)
Knuth, D.E.: Generating All Combinations and Partitions. In: The Art of Computer Programming, vol. 4, Fascicle 3, pp. 5–6. Addison-Wesley (2005)
Konc, J., Janežič, D.: An improved branch and bound algorithm for the maximum clique problem. MATCH Communications in Mathematical and in Computer Chemistry 58(3), 569–590 (2007)
Cantoni, V., Ferone, A., Ozbudak, O., Petrosino, A.: Protein motifs retrieval by SS terns occurrences. Pattern Recognition Letters 34, 559–563 (2012)
Structural Classification of Proteins and ASTRAL (January 2013), scop.berkeley.edu
CINECA supercomputing center (9th in top500.org as of May 2013), http://www.cineca.it
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Drago, G., Ferretti, M., Musci, M. (2013). CCMS: A Greedy Approach to Motif Extraction. In: Petrosino, A., Maddalena, L., Pala, P. (eds) New Trends in Image Analysis and Processing – ICIAP 2013. ICIAP 2013. Lecture Notes in Computer Science, vol 8158. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41190-8_39
DOI: https://doi.org/10.1007/978-3-642-41190-8_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41189-2
Online ISBN: 978-3-642-41190-8
eBook Packages: Computer Science, Computer Science (R0)