Prediction of Protein Subcellular Locations by Combining K-Local Hyperplane Distance Nearest Neighbor

  • Conference paper
Advanced Data Mining and Applications (ADMA 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4632)

Abstract

A huge number of protein sequences have been generated and collected, but the functions of most of them are still unknown. Knowing a protein's subcellular location is an important step toward elucidating its function, so a method that predicts the subcellular location of a protein from its amino acid sequence alone would be valuable. Although many efforts have been made to accomplish this task, further research is needed to improve prediction accuracy. In this paper, an ensemble classifier is proposed to predict the subcellular locations of proteins in eukaryotic cells, with the K-local Hyperplane Distance Nearest Neighbor algorithm (HKNN) as the base classifier. Each base HKNN classifier is constructed from a separate feature set, and their outputs are combined by majority voting. Results obtained through 5-fold cross-validation on the same protein dataset showed an improvement in prediction accuracy over existing algorithms.
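The scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the regularized local-hyperplane distance from the general HKNN formulation, and the function names (`hknn_predict`, `ensemble_predict`), the parameters `k` and `lam`, and the toy feature sets are all hypothetical placeholders. The abstract does not state which feature sets or parameter values the paper uses.

```python
from collections import Counter

import numpy as np


def hknn_predict(X_train, y_train, x, k=3, lam=1.0):
    """Label x by the class whose local hyperplane lies closest to it.

    k (neighbours per class) and lam (regularization) are assumed values.
    """
    best_label, best_dist = None, np.inf
    for label in np.unique(y_train):
        Xc = X_train[y_train == label]
        # k nearest neighbours of x inside this class (Euclidean distance).
        nearest = np.argsort(np.linalg.norm(Xc - x, axis=1))[:k]
        neighbours = Xc[nearest]
        centroid = neighbours.mean(axis=0)
        V = (neighbours - centroid).T   # columns span the local hyperplane
        r = x - centroid
        # Minimize ||r - V a||^2 + lam * ||a||^2 over the coefficients a.
        gram = V.T @ V + lam * np.eye(V.shape[1])
        alpha = np.linalg.solve(gram, V.T @ r)
        dist = np.sum((r - V @ alpha) ** 2) + lam * np.sum(alpha ** 2)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def ensemble_predict(train_views, y_train, query_views, k=3, lam=1.0):
    """One HKNN per feature set, combined by majority vote."""
    votes = [
        hknn_predict(Xv, y_train, xv, k, lam)
        for Xv, xv in zip(train_views, query_views)
    ]
    # On a tie, Counter returns the label that was voted for first.
    return Counter(votes).most_common(1)[0][0]


# Made-up toy data: three feature sets ("views") for the same 8 samples.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
views_train = [
    np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [5, 5], [5, 6], [6, 5], [6, 6]], dtype=float),
    np.array([[0.0], [0.1], [0.2], [0.3],
              [1.0], [1.1], [1.2], [1.3]]),
    np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [3, 3], [3, 4], [4, 3], [4, 4]], dtype=float),
]
# The same query protein described in each of the three feature sets.
query_views = [np.array([0.5, 0.4]), np.array([0.15]), np.array([3.5, 3.5])]

print(ensemble_predict(views_train, y, query_views))
```

In this toy setup the third feature set is deliberately misleading for the query, so the example shows how the vote of the other two base classifiers decides the final label. A 5-fold cross-validation estimate would repeat this prediction over held-out folds of the labelled data.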

Supported by the National Science Foundation of China under grant No. 60603007 and the Science and Technology Development Foundation of Shandong Province, China, under grant No. 2006GG2201005.

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Liu, H., Feng, H., Zhu, D. (2007). Prediction of Protein Subcellular Locations by Combining K-Local Hyperplane Distance Nearest Neighbor. In: Alhajj, R., Gao, H., Li, J., Li, X., Zaïane, O.R. (eds) Advanced Data Mining and Applications. ADMA 2007. Lecture Notes in Computer Science, vol 4632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73871-8_32

  • DOI: https://doi.org/10.1007/978-3-540-73871-8_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73870-1

  • Online ISBN: 978-3-540-73871-8

  • eBook Packages: Computer Science, Computer Science (R0)
