A Method of Text Classification Combining Naive Bayes and the Similarity Computing Algorithms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9461)

Abstract

Text classification is one of the central problems in big data analysis and research. At present, however, there is no universal algorithm model that satisfies the requirements of both accuracy and efficiency in text classification. This paper proposes a text classification method that combines the Naive Bayes algorithm with a similarity computing algorithm. First, the Paoding Analyzer segments the text into several word vectors; next, the Bayesian algorithm performs the first-level directory classification of the text; then an improved similarity computing algorithm performs the second-level directory classification. Finally, the algorithm model is tested on real data, and its results are compared with those of the Bayesian algorithm alone and the similarity computing algorithm alone. The results show that the proposed method achieves a higher precision rate.
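The two-level pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: whitespace splitting stands in for the Paoding Analyzer, plain cosine similarity against per-subcategory term counts stands in for the paper's improved similarity computing algorithm (which the abstract does not specify), and all function names and the toy training samples are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace splitting stands in for a real word segmenter
    # such as the Paoding Analyzer.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)  # label -> term counts
        self.class_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(n_docs / total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cosine(a, b):
    # Plain cosine similarity between two term-count vectors (assumption:
    # the paper's improved similarity measure is not reproduced here).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def train(samples):
    """samples: list of (text, first_level_label, second_level_label)."""
    docs = [tokenize(text) for text, _, _ in samples]
    nb = NaiveBayes().fit(docs, [first for _, first, _ in samples])
    # Accumulate one term-count vector per (first level, second level) pair.
    centroids = defaultdict(lambda: defaultdict(Counter))
    for tokens, (_, first, second) in zip(docs, samples):
        centroids[first][second].update(tokens)
    return nb, centroids


def classify(text, nb, centroids):
    tokens = tokenize(text)
    first = nb.predict(tokens)      # first-level directory: Naive Bayes
    vec = Counter(tokens)
    second = max(                   # second-level directory: similarity
        centroids[first], key=lambda s: cosine(vec, centroids[first][s])
    )
    return first, second


# Hypothetical toy data, for illustration only.
samples = [
    ("football match goal striker", "sports", "football"),
    ("basketball dunk court rebound", "sports", "basketball"),
    ("stock market shares investor", "finance", "stocks"),
    ("bank loan interest mortgage", "finance", "banking"),
]
nb, centroids = train(samples)
print(classify("striker scored a goal in the match", nb, centroids))
```

Restricting the similarity search to the subcategories of the predicted first-level class keeps the second stage small, which is one plausible reading of how such a combination balances accuracy against efficiency.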

Acknowledgements

This work was supported by the Science and Technology Planning Project of Guangdong Province, China (Nos. 2015A030401101 and 2012B040500034).

Author information

Corresponding author

Correspondence to Yinghan Hong.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hong, Y., Mai, G., Zeng, H., Guo, C. (2015). A Method of Text Classification Combining Naive Bayes and the Similarity Computing Algorithms. In: Cai, R., Chen, K., Hong, L., Yang, X., Zhang, R., Zou, L. (eds) Web Technologies and Applications. APWeb 2015. Lecture Notes in Computer Science, vol 9461. Springer, Cham. https://doi.org/10.1007/978-3-319-28121-6_1

  • DOI: https://doi.org/10.1007/978-3-319-28121-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28120-9

  • Online ISBN: 978-3-319-28121-6

  • eBook Packages: Computer Science, Computer Science (R0)
