An Improved Feature Selection for Categorization Based on Mutual Information

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5854))

Abstract

Feature reduction is one of the core techniques in text categorization. However, feature weighting methods based on mutual information (MI) do not consider how the position of a term in the text affects its ability to discriminate between categories. In this paper, we propose an improved feature selection method based on MI. By adding correction parameters for different text positions, we make more effective use of the feature information. Experimental results show that the method improves the accuracy of text classification.
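The abstract does not give the paper's formula, so the following is only a minimal sketch of one way a position correction could enter an MI score. It assumes the standard pointwise form MI(t, c) = log(P(t, c) / (P(t) P(c))) and estimates the probabilities from term occurrences that are counted with a per-position weight. The function name `position_weighted_mi`, the position labels `title` and `body`, and the weight values are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def position_weighted_mi(docs, weights):
    """Score (term, label) pairs with a position-weighted pointwise MI.

    docs:    list of (label, {position: [tokens]}) pairs.
    weights: {position: weight}; positions not listed count with weight 1.0.
             These weights are illustrative assumptions, not the paper's values.
    """
    joint = Counter()       # weighted occurrences of (term, label)
    term_mass = Counter()   # weighted occurrences of term over all labels
    label_mass = Counter()  # weighted occurrences of all terms in a label
    total = 0.0

    for label, fields in docs:
        for position, tokens in fields.items():
            w = weights.get(position, 1.0)
            for tok in tokens:
                joint[(tok, label)] += w
                term_mass[tok] += w
                label_mass[label] += w
                total += w

    # log( P(t, c) / (P(t) * P(c)) ) with probabilities = weighted mass / total
    return {
        (tok, label): math.log(n_tc * total / (term_mass[tok] * label_mass[label]))
        for (tok, label), n_tc in joint.items()
    }


# Toy corpus: "cup" is a title word in one class and a body word in the other.
docs = [
    ("sports", {"title": ["cup"], "body": ["bank", "team"]}),
    ("finance", {"title": ["bank"], "body": ["cup", "rate"]}),
]

uniform = position_weighted_mi(docs, {})                    # every position counts 1.0
weighted = position_weighted_mi(docs, {"title": 3.0, "body": 1.0})
```

With uniform weights, "cup" occurs equally often in both classes and its score for "sports" is zero. With the hypothetical title weight, the same term scores positively for the class where it appears in the title and negatively for the other, which is the kind of effect a position correction is meant to capture.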

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, H., Su, Z., Yao, Z., Liu, S. (2009). An Improved Feature Selection for Categorization Based on Mutual Information. In: Liu, W., Luo, X., Wang, F.L., Lei, J. (eds) Web Information Systems and Mining. WISM 2009. Lecture Notes in Computer Science, vol 5854. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05250-7_9

  • DOI: https://doi.org/10.1007/978-3-642-05250-7_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05249-1

  • Online ISBN: 978-3-642-05250-7

  • eBook Packages: Computer Science, Computer Science (R0)