Abstract
Chinese information processing is a large and laborious engineering task, and Chinese word processing is both the foundation of the whole project and one of its most important aspects. This paper presents a word segmentation method based on special identifiers and implements a word segmentation system that combines the special identifier set with a modified two-character dictionary structure. The system is then compared with other word segmentation systems using test text from the SOUGOU training corpus.
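The abstract does not give the identifier set or the dictionary layout, so the following is only a minimal sketch of how the two ideas might fit together. It assumes that special identifiers are characters at which the text can always be cut (here, a hypothetical set of punctuation marks), and that the two-character dictionary groups words by their first two characters so that forward maximum matching only has to scan a small bucket. All names (`SPECIAL_IDENTIFIERS`, `build_two_char_index`, `segment`) and the toy dictionary are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical identifier set: punctuation only. The paper's real set is not
# stated in the abstract and is probably richer than this.
SPECIAL_IDENTIFIERS = set("，。！？；：、")


def build_two_char_index(words):
    """Group dictionary words by their first two characters (assumed layout)."""
    index = defaultdict(set)
    for word in words:
        if len(word) >= 2:
            index[word[:2]].add(word)
    return index


def split_on_identifiers(text):
    """Cut the text at every special identifier, keeping the identifier as a token."""
    chunks, buffer = [], []
    for ch in text:
        if ch in SPECIAL_IDENTIFIERS:
            if buffer:
                chunks.append("".join(buffer))
                buffer = []
            chunks.append(ch)
        else:
            buffer.append(ch)
    if buffer:
        chunks.append("".join(buffer))
    return chunks


def segment_chunk(chunk, index):
    """Forward maximum matching inside one chunk, using the two-character index."""
    tokens, i = [], 0
    while i < len(chunk):
        best = chunk[i]  # fall back to a single character
        for word in index.get(chunk[i:i + 2], ()):
            if len(word) > len(best) and chunk.startswith(word, i):
                best = word
        tokens.append(best)
        i += len(best)
    return tokens


def segment(text, index):
    """Split on identifiers first, then segment each remaining chunk."""
    tokens = []
    for chunk in split_on_identifiers(text):
        if chunk in SPECIAL_IDENTIFIERS:
            tokens.append(chunk)
        else:
            tokens.extend(segment_chunk(chunk, index))
    return tokens


# Toy dictionary, for illustration only.
INDEX = build_two_char_index(
    ["喜欢", "中文", "分词", "中文分词", "研究", "信息", "处理", "信息处理"]
)
print(segment("我喜欢中文分词，也研究信息处理。", INDEX))
# ['我', '喜欢', '中文分词', '，', '也', '研究', '信息处理', '。']
```

Cutting at identifiers first keeps each matching pass short, and the two-character key means a lookup touches only the words that share the current prefix. Whether the paper's system uses forward maximum matching specifically is not stated in the abstract; it is used here only because it is the simplest dictionary-based strategy.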
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qun, Z., Yu, C. (2011). Research on Chinese Word Segmentation Algorithm Based on Special Identifiers. In: Wu, Y. (eds) Computing and Intelligent Systems. ICCIC 2011. Communications in Computer and Information Science, vol 233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24010-2_51
DOI: https://doi.org/10.1007/978-3-642-24010-2_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24009-6
Online ISBN: 978-3-642-24010-2
eBook Packages: Computer Science, Computer Science (R0)