Abstract
With the rapid development of the Internet, a large number of new words have emerged and are widely used on social networks. Traditional segmentation algorithms cannot identify these new words efficiently, which greatly reduces the accuracy of hot-word and keyword extraction and, in turn, degrades the performance of network public opinion monitoring systems. In this paper, we use tweets collected from Twitter as the experimental data set. We compute frequency statistics over k-gram strings to find candidate new words, and then confirm each candidate by its frequency in practical use, measured with Twitter's search function. Experiments show that this segmentation algorithm identifies new keywords effectively and is better suited to public opinion monitoring systems.
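The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes character-level k-grams, and the names `kgram_counts`, `candidate_new_words`, `confirm_candidates`, the thresholds `min_count` and `min_usage`, and the `usage_count` callback (a stand-in for a lookup through Twitter's search function) are all hypothetical.

```python
from collections import Counter


def kgram_counts(texts, k_min=2, k_max=4):
    """Count every character k-gram (k_min <= k <= k_max) across all texts.

    Assumption: k-grams are taken over characters and never span whitespace.
    """
    counts = Counter()
    for text in texts:
        for k in range(k_min, k_max + 1):
            for i in range(len(text) - k + 1):
                gram = text[i:i + k]
                if not any(ch.isspace() for ch in gram):
                    counts[gram] += 1
    return counts


def candidate_new_words(texts, known_words, min_count=3, k_min=2, k_max=4):
    """Stage 1: keep frequent k-grams that are absent from a known dictionary."""
    counts = kgram_counts(texts, k_min, k_max)
    return {
        gram: count
        for gram, count in counts.items()
        if count >= min_count and gram not in known_words
    }


def confirm_candidates(candidates, usage_count, min_usage=100):
    """Stage 2: keep candidates whose real-world usage passes a threshold.

    `usage_count` is a hypothetical callable mapping a string to how often it
    appears in practice; the paper obtains this via Twitter's search function.
    """
    return sorted(g for g in candidates if usage_count(g) >= min_usage)
```

In this sketch a caller would pass the collected tweets and a dictionary of known words to `candidate_new_words`, then supply its own `usage_count` function to `confirm_candidates`; both thresholds would need tuning against real data.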
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xiaoyan, W., kai, X., Ying, S., Jian-long, T., Li, G. (2013). Research of New Words Identification in Social Network for Monitoring Public Opinion. In: Yuan, Y., Wu, X., Lu, Y. (eds) Trustworthy Computing and Services. ISCTCS 2012. Communications in Computer and Information Science, vol 320. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35795-4_75
DOI: https://doi.org/10.1007/978-3-642-35795-4_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35794-7
Online ISBN: 978-3-642-35795-4
eBook Packages: Computer Science, Computer Science (R0)