Abstract
This article describes a method that uses context information terms for text clustering on an oral conversation corpus. We tested the method with various distance measures in experiments with the self-organizing map (SOM) algorithm and the K-means algorithm. The experimental results show that context information terms affect text clustering because of their high occurrence frequency. We also found that the Hamming distance is the best choice of distance measure for the SOM algorithm.
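As an illustration of the distance measure named above, the following is a minimal sketch of a Hamming distance between binary term-presence vectors, and of how a SOM-style best-matching unit could be chosen with it. It is not the authors' implementation. It assumes the Hamming distance is defined as the fraction of coordinates that differ, and the vocabulary, the document vector, the unit vectors, and the function names are all hypothetical.

```python
def hamming_distance(a, b):
    """Fraction of positions at which two equal-length vectors differ (assumed definition)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_matching_unit(doc, units):
    """Index of the map unit closest to doc under the Hamming distance."""
    return min(range(len(units)), key=lambda i: hamming_distance(doc, units[i]))


# Hypothetical vocabulary; each vector marks which terms occur (1) or not (0).
vocabulary = ["price", "ticket", "train", "uh", "well"]
doc = [1, 1, 0, 1, 1]

# Hypothetical binary weight vectors for three map units.
units = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1],
]

distances = [hamming_distance(doc, u) for u in units]
bmu = best_matching_unit(doc, units)
print(distances, bmu)
```

In this toy setup the document differs from the second unit in one of five positions, so that unit is selected. A full SOM would also update the winning unit and its neighbours after each match, which this sketch omits.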
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, D., Jiang, M. (2012). Text Clustering on Oral Conversation Corpus. In: Ding, W., Jiang, H., Ali, M., Li, M. (eds) Modern Advances in Intelligent Systems and Tools. Studies in Computational Intelligence, vol 431. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30732-4_22
DOI: https://doi.org/10.1007/978-3-642-30732-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30731-7
Online ISBN: 978-3-642-30732-4
eBook Packages: Engineering, Engineering (R0)