Abstract
Tagging has recently become a flexible and important way to share and categorize web resources. However, tag ambiguity and the sheer number of tags limit its value for resource sharing and navigation. Tag clustering can help alleviate these problems by grouping related tags. In this paper, we introduce a link-based method that measures the relevance between tags using random walks on graphs. We also propose a new clustering method that can address several challenges in tag clustering. Experimental results on del.icio.us data show that our methods achieve good accuracy and acceptable performance on tag clustering.
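To make the idea of link-based relevance concrete, the following is a minimal sketch of one generic way a random walk on a tag-resource graph can score how related two tags are. It is not the method of this paper, whose details are not given in the abstract. The function name `two_step_walk_relevance`, the two-step walk itself, and the sample annotations are all assumptions made for illustration.

```python
from collections import defaultdict


def two_step_walk_relevance(annotations):
    """Score tag-to-tag relevance with a two-step random walk.

    `annotations` is an iterable of (tag, resource) pairs. Starting at tag a,
    the walker moves to a uniformly chosen resource annotated with a, then to
    a uniformly chosen tag of that resource. rel[a][b] is the probability
    that the walk ends at tag b. (Illustrative assumption, not the paper's
    actual relevance measure.)
    """
    tag_to_resources = defaultdict(set)
    resource_to_tags = defaultdict(set)
    for tag, resource in annotations:
        tag_to_resources[tag].add(resource)
        resource_to_tags[resource].add(tag)

    rel = {}
    for a, resources in tag_to_resources.items():
        row = defaultdict(float)
        for r in resources:
            tags = resource_to_tags[r]
            for b in tags:
                # P(pick resource r from a) * P(pick tag b from r)
                row[b] += 1.0 / (len(resources) * len(tags))
        rel[a] = dict(row)
    return rel


# Hypothetical annotations: (tag, resource) pairs.
annotations = [
    ("python", "r1"), ("programming", "r1"),
    ("python", "r2"), ("programming", "r2"), ("code", "r2"),
    ("travel", "r3"), ("hotel", "r3"),
]
rel = two_step_walk_relevance(annotations)
for other, score in sorted(rel["python"].items()):
    print(f"python -> {other}: {score:.3f}")
```

In this toy data, "python" and "programming" share both resources and so score higher than "python" and "code", while "travel" is unreachable from "python" and gets no score. A clustering step could then group tags whose mutual scores are high.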
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cui, J., Li, P., Liu, H., He, J., Du, X. (2009). A Neighborhood Search Method for Link-Based Tag Clustering. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science, vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_12
DOI: https://doi.org/10.1007/978-3-642-03348-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03347-6
Online ISBN: 978-3-642-03348-3
eBook Packages: Computer Science (R0)