Abstract
Collaborative tagging systems, such as Flickr and Del.icio.us, allow users to provide keyword labels, or tags, for various Internet resources (e.g., photos, songs, and bookmarks). These tags provide a rich source of information and have been used in important applications such as resource searching and webpage clustering. However, tags are provided by casual users, so their quality cannot be guaranteed. In this paper, we examine the following question: given a resource r and a set of user-provided tags associated with r, can r be correctly described by the k most frequent tags? To answer this question, we develop a metric, top-k sliding average similarity (top-k SAS), which measures the reliability of the k most frequent tags. A threshold is then set to estimate whether the reliability is sufficient for retrieving the top-k tags. Our experiments on real datasets show that threshold-based evaluation on top-k SAS is an effective and efficient way to determine whether the k most frequent tags can be considered high-quality top-k tags for r.
Experiments also indicate that setting an appropriate threshold is challenging: the threshold-based strategy is sensitive to small changes in the threshold. To solve this problem, we introduce a parameter-free evaluation strategy that uses machine learning models to estimate whether the k most frequent tags qualify as the top-k tags. Experimental results demonstrate that the learning-based method achieves performance comparable to the threshold-based method, while overcoming the difficulty of setting a threshold.
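The abstract does not give the definition of top-k SAS, so the following is only an illustrative sketch of what a threshold-based stability check on the k most frequent tags could look like, not the authors' metric. It assumes that tags arrive as a time-ordered list, that similarity between two top-k sets is measured with Jaccard similarity, and that the average is taken over the last `window` tag arrivals. The function names, the `window` parameter, and the threshold value are all hypothetical.

```python
from collections import Counter


def top_k(tags, k):
    """Return the set of the k most frequent tags.

    Ties are broken by first appearance: Counter keeps insertion order
    and sorted() is stable.
    """
    counts = Counter(tags)
    ranked = sorted(counts.items(), key=lambda item: -item[1])
    return {tag for tag, _ in ranked[:k]}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sliding_average_similarity(tags, k, window):
    """Average similarity of consecutive top-k sets over the last `window` arrivals.

    Assumption: this stands in for top-k SAS, whose actual definition
    is not stated in the abstract.
    """
    n = len(tags)
    if window < 1 or n <= window:
        raise ValueError("need 1 <= window < len(tags)")
    sims = [
        jaccard(top_k(tags[:i], k), top_k(tags[:i + 1], k))
        for i in range(n - window, n)
    ]
    return sum(sims) / window


def is_reliable(tags, k, window, threshold):
    """Threshold-based decision: are the current top-k tags stable enough?"""
    return sliding_average_similarity(tags, k, window) >= threshold


# Hypothetical tag streams for one resource, in posting order.
stable = ["cat"] * 5 + ["pet"] * 4 + ["cute"] * 3
shifting = ["a", "b", "b", "c", "c", "c"]

stable_score = sliding_average_similarity(stable, k=2, window=3)
shifting_score = sliding_average_similarity(shifting, k=1, window=3)
```

In this sketch a score near 1 means the top-k set has stopped changing as new tags arrive. Because the decision flips as soon as the score crosses `threshold`, the example also hints at why the choice of threshold matters.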
Acknowledgement
Xu Yong, Reynold Cheng, and Yudian Zheng were supported by the Research Grants Council of Hong Kong (RGC Projects HKU 17229116 and 17205115) and the University of Hong Kong (Projects 102009508, 104004129, and 201611159247). We would like to thank the reviewers for their insightful comments. We would also like to thank Prof. Wang-Chien Lee (The Pennsylvania State University) for his valuable advice on the initial solution.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Xu, Y., Cheng, R., Zheng, Y. (2017). Reliable Retrieval of Top-k Tags. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10569. Springer, Cham. https://doi.org/10.1007/978-3-319-68783-4_23
DOI: https://doi.org/10.1007/978-3-319-68783-4_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68782-7
Online ISBN: 978-3-319-68783-4
eBook Packages: Computer Science, Computer Science (R0)