Reliable Retrieval of Top-k Tags

Xu, Yong; Cheng, Reynold; Zheng, Yudian

doi:10.1007/978-3-319-68783-4_23

Conference paper
First Online: 04 October 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10569)

Abstract

Collaborative tagging systems, such as Flickr and Del.icio.us, allow users to provide keyword labels, or tags, for various Internet resources (e.g., photos, songs, and bookmarks). These tags are a rich source of information and have been used in important applications such as resource searching and webpage clustering. However, tags are provided by casual users, so their quality cannot be guaranteed. In this paper, we examine the following question: given a resource r and a set of user-provided tags associated with r, can r be correctly described by the k most frequent tags? To answer this question, we develop the metric top-k sliding average similarity (top-k SAS), which measures the reliability of the k most frequent tags. A threshold is then set to decide whether the reliability is sufficient for retrieving the top-k tags. Our experiments on real datasets show that the threshold-based evaluation on top-k SAS is effective and efficient in determining whether the k most frequent tags can be considered high-quality top-k tags for r.

Experiments also indicate that setting an appropriate threshold is challenging: the threshold-based strategy is sensitive to small changes in the threshold. To solve this problem, we introduce a parameter-free evaluation strategy that uses machine learning models to estimate whether the k most frequent tags qualify as the top-k tags. Experimental results demonstrate that the learning-based method achieves performance comparable to the threshold-based method, while overcoming the difficulty of setting a threshold.
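
The abstract does not give the definition of top-k SAS, so the following is only an illustrative sketch of what a threshold-based check on the k most frequent tags might look like. It assumes a stand-in stability score: the average Jaccard similarity between the top-k tag set before and after each of the most recent tag arrivals. The function names, the window parameter, the choice of Jaccard similarity, and the tie-breaking rule are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def top_k_tags(tags, k):
    """Return the k most frequent tags (ties broken alphabetically)."""
    counts = Counter(tags)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sliding_average_similarity(tags, k, window):
    """Hypothetical stand-in score, NOT the paper's top-k SAS.

    Averages the Jaccard similarity between the top-k set of each prefix
    and the top-k set of the prefix one tag longer, over (at most) the
    last `window` tag arrivals.
    """
    n = len(tags)
    if n < 2:
        return 0.0
    start = max(1, n - window)
    sims = [
        jaccard(top_k_tags(tags[:i], k), top_k_tags(tags[:i + 1], k))
        for i in range(start, n)
    ]
    return sum(sims) / len(sims)


def is_reliable(tags, k, window, threshold):
    """Threshold-based decision on the assumed stability score."""
    return sliding_average_similarity(tags, k, window) >= threshold


if __name__ == "__main__":
    stable = ["cat", "cat", "pet", "cat", "pet", "animal", "cat", "pet"]
    print(top_k_tags(stable, 2))
    print(is_reliable(stable, k=2, window=3, threshold=0.9))
```

Because the decision is a single comparison against the threshold, a score that sits close to the threshold can flip the outcome under a small change in that value, which is the sensitivity the abstract describes.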

Acknowledgement

Yong Xu, Reynold Cheng, and Yudian Zheng were supported by the Research Grants Council of Hong Kong (RGC Projects HKU 17229116 and 17205115) and the University of Hong Kong (Projects 102009508, 104004129, and 201611159247). We would like to thank the reviewers for their insightful comments. We would also like to thank Prof. Wang-Chien Lee (The Pennsylvania State University) for his valuable advice on the initial solution.

Author information

Authors and Affiliations

The University of Hong Kong, Pok Fu Lam, Hong Kong
Yong Xu, Reynold Cheng & Yudian Zheng

Corresponding author

Correspondence to Reynold Cheng.

Editor information

Editors and Affiliations

University of Sydney, Darlington, NSW, Australia
Athman Bouguettaya
Zhejiang University, Hangzhou, China
Yunjun Gao
Institute of Computing for Physics and Technology, Protvino, Russia
Andrey Klimenko
Nanyang Technological University, Singapore, Singapore
Lu Chen
King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Xiangliang Zhang
Institute of Computing for Physics and Technology, Protvino, Russia
Fedor Dzerzhinskiy
Shanghai Jiao Tong University, Minhang Qu, China
Weijia Jia
Institute of Computing for Physics and Technology, Protvino, Russia
Stanislav V. Klimenko
City University of Hong Kong, Kowloon, Hong Kong
Qing Li

Cite this paper

Xu, Y., Cheng, R., Zheng, Y. (2017). Reliable Retrieval of Top-k Tags. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10569. Springer, Cham. https://doi.org/10.1007/978-3-319-68783-4_23

DOI: https://doi.org/10.1007/978-3-319-68783-4_23
Published: 04 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68782-7
Online ISBN: 978-3-319-68783-4
eBook Packages: Computer Science, Computer Science (R0)
