
Reliable Retrieval of Top-k Tags

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10569)

Abstract

Collaborative tagging systems, such as Flickr and Del.icio.us, allow users to provide keyword labels, or tags, for various Internet resources (e.g., photos, songs, and bookmarks). These tags provide a rich source of information and have been used in important applications such as resource search and webpage clustering. However, tags are provided by casual users, so their quality cannot be guaranteed. In this paper, we examine the following question: given a resource r and a set of user-provided tags associated with r, can r be correctly described by its k most frequent tags? To answer this question, we develop a metric, top-k sliding average similarity (top-k SAS), which measures the reliability of the k most frequent tags. A threshold is then set to decide whether this reliability is sufficient for retrieving the top-k tags. Our experiments on real datasets show that threshold-based evaluation on top-k SAS is effective and efficient in determining whether the k most frequent tags can be considered high-quality top-k tags for r.

Our experiments also indicate that setting an appropriate threshold is challenging: the threshold-based strategy is sensitive to small changes in the threshold. To solve this problem, we introduce a parameter-free evaluation strategy that uses machine learning models to estimate whether the k most frequent tags qualify as the top-k tags. Experimental results demonstrate that the learning-based method achieves performance comparable to the threshold-based method while overcoming the difficulty of setting a threshold.
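The abstract does not give the formal definition of top-k SAS, so the following is only an illustrative sketch of the threshold-based idea, not the authors' method. It assumes, purely for illustration, that similarity between two top-k lists is their Jaccard overlap and that the "sliding average" is taken over the last `window` tag arrivals; the function names, the tie-breaking rule, and the parameters `window` and `threshold` are all hypothetical.

```python
from collections import Counter


def top_k(tags, k):
    """Return the k most frequent tags (ties broken alphabetically)."""
    counts = Counter(tags)
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return ranked[:k]


def jaccard(a, b):
    """Jaccard overlap of two tag lists (an assumed similarity measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sliding_average_similarity(tag_stream, k, window):
    """Average similarity between consecutive top-k lists over the
    last `window` tag arrivals (an assumed reading of 'sliding average')."""
    n = len(tag_stream)
    if not 1 <= window < n:
        raise ValueError("window must satisfy 1 <= window < len(tag_stream)")
    sims = []
    for i in range(n - window, n):
        before = top_k(tag_stream[:i], k)      # top-k before the i-th arrival
        after = top_k(tag_stream[:i + 1], k)   # top-k after the i-th arrival
        sims.append(jaccard(before, after))
    return sum(sims) / len(sims)


def is_reliable(tag_stream, k, window, threshold):
    """Threshold test: accept the top-k tags only if they have stabilized."""
    return sliding_average_similarity(tag_stream, k, window) >= threshold


# A stream whose top-2 tags never change, and one whose top-1 tag keeps shifting.
stable = ["cat", "dog"] * 5
unstable = ["a", "b", "b", "c", "c", "c", "d", "d", "d", "d"]
print(is_reliable(stable, k=2, window=3, threshold=0.9))
print(is_reliable(unstable, k=1, window=5, threshold=0.9))
```

Under these assumptions, a resource whose most frequent tags stop changing as new tags arrive scores close to 1, while one whose ranking is still shifting scores lower. The sensitivity noted in the abstract shows up in the choice of `threshold`: a small change can flip the decision for borderline resources.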



Acknowledgement

Xu Yong, Reynold Cheng, and Yudian Zheng were supported by the Research Grants Council of Hong Kong (RGC Projects HKU 17229116 and 17205115) and the University of Hong Kong (Projects 102009508, 104004129, and 201611159247). We would like to thank the reviewers for their insightful comments. We would also like to thank Prof. Wang-Chien Lee (The Pennsylvania State University) for his valuable advice on the initial solution.

Author information

Corresponding author

Correspondence to Reynold Cheng.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Xu, Y., Cheng, R., Zheng, Y. (2017). Reliable Retrieval of Top-k Tags. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10569. Springer, Cham. https://doi.org/10.1007/978-3-319-68783-4_23

  • DOI: https://doi.org/10.1007/978-3-319-68783-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68782-7

  • Online ISBN: 978-3-319-68783-4

  • eBook Packages: Computer Science (R0)
