In & out zooming on time-aware user/tag clusters

Journal of Intelligent Information Systems

Abstract

Most approaches to analyzing social tagging systems share a common goal: addressing the information challenge that emerges from the massive activity of millions of users who interact and share resources and/or metadata online. However, leaving time-related data out of the analysis implicitly ignores much of the dynamic nature of social tagging activity. In this paper we claim that including a temporal dimension allows for tracking users’ interests at macroscopic and microscopic levels, detecting emerging trends, and recognizing events. To this end, we propose a time-aware co-clustering approach for extracting semantic and temporal patterns from tagging activity. The resulting clusters contain both users and tags with similar patterns over time, and reveal non-obvious or “hidden” relations among users and topics of their common interest. Zoom in & out views serve as visualization methods for different aspects of the clusters’ structure, in order to evaluate the efficiency of the approach.
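To make the temporal side of the idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that tagging activity is binned into fixed-length time frames (the notes mention frames of 10 days and of 1 day) and that a user and a tag are compared by the cosine similarity of their per-frame activity counts. The function names, the toy timestamps, and the choice of cosine similarity are all assumptions made for illustration; the semantic similarity and the co-clustering step itself are not shown.

```python
from collections import Counter
from datetime import datetime
from math import sqrt


def frame_index(ts, start, frame_days):
    """Index of the time frame that contains timestamp ts."""
    return (ts - start).days // frame_days


def activity_vector(timestamps, start, frame_days, n_frames):
    """Count tagging events per time frame (frames outside the range are ignored)."""
    counts = Counter(frame_index(ts, start, frame_days) for ts in timestamps)
    return [counts.get(i, 0) for i in range(n_frames)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data (invented for illustration): when a user tagged and when a tag was used.
start = datetime(2009, 1, 1)
user_events = [datetime(2009, 1, 1), datetime(2009, 1, 4), datetime(2009, 1, 13)]
tag_events = [datetime(2009, 1, 2), datetime(2009, 1, 14)]

# Zoomed-out view: two frames of 10 days each.
user_coarse = activity_vector(user_events, start, frame_days=10, n_frames=2)
tag_coarse = activity_vector(tag_events, start, frame_days=10, n_frames=2)

# Zoomed-in view: twenty frames of 1 day each.
user_fine = activity_vector(user_events, start, frame_days=1, n_frames=20)
tag_fine = activity_vector(tag_events, start, frame_days=1, n_frames=20)

print(cosine(user_coarse, tag_coarse), cosine(user_fine, tag_fine))
```

With this toy data the user and the tag look alike at the 10-day scale (their activity falls into the same frames) but share no active frame at the 1-day scale, which is the kind of difference that switching between time scales is meant to expose.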

Notes

  1. As tag similarity refers mostly to tags’ conceptual similarity, we will refer to tag-based similarity as semantic similarity throughout the article.

  2. The potential applications of tracking such clusters are discussed in the next section.

  3. This is not depicted in Fig. 3 for the sake of clarity.

  4. At this particular time scale, each time frame corresponds to a duration of 10 days.

  5. At this particular time scale, each time frame corresponds to a duration of 1 day.

  6. Although the approach of Becker et al. (2010) is more closely related to ours than the others presented, the two methods cannot be compared directly, since theirs mines resources, whereas ours groups together tags and users.

References

  • Allan, J. (2002). Introduction to topic detection and tracking. In Topic detection and tracking: Event-based information organization (pp. 1–16). Norwell: Kluwer Academic.

  • Andrews, D. F. (1972). Plots of high-dimensional data. In Biometrics (Vol. 28, pp. 125–136). Alexandria: International Biometric Society.

  • Angeletou, S., Sabou, M., & Motta, E. (2008). Semantically enriching folksonomies with FLOR. In Proceedings of the 5th ESWC workshop: Collective Intelligence and the Semantic Web.

  • Becker, H., Naaman, M., & Gravano, L. (2010). Learning similarity metrics for event identification in social media. In WSDM ’10: Proceedings of the third ACM international conference on Web search and data mining (pp. 291–300). New York: ACM.

  • Begelman, G., Keller, P., & Smadja, F. (2006). Automated tag clustering: Improving search and exploration in the tag space. In Proceedings of the collaborative Web tagging workshop, 15th international World Wide Web conference (WWW’06) (pp. 89–98). Edinburgh, Scotland.

  • Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30, 107–117.

  • Dhillon, I. S. (2001). Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining. San Francisco: ACM.

  • Dubinko, M., Kumar, R., Magnani, J., Novak, J., Raghavan, P., & Tomkins, A. (2006). Visualizing tags over time. In Proceedings of the 15th international conference on World Wide Web (pp. 193–202). Edinburgh: ACM.

  • Fayyad, U. M., & Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In IJCAI’93 (pp. 1022–1029).

  • Fellbaum, C. (1998). WordNet, an electronic lexical database. Cambridge: MIT Press.

  • Giannakidou, E., Koutsonikola, V., Vakali, A., & Kompatsiaris, I. (2008). Co-clustering tags and social data sources. In Proceedings of the 9th international conference on Web-age information management, China (pp. 317–324).

  • Heymann, P., Koutrika, G., & Garcia-Molina, H. (2007). Fighting spam on social web sites: A survey of approaches and future challenges. IEEE Internet Computing, 11(6), 36–45.

  • Hotho, A., Jäschke, R., Schmitz, C., & Stumme, G. (2006a). Information retrieval in folksonomies: Search and ranking. In Proceedings of the 3rd European Semantic Web conference, LNCS (Vol. 4011, pp. 411–426). Budva: Springer.

  • Hotho, A., Jäschke, R., Schmitz, C., & Stumme, G. (2006b). Trend detection in folksonomies. In Proceedings of the 1st international conference on semantics and digital media technology (Vol. 4306, pp. 56–70). Athens, Greece.

  • Kleinberg, J. (2006). Temporal dynamics of on-line information streams. In M. Garofalakis, J. Gehrke, & R. Rastogi (Eds.), Data stream management: Processing high-speed data streams. Springer.

  • Koutsonikola, V., Petridou, S., Vakali, A., Hacid, H., & Benatallah, B. (2008). Correlating time-related data sources with co-clustering. In Proceedings of the 9th international conference on Web information systems engineering (pp. 264–279). Auckland: Springer.

  • Koutsonikola, V., Vakali, A., Giannakidou, E., & Kompatsiaris, I. (2009). Clustering of social tagging system users: A topic and time based approach. In 10th int. conf. WISE (Vol. 5802, pp. 75–86). Berlin: Springer.

  • Kulldorff, M. (1999). Spatial scan statistics: Models, calculations and applications. In J. Glaz & N. Balakrishnan (Eds.), Recent advances on scan statistics and applications (pp. 303–322).

  • Larsen, B., & Aone, C. (1999). Fast and effective text mining using linear-time document clustering. In KDD ’99: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining (pp. 16–22). New York: ACM.

  • Nanopoulos, A., Gabriel, H., & Spiliopoulou, M. (2009). Spectral clustering in social-tagging systems. In 10th int. conf. on Web information systems engineering (pp. 87–100).

  • Petridou, S. G., Koutsonikola, V. A., Vakali, A. I., & Papadimitriou, G. I. (2008). Time-aware web users’ clustering. IEEE Transactions on Knowledge and Data Engineering, 20, 653–667.

  • Porter, M. F. (1997). An algorithm for suffix stripping. In Readings in information retrieval (pp. 313–316). San Francisco: Morgan Kaufmann Publishers Inc.

  • Rattenbury, T., Good, N., & Naaman, M. (2007). Towards automatic extraction of event and place semantics from Flickr tags. In Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval (pp. 103–110). New York: ACM.

  • Richeldi, M., & Rossotto, M. (1995). Class-driven statistical discretization of continuous attributes (extended abstract). In ECML’95 (pp. 335–338).

  • Russell, T. (2006). Cloudalicious: Folksonomy over time. In Proceedings of the 6th ACM/IEEE-CS joint conference on digital libraries (pp. 364–364). Chapel Hill: ACM.

  • Shepitsen, A., Gemmell, J., Mobasher, B., & Burke, R. (2008). Personalized recommendation in social tagging systems using hierarchical clustering. In Proceedings of the 2008 ACM conference on recommender systems, RecSys ’08 (pp. 259–266). Lausanne: ACM.

  • Sigurbjörnsson, B., & van Zwol, R. (2008). Flickr tag recommendation based on collective knowledge. In Proceeding of the 17th international conference on World Wide Web (pp. 327–336). Beijing: ACM.

  • Specia, L., & Motta, E. (2007). Integrating folksonomies with the semantic web. In 4th ESWC (pp. 624–639). Austria.

  • Sun, A., Zeng, D., Li, H., & Zheng, X. (2008). Discovering trends in collaborative tagging systems. In Proceedings of the IEEE ISI 2008 PAISI, PACCF, and SOCO international workshops on intelligence and security informatics (pp. 377–383). Berlin: Springer.

  • Swan, R., & Allan, J. (1999). Extracting significant time varying features from text. In Proceedings of the eighth international conference on information and knowledge management (pp. 38–45). New York: ACM.

  • Theodosiou, T., Angelis, L., Vakali, A., & Thomopoulos, G. (2007). Gene functional annotation by statistical analysis of biomedical articles. International Journal of Medical Informatics, 76(8), 601–613.

  • Vlachos, M., Meek, C., Vagena, Z., & Gunopulos, D. (2004). Identifying similarities, periodicities and bursts for online search queries. In SIGMOD ’04: Proceedings of the 2004 ACM SIGMOD international conference on management of data (pp. 131–142). New York: ACM.

  • Wetzker, R., Plumbaum, T., Korth, A., Bauckhage, C., Alpcan, T., & Metze, F. (2008a). Detecting trends in social bookmarking systems using a probabilistic generative model and smoothing. In Proceedings of 19th international conference on pattern recognition (ICPR 2008) (pp. 1–4). Piscataway: IEEE.

  • Wetzker, R., Zimmermann, C., & Bauckhage, C. (2008b). Analyzing social bookmarking systems: A del.icio.us cookbook. In Proceedings of the ECAI 2008 mining social data workshop (pp. 26–30).

  • Wu, E. H., Ng, M. K., Yip, A. M., & Chan, T. F. (2004). Discretization of multidimensional web data for informative dense regions discovery. In Computational and information science (pp. 718–724).

  • Wu, Z., & Palmer, M. (1994). Verb semantics and lexical selection. In Proceedings of the 32nd annual meeting of the Association for Computational Linguistics (pp. 133–138). New Mexico, USA.

  • Zhou, M., Bao, S., Wu, X., & Yu, Y. (2007). An unsupervised model for exploring hierarchical semantics from social annotations. In Proceedings of the 6th international Semantic Web conference, (ISWC ’07) (pp. 680–693). Busan, Korea.

Acknowledgements

This work was supported by the FP7 project WeKnowIt, partially funded by the EC under contract number 215453.

Author information

Corresponding author

Correspondence to Eirini Giannakidou.

About this article

Cite this article

Giannakidou, E., Koutsonikola, V., Vakali, A. et al. In & out zooming on time-aware user/tag clusters. J Intell Inf Syst 38, 685–708 (2012). https://doi.org/10.1007/s10844-011-0173-4

  • DOI: https://doi.org/10.1007/s10844-011-0173-4
