Mining Social Media for Enhancing Personalized Document Clustering

  • Chin-Sheng Yang
  • Pei-Chun Chang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9191)


Social media is now an excellent platform for gathering user intelligence to support business intelligence applications. A social tagging system (also known as a folksonomy) is a critical mechanism for collaboratively creating, organizing, and managing the wisdom of crowds. The knowledge gained from social tagging systems should be a tremendous asset for conducting and improving various business intelligence applications. Consequently, the purpose of this study is to examine the value of folksonomy for an important business intelligence task, namely personalized document management. Specifically, we employ Delicious, a pioneering social bookmarking service, to construct a statistical-based thesaurus, which is then applied to support personalized document clustering. According to our empirical evaluation results, the social tagging system indeed improves the quality of the statistical-based thesaurus for generating personalized document clusters, compared with a thesaurus constructed on the basis of a general-purpose search engine.
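To make the idea of a statistical-based thesaurus concrete, the following is a minimal sketch that derives term-to-term association weights from the tags co-assigned to bookmarked resources. The abstract does not state which association measure the study uses, so the Jaccard coefficient here is an assumption, and the function name `build_thesaurus`, the parameter `min_weight`, and the toy bookmarks are all hypothetical illustrations rather than the authors' method or data.

```python
from collections import Counter
from itertools import combinations


def build_thesaurus(bookmarks, min_weight=0.0):
    """Build a term-to-term association table from tagged bookmarks.

    bookmarks: iterable of tag collections, one per bookmarked resource.
    Returns {term: {related_term: weight}}, where weight is the Jaccard
    coefficient of the sets of bookmarks in which the two terms occur.
    The choice of Jaccard is an assumption made for illustration.
    """
    term_count = Counter()   # number of bookmarks containing each term
    pair_count = Counter()   # number of bookmarks containing both terms
    for tags in bookmarks:
        unique = sorted({t.lower() for t in tags})
        term_count.update(unique)
        pair_count.update(combinations(unique, 2))

    thesaurus = {}
    for (a, b), both in pair_count.items():
        weight = both / (term_count[a] + term_count[b] - both)
        if weight > min_weight:
            # Store the relation in both directions (symmetric measure).
            thesaurus.setdefault(a, {})[b] = weight
            thesaurus.setdefault(b, {})[a] = weight
    return thesaurus


# Toy, made-up bookmarks (not data from the study).
toy_bookmarks = [
    ["python", "programming", "tutorial"],
    ["python", "programming"],
    ["clustering", "datamining"],
    ["python", "datamining", "clustering"],
]
thesaurus = build_thesaurus(toy_bookmarks)
```

In a clustering setting, such weights could be used to treat strongly associated terms as related when comparing documents; how the study actually applies the thesaurus is not described in the abstract.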


Social media; Business intelligence; Social tagging; Social bookmarking; Personalized document clustering



This work was supported by the National Science Council of the Republic of China under the grant NSC 100-2410-H-155-013-MY3 and the Ministry of Science and Technology of the Republic of China under the grant MOST 103-2410-H-155-027-MY3.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Information Management, and Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Chung-Li, Taiwan, ROC
