21st Century Search and Recommendation: Exploiting Personalisation and Social Media

  • Chapter
Professional Search in the Modern World

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8830))

Abstract

Using the Internet to find information and interesting content is now one of the most common tasks performed on a computer. Until recently, search algorithms returned only one-size-fits-all rankings, which perform very poorly on ambiguous search queries. Recent work has demonstrated that contextual information - such as the interests of the searcher - can be used to provide more accurate results that have been “personalised” and adapted to the user’s current information need and situation. Likewise, information about the user can be brought to bear to mitigate the problem of information overload and to filter content so that users are shown only items they are likely to be interested in.

In this book chapter we explore new methods for helping users find the information they want by reducing the complexity of the search task through personalisation. We explore this problem first from the perspective of web search and then by considering a very common form of new socially generated data - microblogs. We first tackle the problem of search result personalisation in the face of extremely sparse and noisy data from a query log. We describe a novel approach which uses query logs to build personalised ranking models in which user profiles are constructed from the representation of clicked documents over a topic space. Our experiments show that this model can provide personalised ranked lists of documents which improve significantly over a non-personalised baseline. Further examination shows that the performance of the personalised system is particularly good in cases where prior knowledge of the search query is limited.
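
The idea of a user profile built from clicked documents over a topic space can be illustrated with a minimal sketch. This is not the model described in the chapter: it assumes each document already has a topic distribution (for example, from some topic model), and it assumes a simple linear interpolation between a non-personalised score and a profile affinity. The function names, the `weight` parameter and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

# (doc_id, non-personalised score, topic distribution) -- hypothetical layout
Result = Tuple[str, float, Sequence[float]]


def build_profile(clicked_topic_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the topic distributions of a user's clicked documents."""
    if not clicked_topic_vectors:
        raise ValueError("need at least one clicked document")
    n_topics = len(clicked_topic_vectors[0])
    totals = [0.0] * n_topics
    for vec in clicked_topic_vectors:
        for k, p in enumerate(vec):
            totals[k] += p
    return [t / len(clicked_topic_vectors) for t in totals]


def personalised_score(base_score: float,
                       doc_topics: Sequence[float],
                       profile: Sequence[float],
                       weight: float = 0.5) -> float:
    """Interpolate the base score with the profile/document dot product.

    The interpolation and its weight are illustrative assumptions.
    """
    affinity = sum(d * u for d, u in zip(doc_topics, profile))
    return (1.0 - weight) * base_score + weight * affinity


def rerank(results: Sequence[Result],
           profile: Sequence[float],
           weight: float = 0.5) -> List[Result]:
    """Sort results by personalised score, highest first."""
    return sorted(
        results,
        key=lambda r: personalised_score(r[1], r[2], profile, weight),
        reverse=True,
    )


# Toy example: topics are [cars, animals]; the user clicked two car pages.
profile = build_profile([[0.9, 0.1], [0.7, 0.3]])
results = [
    ("zoo-page", 0.6, [0.1, 0.9]),
    ("dealer-page", 0.5, [0.9, 0.1]),
]
ranked = rerank(results, profile)
```

In this toy case the ambiguous query's non-personalised ranking puts `zoo-page` first, while the re-ranked list puts `dealer-page` first because the profile leans towards the car topic.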

We then turn our attention to the related problem of recommendation (where the user profile is itself the query) and, more specifically, discuss the possibility of learning user interests from social media data, in particular microblog posts. We present a short introduction to early work focussing on the difficult task of making use of this vast array of ever-changing data. We demonstrate via experiment that our methods are able to predict, with a high level of precision, which posts will be of interest to users, and we comment on possibilities for future work.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Harvey, M., Crestani, F. (2014). 21st Century Search and Recommendation: Exploiting Personalisation and Social Media. In: Paltoglou, G., Loizides, F., Hansen, P. (eds) Professional Search in the Modern World. Lecture Notes in Computer Science, vol 8830. Springer, Cham. https://doi.org/10.1007/978-3-319-12511-4_5

  • DOI: https://doi.org/10.1007/978-3-319-12511-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12510-7

  • Online ISBN: 978-3-319-12511-4

  • eBook Packages: Computer Science, Computer Science (R0)
