Abstract
Using the Internet to find information and interesting content is now one of the most common tasks performed on a computer. Until recently, search algorithms returned only one-size-fits-all rankings, which perform very poorly on ambiguous search queries. Recent work has demonstrated that contextual information, such as the interests of the searcher, can be used to provide more accurate results that are “personalised” and adapted to the user’s current information need and situation. Likewise, information about the user can be brought to bear to mitigate the problem of information overload and to filter content so that users are shown only items they are likely to be interested in.
In this book chapter we explore new methods for helping users find the information they want by reducing the complexity of the search task through personalisation. We explore this problem first from the perspective of web search and then by considering a very common form of new socially generated data: microblogs. We first tackle the problem of search result personalisation in the face of extremely sparse and noisy data from a query log. We describe a novel approach which uses query logs to build personalised ranking models in which user profiles are constructed based on the representation of clicked documents over a topic space. Our experiments show that this model can provide personalised ranked lists of documents which improve significantly over a non-personalised baseline. Further examination shows that the performance of the personalised system is particularly good in cases where prior knowledge of the search query is limited.
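The idea of a user profile built from clicked documents over a topic space can be illustrated with a minimal sketch. This is not the chapter's actual model: it assumes each document already has a topic distribution (for example from a topic model), takes the profile to be the plain average of the clicked documents' distributions, and blends a cosine similarity with a non-personalised base score. The function names, the `weight` parameter, and the toy data are all hypothetical.

```python
from math import sqrt


def build_user_profile(clicked_doc_topics):
    """Average the topic distributions of the documents a user clicked.

    Assumption: the profile is a simple mean; the chapter's model may differ.
    """
    if not clicked_doc_topics:
        raise ValueError("need at least one clicked document")
    n_topics = len(clicked_doc_topics[0])
    profile = [0.0] * n_topics
    for topics in clicked_doc_topics:
        for k, p in enumerate(topics):
            profile[k] += p
    return [v / len(clicked_doc_topics) for v in profile]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def personalised_rerank(candidates, profile, weight=0.5):
    """Re-rank (doc_id, base_score, topic_distribution) triples.

    `weight` is a hypothetical mixing parameter: 0 keeps the
    non-personalised ranking, 1 ranks by profile similarity alone.
    """
    scored = [
        (doc_id, (1 - weight) * base + weight * cosine(topics, profile))
        for doc_id, base, topics in candidates
    ]
    return [doc_id for doc_id, _ in sorted(scored, key=lambda t: t[1], reverse=True)]


# Toy two-topic space: topic 0 = cars, topic 1 = animals (illustrative only).
clicks = [[0.8, 0.2], [1.0, 0.0]]
profile = build_user_profile(clicks)

# Results for an ambiguous query such as "jaguar".
candidates = [
    ("animal_page", 0.6, [0.1, 0.9]),
    ("car_page", 0.5, [0.9, 0.1]),
]
baseline = personalised_rerank(candidates, profile, weight=0.0)
personalised = personalised_rerank(candidates, profile, weight=0.5)
```

In this toy case the non-personalised ordering puts the animal page first, while the profile of a user who has clicked car-related documents moves the car page to the top.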
We then turn our attention to the related problem of recommendation (where the user profile is itself the query) and, more specifically, discuss the possibility of learning user interests from social media data, in particular microblog posts. We present a short introduction to early work focussing on the difficult task of making use of this vast array of ever-changing data. We demonstrate via experiment that our methods are able to predict, with a high level of precision, which posts will be of interest to users, and we comment on possibilities for future work.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Harvey, M., Crestani, F. (2014). 21st Century Search and Recommendation: Exploiting Personalisation and Social Media. In: Paltoglou, G., Loizides, F., Hansen, P. (eds) Professional Search in the Modern World. Lecture Notes in Computer Science, vol 8830. Springer, Cham. https://doi.org/10.1007/978-3-319-12511-4_5
DOI: https://doi.org/10.1007/978-3-319-12511-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12510-7
Online ISBN: 978-3-319-12511-4
eBook Packages: Computer Science, Computer Science (R0)