Abstract
Query logs are repositories that record all interactions of users with a search engine. This rich user-behavior data can be modeled using appropriate graph structures. In recent years there has been a growing body of literature on properties, models, and algorithms for query-log graphs. Understanding the structure of such graphs, modeling user querying patterns, and designing algorithms that leverage the latent knowledge in those graphs (also known as the wisdom of the crowds) introduce new challenges in the field of graph mining. The main goal of this paper is to present the reader with one example of these graph structures, the query-flow graph. This representation has been shown to be extremely effective for modeling user querying patterns and has been used extensively in developing real-time applications. Moreover, we present graph-based algorithmic solutions to problems that arise in web applications, such as query recommendation and user-session segmentation.
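To make the idea of a query-flow graph concrete, the following is a minimal sketch and not the paper's actual model. It assumes that nodes are query strings, that a directed edge from one query to another means the second directly followed the first in some session, and that edge weights are simple normalized transition counts. The function names (`build_query_flow_graph`, `recommend`, `segment`), the toy sessions, and the threshold value are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_query_flow_graph(sessions):
    """Build a weighted directed graph from per-user query sequences.

    Assumption: the weight of edge (a, b) is the fraction of observed
    transitions out of a that go to b. This is an illustrative choice,
    not the weighting scheme of the paper.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:  # skip immediate repeats of the same query
                counts[a][b] += 1
    graph = {}
    for a, successors in counts.items():
        total = sum(successors.values())
        graph[a] = {b: c / total for b, c in successors.items()}
    return graph


def recommend(graph, query, k=3):
    """Return up to k successor queries, heaviest edge first (ties by name)."""
    successors = graph.get(query, {})
    return sorted(successors, key=lambda b: (-successors[b], b))[:k]


def segment(graph, queries, threshold=0.5):
    """Split a query sequence wherever the edge weight falls below threshold."""
    if not queries:
        return []
    chains = [[queries[0]]]
    for a, b in zip(queries, queries[1:]):
        if graph.get(a, {}).get(b, 0.0) >= threshold:
            chains[-1].append(b)  # strong transition: same chain
        else:
            chains.append([b])    # weak or unseen transition: new chain
    return chains


# Hypothetical toy log: each inner list is one user's query sequence.
sessions = [
    ["cheap flights", "cheap flights rome", "rome hotels"],
    ["cheap flights", "cheap flights rome"],
    ["cheap flights", "airline baggage rules"],
]
graph = build_query_flow_graph(sessions)
suggestions = recommend(graph, "cheap flights", k=2)
chains = segment(
    graph, ["cheap flights", "cheap flights rome", "airline baggage rules"]
)
```

In this sketch, query recommendation amounts to ranking the outgoing edges of a node, and session segmentation amounts to cutting a sequence at weak edges. A production system would likely use learned edge weights and random-walk scores rather than raw counts.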
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Donato, D. (2010). Graph Structures and Algorithms for Query-Log Analysis. In: Ferreira, F., Löwe, B., Mayordomo, E., Mendes Gomes, L. (eds) Programs, Proofs, Processes. CiE 2010. Lecture Notes in Computer Science, vol 6158. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13962-8_14
DOI: https://doi.org/10.1007/978-3-642-13962-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13961-1
Online ISBN: 978-3-642-13962-8
eBook Packages: Computer Science (R0)