Graph Structures and Algorithms for Query-Log Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6158)

Abstract

Query logs are repositories that record all the interactions of users with a search engine. This rich user-behavior data can be modeled using appropriate graph structures. In recent years, a growing body of literature has studied properties, models, and algorithms for query-log graphs. Understanding the structure of such graphs, modeling user querying patterns, and designing algorithms that leverage the latent knowledge in those graphs (also known as the wisdom of the crowds) introduce new challenges in the field of graph mining. The main goal of this paper is to present the reader with one example of these graph structures, namely the query-flow graph. This representation has been shown to be extremely effective for modeling user querying patterns and has been extensively used for developing real-time applications. Moreover, we present graph-based algorithmic solutions to problems that arise in web applications, such as query recommendation and user-session segmentation.
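
The abstract does not spell out how a query-flow graph is built, so the following is only a minimal sketch under stated assumptions: nodes are distinct queries, a directed edge links a query to the query that immediately follows it in the same session, and each edge is weighted by the relative frequency of that transition. The function names `build_query_flow_graph` and `recommend`, the toy sessions, and the frequency-based ranking are hypothetical illustrations, not the paper's method.

```python
from collections import Counter, defaultdict


def build_query_flow_graph(sessions):
    """Build a weighted directed graph {query: {next_query: weight}}.

    Assumption: the weight of an edge q -> q' is the fraction of
    observed transitions out of q that lead to q'.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each query with the one issued right after it.
        for q, q_next in zip(session, session[1:]):
            if q != q_next:  # ignore immediate repetitions of the same query
                counts[q][q_next] += 1

    graph = {}
    for q, successors in counts.items():
        total = sum(successors.values())
        graph[q] = {q2: c / total for q2, c in successors.items()}
    return graph


def recommend(graph, query, k=3):
    """Return up to k successors of `query`, heaviest edge first."""
    successors = graph.get(query, {})
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(successors, key=lambda q: (-successors[q], q))[:k]


# Toy sessions (made-up data): each list is one user's ordered queries.
sessions = [
    ["barcelona", "barcelona hotels", "barcelona cheap hotels"],
    ["barcelona", "barcelona fc"],
    ["barcelona", "barcelona hotels"],
]

graph = build_query_flow_graph(sessions)
print(recommend(graph, "barcelona"))
```

Under these assumptions, query recommendation reduces to ranking the outgoing edges of a node, and a simple form of session segmentation could cut a session wherever the transition weight between consecutive queries is low.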

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Donato, D. (2010). Graph Structures and Algorithms for Query-Log Analysis. In: Ferreira, F., Löwe, B., Mayordomo, E., Mendes Gomes, L. (eds) Programs, Proofs, Processes. CiE 2010. Lecture Notes in Computer Science, vol 6158. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13962-8_14

  • DOI: https://doi.org/10.1007/978-3-642-13962-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13961-1

  • Online ISBN: 978-3-642-13962-8

  • eBook Packages: Computer Science, Computer Science (R0)
