Dynamic Agglomerative-Divisive Clustering of Clickthrough Data for Collaborative Web Search

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5981)

Abstract

In this paper, we model clickthroughs as a tripartite graph involving users, queries, and the concepts embodied in the clicked pages. We develop the Dynamic Agglomerative-Divisive Clustering (DADC) algorithm, which clusters the tripartite clickthrough graph to identify groups of similar users, queries, and concepts in support of collaborative web search. Because the clickthrough graph is updated frequently, DADC clusters the graph incrementally, whereas most traditional agglomerative methods recluster the whole graph from scratch. Moreover, clickthroughs are usually noisy and reflect the diverse interests of users, so traditional agglomerative clustering methods tend to generate large clusters when the clickthrough graph is large. DADC avoids generating large clusters by interleaving two phases, an agglomerative phase and a divisive phase. The agglomerative phase iteratively merges similar clusters to avoid generating sparse clusters. The divisive phase iteratively splits large clusters into smaller ones to maintain cluster coherence, and it restructures the existing clusters so that DADC can incrementally update the affected clusters as new clickthrough data arrives.
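
The interleaving of merge and split phases described in the abstract can be illustrated with a minimal Python sketch. It is not the DADC algorithm from the paper, whose similarity measures and split criteria are not given here. Everything in it is an assumption made for illustration: the function names, the Jaccard similarity over each query's user and concept neighbours, the `merge_threshold` and `max_size` parameters, and the two-seed splitting rule. Only queries are clustered, to keep the example short.

```python
from collections import defaultdict
from itertools import combinations


def build_neighbors(clicks):
    """Map each query to its neighbours in the tripartite graph.

    clicks: iterable of (user, query, concept) triples (assumed input format).
    """
    neighbors = defaultdict(set)
    for user, query, concept in clicks:
        neighbors[query].add(("user", user))
        neighbors[query].add(("concept", concept))
    return neighbors


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_features(cluster, neighbors):
    """Union of the neighbour sets of all queries in a cluster."""
    feats = set()
    for query in cluster:
        feats |= neighbors[query]
    return feats


def agglomerate(clusters, neighbors, merge_threshold):
    """Merge the most similar pair until no pair reaches the threshold."""
    clusters = [set(c) for c in clusters]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            sim = jaccard(cluster_features(clusters[i], neighbors),
                          cluster_features(clusters[j], neighbors))
            if sim >= merge_threshold and (best is None or sim > best[0]):
                best = (sim, i, j)
        if best is None:
            break
        _, i, j = best
        clusters[i] |= clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters


def divide(clusters, neighbors, max_size):
    """Split every cluster larger than max_size around its least similar pair."""
    result, pending = [], [set(c) for c in clusters]
    while pending:
        cluster = pending.pop()
        if len(cluster) <= max_size or len(cluster) < 2:
            result.append(cluster)
            continue
        members = sorted(cluster)
        seed_a, seed_b = min(
            combinations(members, 2),
            key=lambda p: jaccard(neighbors[p[0]], neighbors[p[1]]),
        )
        part_a, part_b = {seed_a}, {seed_b}
        for query in members:
            if query in (seed_a, seed_b):
                continue
            sim_a = jaccard(neighbors[query], neighbors[seed_a])
            sim_b = jaccard(neighbors[query], neighbors[seed_b])
            (part_a if sim_a >= sim_b else part_b).add(query)
        pending.extend([part_a, part_b])  # both parts are strictly smaller
    return result


def dadc_update(clusters, clicks, merge_threshold=0.5, max_size=3):
    """Start from the existing clusters, add unseen queries as singletons,
    then run one agglomerative pass followed by one divisive pass."""
    neighbors = build_neighbors(clicks)
    known = set().union(*clusters) if clusters else set()
    new_singletons = [{q} for q in sorted(neighbors) if q not in known]
    clusters = [set(c) for c in clusters] + new_singletons
    clusters = agglomerate(clusters, neighbors, merge_threshold)
    return divide(clusters, neighbors, max_size)
```

Calling `dadc_update([], clicks)` clusters a click log from scratch; passing the returned clusters back in together with an extended click list reuses the earlier grouping instead of starting over, which is the incremental behaviour the abstract motivates. A real implementation would also restrict the work to the affected clusters, which this sketch does not attempt.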

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Leung, K.WT., Lee, D.L. (2010). Dynamic Agglomerative-Divisive Clustering of Clickthrough Data for Collaborative Web Search. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 5981. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12026-8_48

  • DOI: https://doi.org/10.1007/978-3-642-12026-8_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12025-1

  • Online ISBN: 978-3-642-12026-8

  • eBook Packages: Computer Science, Computer Science (R0)
