Abstract
In this paper, we propose an algorithm to detect real-world events from click-through data. Our approach differs from existing work in that we: (i) treat click-through data as collaborative query sessions rather than as mere web logs, as many others have done; (ii) integrate the semantics, structure, and content of queries and pages; and (iii) achieve the overall objective via query clustering. The problem of event detection is transformed into query clustering by generating clusters using hybrid cover graphs, where each hybrid cover graph corresponds to a real-world event. The evolutionary pattern for the co-occurrence of query-page pairs in a hybrid cover graph is imposed over a moving window period. Finally, we experimentally evaluate our approach on a commercial search engine's data collected over 3 months, comprising about 20 million web queries and page clicks from 650,000 users. Our method outperforms the most recent event detection work, which uses complex methods, on metrics such as the number of events detected, F-measure, entropy, and recall.
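The general idea of clustering queries through the pages they co-click within a time window can be illustrated with a small sketch. This is not the hybrid cover graph construction from the paper, which the abstract does not specify; it is a simplified stand-in that groups queries by connected components of a query-page click graph. The function name `cluster_queries`, the `(day, query, page)` record layout, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_queries(clicks, window_start, window_end):
    """Group queries that share clicked pages within [window_start, window_end).

    `clicks` is an iterable of (day, query, page) tuples (hypothetical layout).
    Queries land in the same cluster when they are linked through commonly
    clicked pages, i.e. they lie in one connected component of the graph.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Only clicks inside the current window contribute edges.
    for day, query, page in clicks:
        if window_start <= day < window_end:
            union(("q", query), ("p", page))

    clusters = defaultdict(set)
    for node in list(parent):
        if node[0] == "q":
            clusters[find(node)].add(node[1])
    return sorted(sorted(members) for members in clusters.values())


# Made-up sample click records: (day, query, clicked page).
sample_clicks = [
    (1, "world cup final", "fifa.com"),
    (2, "fifa final score", "fifa.com"),
    (2, "oscars winners", "oscars.org"),
    (9, "oscars 2012", "oscars.org"),
]
first_window = cluster_queries(sample_clicks, 1, 8)
later_window = cluster_queries(sample_clicks, 3, 10)
```

Sliding the window forward and comparing the clusters from one window to the next gives a rough sense of how query-page co-occurrence evolves over time, which is the kind of signal the abstract describes.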
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Angajala, P.K., Madria, S.K., Linderman, M. (2012). ECO: Event Detection from Click-through Data via Query Clustering. In: Meersman, R., et al. On the Move to Meaningful Internet Systems: OTM 2012. OTM 2012. Lecture Notes in Computer Science, vol 7565. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33606-5_20
DOI: https://doi.org/10.1007/978-3-642-33606-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33605-8
Online ISBN: 978-3-642-33606-5
eBook Packages: Computer Science, Computer Science (R0)