Database Support for Automatic Web Queries Categorization
Abstract
The increasing usage of web search engines, together with the potential added value of knowing users' interests when they submit a query, lies at the root of research on web query categorization. Categorizing queries is challenging both because of the problems associated with gathering and analyzing user context information and because of those related to deploying the knowledge obtained. Regarding the former, an interesting open problem is to analyze the mapping, if any, between user queries and the content shown on the portal's main page. Automating this analysis would be very beneficial, and among the challenges we highlight the implementation of a database to support the process. In this chapter, we first review the main approaches to web query categorization and then concentrate on analyzing the process and the database support required for its automation.
Keywords
Query categorization, data warehouse, search engines, data mining