Abstract
XML element search engines return XML elements, that is, parts of XML documents, as search results. Existing approaches to XML element search are adapted from information retrieval techniques for document search. The global weight of each term can be calculated from statistics of XML elements that share either 1) the same path expression or 2) the same tag. In the first approach, the more complex a path expression is, the fewer XML elements match it. As a result, global term weights may be calculated from the statistics of only a few XML elements, and such weights are not truly global. The second approach has its own problem: it does not consider the document structure of XML elements. To resolve these problems, we propose a method for calculating accurate global weights. In our method, we regard a path expression as an array of tags. We relax the restrictions on the order and frequency with which tags appear in a path expression, so that similar path expressions are gathered into the same class. In this way, we aim to decrease the number of classes that contain hardly any elements. Our experimental results on one test collection show that our method can integrate path expressions without decreasing search accuracy.
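The idea of treating a path expression as an array of tags and relaxing tag order and tag frequency can be illustrated with a minimal sketch. The abstract does not give the exact relaxation rules, so everything below is an assumption made for illustration: the helper names (tags_of, exact_key, relaxed_key, group_paths), the sample paths, and the choice to relax order by sorting the tags and to relax frequency by keeping each tag once.

```python
import re
from collections import defaultdict


def tags_of(path):
    """Split a path such as '/article[1]/sec[2]/p' into its array of tags.

    Positional predicates like [2] are dropped (an assumption of this sketch).
    """
    return [re.sub(r"\[\d+\]", "", step) for step in path.strip("/").split("/") if step]


def exact_key(path):
    """Class key when only identical tag sequences share statistics."""
    return tuple(tags_of(path))


def relaxed_key(path, ignore_order=True, ignore_frequency=True):
    """Class key after relaxing tag order and/or tag frequency (hypothetical rule)."""
    tags = tags_of(path)
    if ignore_frequency:
        # Keep each tag once, preserving the order of first appearance.
        tags = list(dict.fromkeys(tags))
    if ignore_order:
        # Sorting makes the key independent of where a tag appears.
        tags = sorted(tags)
    return tuple(tags)


def group_paths(paths, key):
    """Gather path expressions into classes that share the same key."""
    classes = defaultdict(list)
    for path in paths:
        classes[key(path)].append(path)
    return dict(classes)


# Hypothetical sample paths, not taken from any test collection.
paths = [
    "/article/sec/p",
    "/article/sec/sec/p",
    "/article/sec/sec/sec/p",
    "/article/bdy/sec/p",
]

exact = group_paths(paths, exact_key)      # one class per distinct path
relaxed = group_paths(paths, relaxed_key)  # nested sec paths fall into one class
```

In this sketch the three paths that differ only in how deeply sec is nested end up in one class, so term statistics for that class would be drawn from more elements than under exact path matching.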
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Keyaki, A., Hatano, K., Miyazaki, J. (2011). Relaxed Global Term Weights for XML Element Search. In: Geva, S., Kamps, J., Schenkel, R., Trotman, A. (eds) Comparative Evaluation of Focused Retrieval. INEX 2010. Lecture Notes in Computer Science, vol 6932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23577-1_6
DOI: https://doi.org/10.1007/978-3-642-23577-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23576-4
Online ISBN: 978-3-642-23577-1
eBook Packages: Computer Science (R0)