Abstract
Identifying meaningful document fragments is a major advantage achieved by encoding documents in XML. In scholarly articles, such document fragments include sections, subsections and paragraphs. XML information retrieval systems must retrieve the document fragments relevant to a query from a set of XML documents in a digital library. We present Kikori-KS, an effective and efficient XML information retrieval system for scholarly articles. Kikori-KS accepts a set of keywords as a query. This form of query is simple yet useful because users are not required to understand XML query languages or XML schemas. To meet practical demands for searching relevant fragments in scholarly articles, we have developed a user-friendly interface for displaying search results. Kikori-KS was implemented on top of a relational XML database system developed by our group. By carefully designing the database schema, Kikori-KS handles a huge number of document fragments efficiently. Our experiments using the INEX test collection show that Kikori-KS achieves acceptable search times with relatively high precision.
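To make the idea of keyword search over document fragments concrete, the following is a minimal sketch and not the Kikori-KS implementation. It assumes hypothetical element names (`sec`, `ss1`, `p`) for sections, subsections and paragraphs, works in memory rather than on a relational database, and uses a simple length-normalized term count as the score, which is an assumption made only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical tag names for sections, subsections and paragraphs.
FRAGMENT_TAGS = {"sec", "ss1", "p"}


def iter_fragments(root):
    """Yield (path, text) for every element whose tag is a fragment tag."""
    def walk(elem, path):
        counts = {}  # position of each child among siblings with the same tag
        for child in elem:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            child_path = f"{path}/{child.tag}[{counts[child.tag]}]"
            if child.tag in FRAGMENT_TAGS:
                yield child_path, "".join(child.itertext())
            yield from walk(child, child_path)

    yield from walk(root, "/" + root.tag)


def search(xml_text, keywords):
    """Return (path, score) pairs for fragments containing all keywords."""
    terms = [k.lower() for k in keywords]
    if not terms:
        return []
    root = ET.fromstring(xml_text)
    hits = []
    for path, text in iter_fragments(root):
        words = re.findall(r"\w+", text.lower())
        if all(t in words for t in terms):
            # Illustrative score: keyword occurrences divided by fragment length.
            score = sum(words.count(t) for t in terms) / len(words)
            hits.append((path, score))
    # Highest score first; ties broken by path for a stable order.
    hits.sort(key=lambda h: (-h[1], h[0]))
    return hits


SAMPLE = """<article>
<sec><title>Storage</title>
<p>XML fragments are stored in relational tables.</p>
<p>Keyword search ranks XML fragments.</p>
</sec>
<sec><title>Evaluation</title>
<p>Search time was measured.</p>
</sec>
</article>"""

if __name__ == "__main__":
    for path, score in search(SAMPLE, ["xml", "fragments"]):
        print(f"{score:.3f}  {path}")
```

Note that a section and the paragraphs nested inside it can all appear in the same result list; this sketch makes no attempt to filter such overlapping fragments.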
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shimizu, T., Terada, N., Yoshikawa, M. (2006). Kikori-KS: An Effective and Efficient Keyword Search System for Digital Libraries in XML. In: Sugimoto, S., Hunter, J., Rauber, A., Morishima, A. (eds) Digital Libraries: Achievements, Challenges and Opportunities. ICADL 2006. Lecture Notes in Computer Science, vol 4312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11931584_42
DOI: https://doi.org/10.1007/11931584_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49375-4
Online ISBN: 978-3-540-49377-8
eBook Packages: Computer Science, Computer Science (R0)