Abstract
XML documents often duplicate objects in order to maintain a tree structure. When an object is duplicated, two nodes may have the same object as a child. However, typical LCA (Lowest Common Ancestor) based approaches to XML keyword search do not discover this common child object, which leads to missing answers. To solve this problem, we propose a new approach that models an XML document as a so-called XML IDREF graph, in which all instances of the same object are linked, so that the missing answers can be found by following these links. Moreover, to make search over the XML IDREF graph efficient, we exploit its hierarchical structure, which lets us generalize the efficient techniques of LCA-based approaches to the XML IDREF graph. The experimental results show that our approach outperforms existing approaches in terms of both effectiveness and efficiency.
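The missing-answer problem described above can be illustrated with a toy sketch. This is not the authors' algorithm or data model: the node table, the object keys such as `course:c1`, and the function names are all hypothetical, chosen only to show why a tree-only LCA computation overlooks a shared child object and how linking the instances of one object exposes it.

```python
from collections import defaultdict

# Hypothetical toy document: (node_id, parent_id, object_key, keywords).
# The course object c1 is duplicated under both students to keep a tree.
NODES = [
    (0, None, "root", set()),
    (1, 0, "student:s1", {"alice"}),
    (2, 1, "course:c1", {"databases"}),
    (3, 0, "student:s2", {"bob"}),
    (4, 3, "course:c1", {"databases"}),  # duplicate instance of node 2
]

PARENT = {n: p for n, p, _, _ in NODES}
OBJECT = {n: o for n, _, o, _ in NODES}
KEYWORDS = {n: k for n, _, _, k in NODES}


def ancestors_or_self(node):
    """Return the path from a node up to the root, starting at the node."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def lca(a, b):
    """Lowest common ancestor of two nodes in the tree."""
    seen = set(ancestors_or_self(a))
    for n in ancestors_or_self(b):
        if n in seen:
            return n


def matches(keyword):
    """Nodes whose own keyword set contains the keyword."""
    return [n for n in KEYWORDS if keyword in KEYWORDS[n]]


def tree_lca_answers(k1, k2):
    """Tree-only view: answers are the LCAs of the matching node pairs."""
    return {OBJECT[lca(a, b)] for a in matches(k1) for b in matches(k2)}


def shared_object_answers(k1, k2):
    """Linked view: grouping instances by object key stands in for the
    links between duplicates, so a shared child object becomes visible."""
    parents_of = defaultdict(set)
    for n, p, obj, _ in NODES:
        if p is not None:
            parents_of[obj].add(p)
    m1, m2 = set(matches(k1)), set(matches(k2))
    return {obj for obj, ps in parents_of.items() if ps & m1 and ps & m2}
```

In this sketch, a query for the two student names yields only the root under the tree-only view, while the linked view also surfaces the course that both students share. Grouping by object key is a simplification made for brevity; it says nothing about how the paper builds or traverses the XML IDREF graph.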
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Le, T.N., Zeng, Z., Ling, T.W. (2014). Finding Missing Answers due to Object Duplication in XML Keyword Search. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds) Database and Expert Systems Applications. DEXA 2014. Lecture Notes in Computer Science, vol 8644. Springer, Cham. https://doi.org/10.1007/978-3-319-10073-9_12
DOI: https://doi.org/10.1007/978-3-319-10073-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10072-2
Online ISBN: 978-3-319-10073-9
eBook Packages: Computer Science (R0)