Abstract
XQuery is a recently developed query language for XML datasets. In this paper, we focus on the use of XQuery and other XML technologies for flat-file based scientific datasets. Traditionally, complex and domain-specific data layouts have complicated the processing of large datasets arising from scientific applications. The use of XML schemas and XQuery’s high-level structure can simplify the analysis of these datasets.
Though scientific data processing applications can be conveniently represented in XQuery, compiling them to achieve efficient execution involves a number of challenges. These are: 1) analysis of recursive functions to identify reduction computations involving only associative and commutative operations, 2) replacement of recursive functions with iterative constructs, 3) application of data-centric transformations on the structure of XQuery, and 4) translation of XQuery processing to an imperative language like C/C++, which is required for using a middleware that offers low-level data access functionality. This paper describes our solutions to these problems and demonstrates significant benefits from the transformations we have developed.
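As an illustration of challenges 1) to 3), the following is a minimal Python sketch, not the compiler analysis described in the paper. It assumes a simple summation as the reduction; the function names `recursive_sum`, `iterative_sum`, and `chunked_sum` are hypothetical and chosen only for this example. It shows a recursive reduction, its iterative replacement, and a chunk-at-a-time evaluation that is safe only because the combining operation is associative and commutative.

```python
from typing import Sequence


def recursive_sum(values: Sequence[int]) -> int:
    """Recursive reduction, in the style a functional query language encourages."""
    if not values:
        return 0
    return values[0] + recursive_sum(values[1:])


def iterative_sum(values: Sequence[int]) -> int:
    """Iterative replacement for recursive_sum (hypothetical example).

    The rewrite is valid because integer addition is associative and
    commutative, so the order of accumulation does not affect the result.
    """
    total = 0
    for v in values:
        total += v
    return total


def chunked_sum(values: Sequence[int], chunk_size: int) -> int:
    """Reduce one chunk at a time, then combine the partial results.

    This mimics a data-centric evaluation order: each chunk is consumed
    in the order it is stored, and partial results are merged afterwards.
    """
    partials = [
        iterative_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return iterative_sum(partials)


data = list(range(1, 11))  # 1 + 2 + ... + 10
print(recursive_sum(data), iterative_sum(data), chunked_sum(data, 3))
# prints: 55 55 55
```

Integers are used deliberately: floating-point addition is not strictly associative, so a reordered reduction over floats may differ in the last digits.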
This work was supported by NSF grant ACR-9982087, NSF CAREER award ACR-9733520, NSF grant ACR-0130437 and NSF grant ACI-0203846. The equipment used for the experiments reported here was purchased under the grant EIA-9986052.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X., Agrawal, G. (2004). Using XQuery for Flat-File Based Scientific Datasets. In: Lausen, G., Suciu, D. (eds) Database Programming Languages. DBPL 2003. Lecture Notes in Computer Science, vol 2921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24607-7_12
DOI: https://doi.org/10.1007/978-3-540-24607-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20896-9
Online ISBN: 978-3-540-24607-7