Abstract
With the growing popularity of the streaming data model, processing queries over streaming data has become an important topic. Streaming data has received attention in a number of communities, including data mining, theoretical computer science, networking, and grid computing. We believe that streaming data processing poses challenges for compilers that have not been addressed so far. In particular, the following two questions are important:
- How do we transform queries so that they can be correctly executed with a single pass over streaming data?
- How do we determine when a query, possibly after certain transformations, can be correctly executed with only a single pass over the dataset?
In this paper, we address these questions in the context of the XML query language XQuery. Because of XQuery’s single-assignment nature and its special constructs for dealing with sequences, the above questions can be answered more easily than for a general imperative language. However, we believe our work also forms the basis for addressing these questions for more general languages.
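To make the first question concrete, here is a minimal sketch in Python rather than XQuery. It is not the transformation described in the paper. It only illustrates the general idea that a query written as two separate scans can sometimes be fused into a single scan. The function names, the sample prices, and the threshold of 100 are all hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def two_pass_stats(prices: List[float], threshold: float = 100.0) -> Tuple[float, int]:
    """Naive form: scans the data twice, so the whole dataset must be stored."""
    total = sum(prices)                                    # first scan
    count_high = sum(1 for p in prices if p > threshold)   # second scan
    return total, count_high


def one_pass_stats(stream: Iterable[float], threshold: float = 100.0) -> Tuple[float, int]:
    """Fused form: both aggregates are updated per element in one scan."""
    total = 0.0
    count_high = 0
    for p in stream:
        total += p
        if p > threshold:
            count_high += 1
    return total, count_high


def price_stream() -> Iterator[float]:
    """Hypothetical stream: a generator can be consumed only once."""
    yield from (50.0, 150.0, 200.0, 75.0)


if __name__ == "__main__":
    print(one_pass_stats(price_stream()))
```

The fusion is valid here because neither aggregate depends on the final value of the other. A query such as "count the items above the mean" does not have this property, since the mean is known only after the full scan, so it cannot be answered exactly in one pass without buffering the data. Deciding which case applies is the subject of the second question.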
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X., Agrawal, G. (2006). Code Transformations for One-Pass Analysis. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2005. Lecture Notes in Computer Science, vol 4339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69330-7_26
DOI: https://doi.org/10.1007/978-3-540-69330-7_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69329-1
Online ISBN: 978-3-540-69330-7
eBook Packages: Computer Science, Computer Science (R0)