Abstract
Stream data processing poses many challenges. Two of its important characteristics, bursty arrival rates and the need for near real-time performance, challenge the allocation of limited system resources. Several scheduling algorithms proposed in the literature (e.g., the Chain strategy) aim to minimize the maximal memory requirement. In this paper, we propose novel scheduling strategies that minimize tuple latency as well as the total memory requirement. We first introduce a path capacity strategy (PCS) with the goal of minimizing tuple latency. We then compare the PCS and the Chain strategy to identify their limitations and propose additional scheduling strategies that improve upon them. Specifically, we introduce a segment strategy (SS), along with a simplified version of it, with the goal of minimizing the memory requirement. In addition, we introduce a hybrid strategy, termed the threshold strategy (TS), to address the combined optimization of both tuple latency and memory requirement. Finally, we present the results of a wide range of experiments conducted to evaluate the efficiency and effectiveness of the proposed scheduling strategies.
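The abstract names the path capacity strategy without defining it, so the following is only an illustrative sketch of what a capacity-driven scheduler might look like, not the authors' algorithm. It assumes that each operator has a known processing rate and selectivity, that the capacity of an operator path is the number of input tuples it can carry through to the output per time unit, and that the scheduler greedily serves the non-empty path with the largest capacity. The names `Operator`, `OperatorPath`, `path_capacity`, and `pick_by_capacity` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Operator:
    capacity: float     # assumed: tuples this operator can process per time unit
    selectivity: float  # assumed: output tuples produced per input tuple


@dataclass
class OperatorPath:
    name: str
    operators: List[Operator]
    queued: int = 0     # tuples waiting at the head of the path


def path_capacity(path: OperatorPath) -> float:
    """Input tuples per time unit the whole path can push to the output.

    Assumption: the time spent on one input tuple is the sum, over the
    operators, of (fraction of the tuple surviving so far) / (operator rate).
    """
    time_per_tuple = 0.0
    surviving = 1.0
    for op in path.operators:
        time_per_tuple += surviving / op.capacity
        surviving *= op.selectivity
    return 1.0 / time_per_tuple


def pick_by_capacity(paths: List[OperatorPath]) -> Optional[OperatorPath]:
    """Greedy choice: the non-empty path with the largest capacity, if any."""
    ready = [p for p in paths if p.queued > 0]
    return max(ready, key=path_capacity) if ready else None


if __name__ == "__main__":
    a = OperatorPath("A", [Operator(10.0, 0.5), Operator(5.0, 1.0)], queued=3)
    b = OperatorPath("B", [Operator(4.0, 1.0)], queued=7)
    chosen = pick_by_capacity([a, b])
    print(chosen.name if chosen else "idle")
```

Under these assumptions, path A spends 1/10 + 0.5/5 = 0.2 time units per input tuple, giving a capacity of 5 tuples per time unit against 4 for path B, so A is served first even though B has the longer queue. A memory-oriented or hybrid strategy would replace the selection rule, not the bookkeeping.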
This work was supported, in part, by NSF grants IIS-0123730 and ITR-0121297.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, Q., Chakravarthy, S. (2004). Scheduling Strategies for Processing Continuous Queries over Streams. In: Williams, H., MacKinnon, L. (eds) Key Technologies for Data Management. BNCOD 2004. Lecture Notes in Computer Science, vol 3112. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27811-5_3
DOI: https://doi.org/10.1007/978-3-540-27811-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22382-5
Online ISBN: 978-3-540-27811-5
eBook Packages: Springer Book Archive