Abstract
As scaling out applications across multiple servers has become common industry practice, we investigate how distributed Query Engines (QEs) can collaborate to support graph-structured SQL dataflow processes. A SQL dataflow process consists of queries (optionally with UDFs) linked by relational dataflow. We focus on using a Distributed Caching Platform (DCP) for inter-QE data communication. Although DCP has gained popularity lately, exchanging query results tuple-by-tuple through DCP is often inefficient because of the tiny granularity of cache access and the overhead of data conversion and interpretation. This has motivated us to explore a new, more efficient mechanism for inter-QE communication that takes advantage of DCP's binary protocol. We propose the page-flow approach, which extends and externalizes the database buffer pool to the DCP: the producer QE puts query results into the DCP as data pages (blocks), and the consumer QE retrieves them. In this way, the relational dataflow logically becomes a binary page-flow; the tuples contained in the transferred pages are already in the format required by the relational operators and can therefore be fed into queries directly, without any conversion. Further, using pages as mini-batches of tuples reduces the per-tuple latency of DCP access. We have implemented this mechanism on a cluster of PostgreSQL engines. Our experimental results demonstrate its value.
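The producer/consumer exchange described above can be illustrated with a minimal sketch. It is not the authors' PostgreSQL implementation: the fixed-width row layout, the 8192-byte page size, the key naming scheme, and the `FakeCache` class (an in-memory stand-in for a DCP) are all assumptions made for illustration. The sketch only shows why batching tuples into binary pages cuts the number of cache accesses compared with a tuple-by-tuple exchange.

```python
import struct
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, float]             # hypothetical fixed-width tuple layout
ROW_FMT = struct.Struct("<qd")      # 8-byte integer + 8-byte float = 16 bytes
HEADER = struct.Struct("<I")        # 4-byte counter (rows per page / page count)
PAGE_SIZE = 8192                    # assumed page size in bytes
ROWS_PER_PAGE = (PAGE_SIZE - HEADER.size) // ROW_FMT.size


class FakeCache:
    """In-memory stand-in for a DCP (assumption); counts accesses."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}
        self.puts = 0
        self.gets = 0

    def set(self, key: str, value: bytes) -> None:
        self.puts += 1
        self.store[key] = value

    def get(self, key: str) -> bytes:
        self.gets += 1
        return self.store[key]


def pack_page(rows: List[Row]) -> bytes:
    """Serialize rows into one fixed-size binary page: row count, then rows."""
    body = b"".join(ROW_FMT.pack(*row) for row in rows)
    return (HEADER.pack(len(rows)) + body).ljust(PAGE_SIZE, b"\0")


def produce_pages(cache: FakeCache, query_id: str, rows: Iterable[Row]) -> int:
    """Producer side: put query results into the cache one page at a time."""
    page_no = 0
    buf: List[Row] = []
    for row in rows:
        buf.append(row)
        if len(buf) == ROWS_PER_PAGE:
            cache.set(f"{query_id}:{page_no}", pack_page(buf))
            page_no += 1
            buf = []
    if buf:  # flush the final, partially filled page
        cache.set(f"{query_id}:{page_no}", pack_page(buf))
        page_no += 1
    # Publish the page count so the consumer knows how many pages to fetch.
    cache.set(f"{query_id}:count", HEADER.pack(page_no))
    return page_no


def consume_pages(cache: FakeCache, query_id: str) -> Iterator[Row]:
    """Consumer side: fetch pages and yield rows straight from the binary layout."""
    (n_pages,) = HEADER.unpack(cache.get(f"{query_id}:count"))
    for page_no in range(n_pages):
        page = cache.get(f"{query_id}:{page_no}")
        (n_rows,) = HEADER.unpack_from(page, 0)
        end = HEADER.size + n_rows * ROW_FMT.size
        yield from ROW_FMT.iter_unpack(page[HEADER.size:end])
```

Under these assumptions each page holds 511 rows, so a 2000-row result needs four page writes plus one write for the page count, where a tuple-by-tuple exchange would need 2000 cache accesses. In the approach the abstract describes, the pages would be the engine's own buffer-pool pages rather than a custom layout like this one.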
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, Q., Hsu, M., Wu, R. (2013). Page-Flow in Query Engine Grid. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2012. Studies in Computational Intelligence, vol 443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32172-6_9
DOI: https://doi.org/10.1007/978-3-642-32172-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32171-9
Online ISBN: 978-3-642-32172-6
eBook Packages: Engineering, Engineering (R0)