Page-Flow in Query Engine Grid

Chen, Qiming; Hsu, Meichun; Wu, Ren

doi:10.1007/978-3-642-32172-6_9

Part of the book series: Studies in Computational Intelligence (SCI, volume 443)

Abstract

As scaling out applications across multiple servers has become common industry practice, we investigate collaborating distributed Query Engines (QEs) to support graph-structured SQL dataflow processes. A SQL dataflow process consists of queries (optionally with UDFs) linked by relational dataflow. We focus on using a Distributed Caching Platform (DCP) for inter-QE data communication. Although DCPs have gained popularity lately, exchanging query results tuple-by-tuple through a DCP is often inefficient because of the fine granularity of cache access and the overhead of data conversion and interpretation. This motivated us to explore a new, more efficient mechanism for inter-QE communication that takes advantage of the DCP's binary protocol. We propose the page-flow approach, which extends and externalizes the database buffer pool to the DCP so that the producer QE can put query results into the DCP as data pages (blocks) for the consumer QE to retrieve. In this way, the relational dataflow logically becomes a binary page-flow: the tuples contained in the transferred pages are exactly in the format required by the relational operators and can therefore be fed into queries directly without any conversion. Further, using pages as mini-batches of tuples improves the latency of DCP access. We have implemented this mechanism on a cluster of PostgreSQL engines. Our experimental results demonstrate its value.
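
The contrast between tuple-by-tuple exchange and page-flow can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `FakeDCP` class is a hypothetical in-memory stand-in for a real caching platform, and the 8192-byte page size, the two-byte tuple-count header, and the fixed-width `(int32, float64)` tuple layout are assumptions chosen only for illustration. In the approach described above, pages would already be in the engine's native buffer-pool format, so the unpacking step shown here exists only to make the sketch checkable.

```python
import struct

# Hypothetical fixed-width tuple layout: little-endian int32 id + float64 value.
TUPLE_FMT = "<id"
TUPLE_SIZE = struct.calcsize(TUPLE_FMT)
HEADER_FMT = "<H"  # assumed page header: number of tuples in the page
HEADER_SIZE = struct.calcsize(HEADER_FMT)
PAGE_SIZE = 8192  # assumed page size in bytes
TUPLES_PER_PAGE = (PAGE_SIZE - HEADER_SIZE) // TUPLE_SIZE


class FakeDCP:
    """In-memory stand-in for a distributed cache; counts every access."""

    def __init__(self):
        self.store = {}
        self.accesses = 0

    def put(self, key, value):
        self.accesses += 1
        self.store[key] = value

    def get(self, key):
        self.accesses += 1
        return self.store[key]


def produce_pages(dcp, flow_id, rows):
    """Producer side: pack rows into binary pages, one cache put per page."""
    page_no = 0
    for start in range(0, len(rows), TUPLES_PER_PAGE):
        chunk = rows[start:start + TUPLES_PER_PAGE]
        body = b"".join(struct.pack(TUPLE_FMT, *row) for row in chunk)
        page = struct.pack(HEADER_FMT, len(chunk)) + body
        dcp.put(f"{flow_id}:{page_no}", page)
        page_no += 1
    return page_no  # number of pages written


def consume_pages(dcp, flow_id, n_pages):
    """Consumer side: one cache get per page, then iterate its tuples."""
    for page_no in range(n_pages):
        page = dcp.get(f"{flow_id}:{page_no}")
        (count,) = struct.unpack_from(HEADER_FMT, page, 0)
        for i in range(count):
            offset = HEADER_SIZE + i * TUPLE_SIZE
            yield struct.unpack_from(TUPLE_FMT, page, offset)


if __name__ == "__main__":
    rows = [(i, i * 0.5) for i in range(2000)]
    dcp = FakeDCP()
    n_pages = produce_pages(dcp, "q1", rows)
    received = list(consume_pages(dcp, "q1", n_pages))
    print("pages:", n_pages, "cache accesses:", dcp.accesses)
    print("tuple-by-tuple would need:", 2 * len(rows), "accesses")
    print("round trip intact:", received == rows)
```

Under these assumptions, each cache access carries several hundred tuples instead of one, which is the mini-batching effect the abstract refers to.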

Author information

Authors and Affiliations

HP Labs, Palo Alto, CA, USA
Qiming Chen, Meichun Hsu & Ren Wu

Corresponding author

Correspondence to Qiming Chen.

Editor information

Editors and Affiliations

Software Engineering & Information Technology Institute, Central Michigan University, Mt. Pleasant, 48859, USA
Roger Lee

About this paper

Cite this paper

Chen, Q., Hsu, M., Wu, R. (2013). Page-Flow in Query Engine Grid. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2012. Studies in Computational Intelligence, vol 443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32172-6_9

DOI: https://doi.org/10.1007/978-3-642-32172-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32171-9
Online ISBN: 978-3-642-32172-6
eBook Packages: Engineering, Engineering (R0)