Abstract
In many data processing tasks, declarative query programming offers substantial benefit over manual data analysis: the query processors found in declarative systems can use powerful algorithms such as query planning to choose high-level execution strategies during compilation. However, the principal downside of such languages is that their primitives must be carefully curated, to allow the query planner to correctly estimate their overhead. In this paper, we examine this challenge in one such system, PQL/Java. PQL/Java adds a powerful declarative query language to Java to enable and automatically parallelise queries over the Java heap. In the past, the language has not provided any support for custom user-designed datatypes, as such support requires complex interactions with its query planner and backend.
We examine PQL/Java and its intermediate language in detail and describe a new system that simplifies PQL/Java extensions. This system provides a language that permits users to add new primitives with arbitrary Java computations, and new rewriting rules for optimisation. Our system automatically stages compilation and exploits constant information for dead code elimination and type specialisation. We have rewritten our PQL/Java backend in our extension language, enabling dynamic and staged compilation.
We demonstrate the effectiveness of our extension language in several case studies, including the efficient integration of SQL queries, and by analysing the run-time performance of our rewritten prototype backend.
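To make the idea of exploiting constant information during staged compilation more concrete, the following is a minimal sketch in plain Java. It is not PQL/Java's extension language or intermediate representation: the names `StagingSketch`, `Expr`, `Const`, `Input`, `Add`, `IfPositive`, `simplify` and `stage` are all hypothetical and chosen only for illustration. The sketch applies a rewriting rule that folds constants and discards a branch whose condition is known, then turns the simplified tree into an executable closure.

```java
import java.util.function.IntUnaryOperator;

public class StagingSketch {
    // Hypothetical mini-IR for illustration only; not PQL/Java's intermediate language.
    interface Expr {}

    static final class Const implements Expr {
        final int value;
        Const(int value) { this.value = value; }
    }

    // Stands for the single run-time input of the computation.
    static final class Input implements Expr {}

    static final class Add implements Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Evaluates thenBranch if cond > 0, otherwise elseBranch.
    static final class IfPositive implements Expr {
        final Expr cond, thenBranch, elseBranch;
        IfPositive(Expr cond, Expr thenBranch, Expr elseBranch) {
            this.cond = cond; this.thenBranch = thenBranch; this.elseBranch = elseBranch;
        }
    }

    // Rewriting step: fold constant additions and drop the branch that a
    // constant condition can never select (a simple form of dead code elimination).
    static Expr simplify(Expr e) {
        if (e instanceof Add) {
            Add a = (Add) e;
            Expr l = simplify(a.left);
            Expr r = simplify(a.right);
            if (l instanceof Const && r instanceof Const) {
                return new Const(((Const) l).value + ((Const) r).value);
            }
            return new Add(l, r);
        }
        if (e instanceof IfPositive) {
            IfPositive c = (IfPositive) e;
            Expr cond = simplify(c.cond);
            if (cond instanceof Const) {
                return simplify(((Const) cond).value > 0 ? c.thenBranch : c.elseBranch);
            }
            return new IfPositive(cond, simplify(c.thenBranch), simplify(c.elseBranch));
        }
        return e;
    }

    // Staging step: translate the tree into a closure once, so that later
    // calls no longer inspect the tree.
    static IntUnaryOperator stage(Expr e) {
        if (e instanceof Const) {
            final int v = ((Const) e).value;
            return x -> v;
        }
        if (e instanceof Input) {
            return x -> x;
        }
        if (e instanceof Add) {
            IntUnaryOperator l = stage(((Add) e).left);
            IntUnaryOperator r = stage(((Add) e).right);
            return x -> l.applyAsInt(x) + r.applyAsInt(x);
        }
        if (e instanceof IfPositive) {
            IfPositive c = (IfPositive) e;
            IntUnaryOperator cond = stage(c.cond);
            IntUnaryOperator t = stage(c.thenBranch);
            IntUnaryOperator f = stage(c.elseBranch);
            return x -> cond.applyAsInt(x) > 0 ? t.applyAsInt(x) : f.applyAsInt(x);
        }
        throw new IllegalArgumentException("unknown node: " + e);
    }

    public static void main(String[] args) {
        // if (2 + 3 > 0) then input + (1 + 1) else input
        Expr query = new IfPositive(
                new Add(new Const(2), new Const(3)),
                new Add(new Input(), new Add(new Const(1), new Const(1))),
                new Input());
        Expr simplified = simplify(query);   // the conditional disappears entirely
        IntUnaryOperator compiled = stage(simplified);
        System.out.println(compiled.applyAsInt(40)); // prints 42
    }
}
```

In this toy setting the condition `2 + 3` is constant, so the conditional and its unused branch are removed before any executable code is produced. A real backend would emit specialised bytecode rather than closures; the closure-based staging here is a stand-in chosen to keep the example self-contained.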
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ackermann, H., Reichenbach, C., Müller, C., Smaragdakis, Y. (2015). A Backend Extension Mechanism for PQL/Java with Free Run-Time Optimisation. In: Franke, B. (eds) Compiler Construction. CC 2015. Lecture Notes in Computer Science(), vol 9031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46663-6_6
DOI: https://doi.org/10.1007/978-3-662-46663-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46662-9
Online ISBN: 978-3-662-46663-6
eBook Packages: Computer Science (R0)