Algorithmic Languages and Calculi pp 382-414 | Cite as

# The Formal Reconstruction and Speedup Of The Linear Time Fragment Of Willard’s Relational Calculus Subset

## Abstract

We demonstrate how several programming language concepts and methods can be used economically to obtain an improved solution to a difficult algorithmic problem. The problem is to compile a subset RCS of Relational Calculus defined by Willard (1978) in a novel way so that efficient run-time query performance is guaranteed. Willard gives an algorithm to compile each query *q* belonging to RCS so that it executes in *O*(*n* log^{ d }*n* + *o*) steps and *O*(*n*) space, where *n* and *o* are respectively the input and output set sizes, and *d* is a parameter associated with the syntax of query *q*. Willard’s time bounds are based on the assumption that hashing unit-space data takes unit time.

In this paper we use a set-theoretic complexity measure and formal transformational techniques to reconstruct the linear time fragment of RCS in a simplified way. In doing this, we show how complexity can be determined by language abstraction and algebraic reasoning without resorting to low level counting arguments. This approach shortens Willard’s proofs considerably, and facilitates an implementation. Finally, we show that the implementation can be typed in a restricted form of a set-theoretic type system based on Cai *et al*. (1991) and Keller and Paige (1995), which guarantees that each hash operation used to implement RCS can be simulated in real-time on a pointer machine model of computation. This improves Willard’s linear expected time result to linear worst case time, and demonstrates for the first time how type theory can be used as a tool to obtain an order of magnitude algorithmic speedup.

## Keywords

Inference Rule Query Graph Quantifier Elimination Primitive Operation Pointer Machine## References

- Anderson, H. R. (1994). Model checking and boolean graphs.
*Theoretical Computer Science*, 126 (1): 3–30.MathSciNetCrossRefGoogle Scholar - Arnold, A. and Crubille, P. (1988). A linear algorithm to solve fixed point equations on transition systems. In
*Information Processing Letters*, volume 29, pages 57–66.Google Scholar - Cai, J., Facon, P., Henglein, F., Paige, R. and Schonberg, E. (1991). Type transformation and data structure choice. In B. Moeller, editor,
*Constructing Programs From Specifications*, pages 126–164. North-Holland, Amsterdam.Google Scholar - Cai, J. and Paige, R. (1987). Binding performance at language design time. In
*Proc. Fourteenth ACM Symp. on Principles of Programming Languages*, pages 85–97.Google Scholar - Cai, J. and Paige, R. (1992). Languages polynomial in the input plus output. In M. Nivat, C. Rattray, T. Rus, and G. Scollo, editors,
*Algebraic Methodology and Software Technology*, Workshops in Computing, pages 287–302. Springer-Verlag. Conference Record of the Second AMAST.Google Scholar - Cai, J. and Paige, R. (1995). Using multiset discrimination to solve language processing problems without hashing.
*Theoretical Computer Science*, 145(1–2):189–228. URL http://cs.nyu.edu/cs/faculty/paige/papers/hash.ps.zbMATHMathSciNetCrossRefGoogle Scholar - Cleavaland, R. and Steffen, B. (1992). A linear-time model-checking algorithm for the alternation-free modal-calculus. Technical report, Computer Science Dept., North Carolina University.Google Scholar
- Codd, E. F. (1970). A relational model of data for large shared data banks.
*CACM*, 13 (6): 377–387.zbMATHCrossRefGoogle Scholar - Hunt-III, H. B., Szymanski, T. G., and Ullman, J. D. (1977). Operations on sparse relations.
*CACM*, 20 (3): 127–132.CrossRefGoogle Scholar - Keller, J. and Paige, R. (1995). Program derivation with verified transformations — a case study.
*CPAM*, 48 (9–10).Google Scholar - Knuth, D. E. (1973).
*The Art of Computer Programming, Vol 1: Fundamental Algorithms*. Addison-Wesley.Google Scholar - Nielson, H. R. (1984).
*Hoare Logic’s for Run-time Analysis of Programs*. PhD thesis, University of Edinburgh, Scotland.Google Scholar - Nielson, H. R. (1987). A Hoare-like Proof System for Run-time Analysis of Programs.
*Science of Computer Programming*, 9.Google Scholar - Paige, R. (1989). Real-time simulation of a set machine on a ram. In N. Janicki and W. Koczkodaj, editors,
*Computing and Information*, volume II, pages 69–73. Canadian Scholars’ Press, Toronto.Google Scholar - Paige, R. (1994). Efficient translation of external input in a dynamically typed language. In B. Pehrson and I. Simon, editors,
*Technology and Foundations — Information Processing 94*, volume 1 of*IFIP Transactions A-51*, pages 603–608. North-Holland, Amsterdam. Conference Record of IFIP Congress 94.Google Scholar - Paige, R. and Yang, Z. (1997). High level reading and data structure compilation. 24th Annual ACM SIGPLAN-SIGACT Symposium on Principle of Programming Languages, pages 456–469.CrossRefGoogle Scholar
- Schwartz, J. (1974).
*On Programming: An Interim Report on the SETL Project, Installments I and II*. New York University, New York.Google Scholar - Schwartz, J. (1975a). Automatic data structure choice in a language of very high level.
*CACM*, 18 (12): 722–728.zbMATHCrossRefGoogle Scholar - Schwartz, J. (1975b). Optimization of very high level languages, parts i, ii.
*J. of Computer Languages*, 1 (2, 3): 161–218.zbMATHCrossRefGoogle Scholar - Schwartz, J., Dewar, R., Dubinsky, E. and Schonberg, E. (1986).
*Programming with Sets: An Introduction to SETL*. Springer-Verlag, New York.zbMATHCrossRefGoogle Scholar - Snyder, K. (1990). The SETL2 programming language. Technical Report 490, Courant Insititute, New York University.Google Scholar
- Suppes, P. (1972).
*Axiomatic Set Theory*. Dover.zbMATHGoogle Scholar - Tarjan, R. (1979). A class of algorithms which require nonlinear time to maintain disjoint sets.
*Journal of computer and System Sciences*, 18: 110–127.zbMATHMathSciNetCrossRefGoogle Scholar - Willard, D. E. (1978).
*Predicate Oriented Database Search Algorithms*. PhD thesis, Harvard.Google Scholar - Willard, D. E. (1983). Predicate retrieval theory. Technical Report 83–3, SUNY Albany.Google Scholar
- Willard, D. E. (1990). Quasi-linear algorithms for processing relational data base expressions. In
*Proceedings of the*9^{th}*ACM SIGACT-SIGMODSIGART Symposium on Principles of Database Systems*, pages 243–257.Google Scholar - Willard, D. E. (1996). Applications of range query theory to relational data base join and selection operations.
*J. Computer and System Sci*., 52: 157–169.zbMATHMathSciNetCrossRefGoogle Scholar