A Structural Approach to Query Language Design

  • Peter Buneman
  • Val Tannen


This chapter is motivated by the question “are there any clean mathematical principles behind the design of query languages?” One can hardly blame the reader of various recent standards for asking this question. The authors try to sketch what such a mathematical framework could be. One of the classifying principles they use extensively is that of languages being organised around type systems, with language primitives corresponding to constructors and deconstructors for each type. There is some value in casting the concepts in as general a form as possible; hence the use of the language of category theory for describing them. Once the semantic framework is discussed, the chapter presents a calculus, itself a language, that could be (and was) used as an internal representation for various user languages. The discussion is relevant to all kinds of data models: relational, object-relational, object-oriented, and semi-structured.


Structural Approach, Natural Transformation, Query Language, Category Theory, Relational Algebra
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Peter Buneman (1)
  • Val Tannen (2)
  1. School of Informatics, University of Edinburgh, Edinburgh, UK
  2. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA
