Encyclopedia of Database Systems

2018 Edition
| Editors: Ling Liu, M. Tamer Özsu

Semi-structured Data Model

  • Dan Suciu
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_337


Semi-structured data


The semi-structured data model is designed as an evolution of the relational data model that allows the representation of data with a flexible structure. Some items may have missing attributes, others may have extra attributes, and some may have two or more occurrences of the same attribute. The type of an attribute is also flexible: it may be an atomic value, another record, or a collection. Moreover, collections may be heterogeneous, i.e., they may contain items with different structures. The semi-structured data model is a self-describing data model, in which the data values and the schema components co-exist. Formally:

Definition 1

A semi-structured data instance is a rooted, directed graph in which the edges carry labels representing schema components, and leaf nodes (i.e., nodes without any outgoing edges) are labeled with data values (integers, reals, strings, etc.).
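As an illustration only, Definition 1 can be sketched in Python as a small edge-labeled graph. The names `Node`, `add`, `children`, and `build_example`, and the sample labels and values (`person`, `phone`, `company`, and so on), are hypothetical and do not come from the entry; the sketch assumes that a node carrying a data value never receives outgoing edges.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

# Assumption: atomic data values are integers, reals, or strings.
Atomic = Union[int, float, str]


@dataclass(eq=False)
class Node:
    """A graph node; only leaf nodes carry a data value."""
    value: Optional[Atomic] = None
    # Outgoing edges as (label, target) pairs. The label is the schema
    # component, so the schema is stored alongside the data.
    edges: List[Tuple[str, "Node"]] = field(default_factory=list)

    def add(self, label: str, target: "Node") -> "Node":
        """Attach a labeled edge from this node to target; return target."""
        if self.value is not None:
            raise ValueError("a node holding a data value must stay a leaf")
        self.edges.append((label, target))
        return target

    def is_leaf(self) -> bool:
        return not self.edges

    def children(self, label: str) -> List["Node"]:
        """All targets reachable over one edge with the given label."""
        return [target for (lbl, target) in self.edges if lbl == label]


def build_example() -> Node:
    """Build a small instance with deliberately irregular structure."""
    root = Node()

    alice = root.add("person", Node())
    alice.add("name", Node("Alice"))
    alice.add("phone", Node("555-0100"))
    alice.add("phone", Node("555-0101"))  # same attribute occurs twice

    bob = root.add("person", Node())
    bob.add("name", Node("Bob"))          # no phone: a missing attribute
    address = bob.add("address", Node())  # attribute valued by a nested record
    address.add("city", Node("Seattle"))
    address.add("zip", Node(98195))

    # The collection under the root is heterogeneous: not every item is a person.
    company = root.add("company", Node())
    company.add("name", Node("Acme"))
    return root


root = build_example()
for person in root.children("person"):
    name = person.children("name")[0].value
    phones = [p.value for p in person.children("phone")]
    print(name, phones)
```

Because the targets are ordinary object references, the same `Node` can be the target of several edges, so the sketch is not restricted to trees. No separate schema is declared anywhere: the set of labels found on the edges is the only structural description, which is what makes the instance self-describing.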

There are two variations of semi-structured data, depending on...



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. University of Washington, Seattle, USA

Section editors and affiliations

  • Frank Tompa (1)
  1. David R. Cheriton School of Computer Science, Univ. of Waterloo, Waterloo, Canada