Document Databases

Synonyms

Corpora; Document repositories; Text databases

Definition

A document database is a collection of stored texts managed by a system that provides query and update facilities. Usually the database includes many documents related by their subject matter, origin, or applicability to an enterprise. The content of each document may be free text, semi-structured text including a few well-identified fields (e.g., title, author, date), or highly structured tagged text such as might be encoded using XML. Occasionally documents may also contain multimedia components.
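As a rough illustration of the definition above, the following minimal Python sketch models a document database as an in-memory collection of semi-structured documents (a few named fields plus a free-text body) with simple update and query facilities. The names `Document`, `DocumentDatabase`, `insert`, `update`, `find_by_field`, and `find_by_word` are hypothetical, chosen only for this example; the sketch does not describe any particular system and ignores persistence, indexing, and tagged (e.g., XML) structure.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    """One stored text: a few identified fields plus a free-text body."""
    doc_id: int
    fields: dict = field(default_factory=dict)  # e.g. title, author, date
    body: str = ""


class DocumentDatabase:
    """Toy in-memory collection offering update and query facilities."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, body, **fields):
        """Store a new document and return its identifier."""
        doc = Document(self._next_id, dict(fields), body)
        self._docs[doc.doc_id] = doc
        self._next_id += 1
        return doc.doc_id

    def update(self, doc_id, body=None, **fields):
        """Replace the body and/or overwrite selected fields."""
        doc = self._docs[doc_id]
        if body is not None:
            doc.body = body
        doc.fields.update(fields)

    def find_by_field(self, name, value):
        """Query on a well-identified field (exact match)."""
        return [d.doc_id for d in self._docs.values()
                if d.fields.get(name) == value]

    def find_by_word(self, word):
        """Query on free text: case-insensitive whole-word match in the body."""
        target = word.lower()
        return [d.doc_id for d in self._docs.values()
                if target in re.findall(r"\w+", d.body.lower())]


db = DocumentDatabase()
first = db.insert("Quarterly sales rose in the northern region.",
                  title="Sales report", author="Lee")
second = db.insert("Minutes of the March planning meeting.",
                   title="Minutes", author="Cho")
db.update(second, author="Lee")  # an update changes a stored field
```

Here `db.find_by_field("author", "Lee")` matches on a named field, while `db.find_by_word("sales")` searches only the free-text body. A realistic system would typically replace the linear scans with indexes.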

In contrast, the term corpus (plural corpora) typically refers to a static collection of texts that have been assembled by experts to study linguistic phenomena (e.g., the Brown Corpus, created in 1964 to study American English, and the Swedish Language Bank) or to provide a rich source of text for lexicographic needs (e.g., the Dictionary of Old English Corpus, including all extant texts written in Old English in the period...

Correspondence to Frank Tompa.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Tompa, F. (2018). Document Databases. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_807
