Hierarchical Encoding of Text: Technical Problems and SGML Solutions

  • Chapter
Text Encoding Initiative

Abstract

One recurring theme in the TEI project has been the need to represent non-hierarchical information in a natural way — or at least in a way that is acceptable to those who must use it — using a technical tool that assumes a single hierarchical representation. This paper proposes solutions to a variety of such problems: the encoding of segments which do not reflect a document’s primary hierarchy; relationships among non-adjacent segments of texts; ambiguous content; overlapping structures; parallel structures; cross-references; vague locations.

David T. Barnard is Professor of Computing and Information Science at Queen’s University. His research interests are in structured text processing and the compilation of programming languages. His recent publications include “Tree-to-tree Correction for Document Trees”, Queen’s Technical Report, and “Error Handling in a Parallel LR Substring Parser”, Computer Languages, 19,4 (1993) 247–59.

Lou Burnard is Director of the Oxford Text Archive at Oxford University Computing Services, with interests in electronic text and database technology. He is European Editor of the Text Encoding Initiative’s Guidelines.

Jean-Pierre Gaspart is with Associated Consultants and Software Engineers.

Lynne A. Price (Ph.D., computer sciences, University of Wisconsin-Madison) is a senior software engineer at Frame Technology Corp. Her main area of research has been representing text structure for automatic processing. She has served on both the US and international SGML standards committees for several years and is the editor of International Standard ISO/IEC 13673 on Conformance Testing for Standard Generalized Markup Language (SGML) Systems.

C. M. Sperberg-McQueen is a Senior Research Programmer at the academic computer center of the University of Illinois at Chicago; his interests include medieval Germanic languages and literatures and the theory of electronic text markup. Since 1988 he has been editor in chief of the ACH/ACL/ALLC Text Encoding Initiative.

Giovanni Battista Varile works for the Commission of the European Communities.

This paper is derived from a working paper of the Metalanguage Committee entitled “Notes on SGML Solutions to Markup Problems” which was produced following a meeting of the committee in Luxembourg. The co-authors all participated in that meeting and provided input to this paper. Others serving on the committee at other times included David Durand (Boston University), Nancy Ide (Vassar College) and Frank Tompa (University of Waterloo).

Notes

  1. Or more elaborately, the passage marked by straight lines in the left margin, the passage marked by wavy lines in the left margin, the passage underlined by hand with a simple straight line, the passage underlined by hand with a simple straight line which was later deleted by hand, etc., as in the transcriptions of Wittgenstein’s manuscripts in the Norwegian Wittgenstein project. See Claus Huitfeldt and Viggo Rossvoer, The Norwegian Wittgenstein Project Report 1988 ([Bergen]: NAVFs EDB-Senter for Humanistisk Forskning/Norwegian Centre for the Humanities, 1989), especially pp. 201–236.

Copyright information

© 1995 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Barnard, D.T., Burnard, L., Gaspart, JP., Price, L.A., Sperberg-McQueen, C.M., Varile, G.B. (1995). Hierarchical Encoding of Text: Technical Problems and SGML Solutions. In: Ide, N., Véronis, J. (eds) Text Encoding Initiative. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-0325-1_17

  • DOI: https://doi.org/10.1007/978-94-011-0325-1_17

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-0-7923-3704-1

  • Online ISBN: 978-94-011-0325-1

  • eBook Packages: Springer Book Archive
