
Assembling documents from digital libraries

  • Digital Libraries
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1308)

Abstract

We consider assembling documents using, as a source, a digital library containing SGML documents. The assembly process contains two parts: 1) finding interesting fragments, and 2) constructing a coherent document. We present a general document assembly framework. First, we describe a system for tailoring control engineering textbooks. Its assembling facilities are rather restricted but, on the other hand, the quality of documents produced is high. Second, we address the problem of filtering and combining interesting information from a large heterogeneous document collection. The methods presented offer various ways to find the interesting document fragments. Moreover, the elements found in the fragments are mapped to generic elements, like sections, paragraph containers, paragraphs and strings, which have known semantics. Hence, even arbitrary compositions can be formatted and printed.
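The two-part process described above (find fragments, then build a coherent document from generic elements) can be illustrated with a minimal sketch. This is not the authors' system: the `Node` class, the keyword-based `find_fragments` filter, and the `ELEMENT_MAP` table are all hypothetical stand-ins, and only the four generic element names come from the abstract.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from source-specific element names to the generic
# elements named in the abstract. Real DTDs would need their own tables.
ELEMENT_MAP = {
    "chapter": "section",
    "sect1": "section",
    "list": "paragraph_container",
    "para": "paragraph",
    "abstract": "paragraph",
    "emph": "string",
}


@dataclass
class Node:
    """A simplified stand-in for a parsed SGML element."""
    name: str
    text: str = ""
    children: list = field(default_factory=list)


def find_fragments(node, keyword):
    """Step 1 (assumed filter): collect the topmost nodes whose text mentions keyword."""
    if keyword.lower() in node.text.lower():
        return [node]
    hits = []
    for child in node.children:
        hits.extend(find_fragments(child, keyword))
    return hits


def to_generic(node, default="string"):
    """Rename each element to its generic counterpart; unknown names fall back to default."""
    return Node(
        ELEMENT_MAP.get(node.name, default),
        node.text,
        [to_generic(child, default) for child in node.children],
    )


def assemble(fragments, title):
    """Step 2: wrap the mapped fragments in one generic section."""
    return Node("section", title, [to_generic(f) for f in fragments])


def render(node, depth=0):
    """Format a generic tree as indented lines; only generic names need handling here."""
    lines = [("  " * depth + f"[{node.name}] {node.text}").rstrip()]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Two source documents with different element vocabularies (invented sample data).
library = Node("collection", "", [
    Node("chapter", "Feedback control", [
        Node("para", "A PID controller combines three terms."),
        Node("para", "Stability is analysed separately."),
    ]),
    Node("article", "", [
        Node("abstract", "Tuning a PID controller in practice."),
    ]),
])

doc = assemble(find_fragments(library, "PID"), "PID controllers")
print("\n".join(render(doc)))
```

Because the formatter only ever sees generic element names, fragments drawn from differently structured sources can be combined and printed with the same code path, which is the point the abstract makes about arbitrary compositions.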

This work was supported by the Finnish Technology Development Centre (TEKES).




Editor information

Abdelkader Hameurlain, A Min Tjoa


Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ahonen, H., Heikkinen, B., Heinonen, O., Kilpeläinen, P. (1997). Assembling documents from digital libraries. In: Hameurlain, A., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 1997. Lecture Notes in Computer Science, vol 1308. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0022051


  • DOI: https://doi.org/10.1007/BFb0022051


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63478-2

  • Online ISBN: 978-3-540-69580-6

  • eBook Packages: Springer Book Archive
