A Probabilistic Line Breaking Algorithm

  • Conference paper
AI 2003: Advances in Artificial Intelligence (AI 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2903)

Abstract

We show how a probabilistic interpretation of an ill-defined problem, that of finding line breaks in a paragraph, can lead to an efficient new algorithm that performs well. The graphical model that results from this interpretation is easy to tune. Furthermore, the algorithm optimizes the probability that a break-up is acceptable over the whole paragraph, does not show threshold effects, and allows for easy incorporation of subtle typographical rules. Thanks to the architecture of the Bayesian network, the algorithm is linear in the number of characters in a paragraph. Empirical evidence suggests that this algorithm comes closer to results published through desktop publishing than a number of existing systems do.
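
To make the idea of optimizing acceptability over the whole paragraph concrete, the following is a minimal sketch in Python. It is not the Bayesian network from the paper, whose structure is not given in the abstract. It assumes a hypothetical per-line acceptability score (an unnormalized Gaussian in the leftover space, with an assumed tuning parameter `sigma`), monospaced text measured in characters, and independence between lines, so that the paragraph score is the sum of per-line log-scores. The function names `line_log_prob` and `break_paragraph` are illustrative only.

```python
import math


def line_log_prob(natural_width, target_width, sigma=3.0):
    """Hypothetical acceptability model (an assumption, not the paper's):
    log of an unnormalized Gaussian score in the leftover space on a line."""
    if natural_width > target_width:
        return -math.inf  # an overfull line is never acceptable in this sketch
    slack = target_width - natural_width
    return -(slack ** 2) / (2 * sigma ** 2)


def break_paragraph(words, target_width, sigma=3.0):
    """Choose breaks that maximize the summed log-score over all lines."""
    n = len(words)
    # best[i] is the highest log-score for breaking words[:i] into full lines;
    # prev[i] is the index where the last of those lines starts.
    best = [-math.inf] * (n + 1)
    prev = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        width = -1  # cancels the space counted before the first word
        for start in range(end - 1, -1, -1):
            width += len(words[start]) + 1
            if width > target_width:
                break  # adding more words only makes the line wider
            if best[start] == -math.inf:
                continue
            # The final line of a paragraph is allowed to be short.
            if end == n:
                log_p = 0.0
            else:
                log_p = line_log_prob(width, target_width, sigma)
            score = best[start] + log_p
            if score > best[end]:
                best[end] = score
                prev[end] = start
    if best[n] == -math.inf:
        raise ValueError("a word is wider than the target line width")
    # Walk the back-pointers to recover the lines.
    lines = []
    i = n
    while i > 0:
        lines.append(" ".join(words[prev[i]:i]))
        i = prev[i]
    lines.reverse()
    return lines, best[n]


lines, log_score = break_paragraph("aaa bb cc ddddd".split(), target_width=6)
print(lines)  # ['aaa', 'bb cc', 'ddddd']
```

A greedy first-fit breaker would produce "aaa bb", "cc", "ddddd" here, leaving one very loose line; scoring the whole paragraph prefers two slightly loose lines instead. Because the score falls off smoothly with slack, a small change in width changes the score only slightly, except at the hard overfull cutoff that this sketch keeps for simplicity. For a fixed line width the inner loop visits a bounded number of words, so the running time of this sketch grows linearly with paragraph length.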



Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bouckaert, R.R. (2003). A Probabilistic Line Breaking Algorithm. In: Gedeon, T.D., Fung, L.C.C. (eds) AI 2003: Advances in Artificial Intelligence. AI 2003. Lecture Notes in Computer Science, vol 2903. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24581-0_33

  • DOI: https://doi.org/10.1007/978-3-540-24581-0_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20646-0

  • Online ISBN: 978-3-540-24581-0

  • eBook Packages: Springer Book Archive
