Skip to main content

Very Predictive Ngrams for Space-Limited Probabilistic Models

  • Conference paper
  • 1654 Accesses

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2810))

Abstract

In sequential prediction tasks, one repeatedly tries to predict the next element in a sequence. A classical way to solve these problems is to fit an order-n Markov model to the data, but fixed-order models are often bigger than they need to be. In a fixed-order model, all predictors are of length n, even if a shorter predictor would work just as well. We present a greedy algorithm, vpr, for finding variable-length predictive rules. Although vpr is not optimal, we show that on English text, it performs similarly to fixed-order models but uses fewer parameters.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Apostolico, A., Bejerano, G.: Optimal amnesic probabilistic automata or how to learn and classify proteins in linear time and space. In: RECOMB, pp. 25–32 (2000)

    Google Scholar 

  2. Bejerano, G., Yona, G.: Variations on probabilistic suffix trees: statistical modeling and the prediction of protein families. Bioinformatics 17(1), 23–43 (2001)

    Article  Google Scholar 

  3. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley, Chichester (2001)

    MATH  Google Scholar 

  4. Laird, P., Saul, R.: Discrete sequence prediction and its applications. Machine Learning 15, 43–68 (1994)

    MATH  Google Scholar 

  5. Rabiner, L.: A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2), 257–285 (1989)

    Article  Google Scholar 

  6. Ron, D., Singer, Y., Tishby, N.: The power of amnesia: Learning probabilistic automata with variable memory length. Machine Learning 25(2-3), 117–149 (1996)

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cohen, P.R., Sutton, C.A. (2003). Very Predictive Ngrams for Space-Limited Probabilistic Models. In: R. Berthold, M., Lenz, HJ., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_13

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-45231-7_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40813-0

  • Online ISBN: 978-3-540-45231-7

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics