Using symbol clustering to improve probabilistic automaton inference

  • Conference paper
Grammatical Inference (ICGI 1998)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 1433))

Abstract

In this paper we show that clustering alphabet symbols before inferring a probabilistic deterministic finite automaton (PDFA) reduces perplexity on new data. This result is especially important in real tasks, such as spoken language interfaces, in which data sparseness is a significant issue. We describe the application of the ALERGIA algorithm, combined with an independent clustering technique, to the Air Travel Information System (ATIS) task. A 25% reduction in perplexity was obtained. This result outperforms a trigram model under the same simple smoothing scheme.
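
The perplexity figure quoted in the abstract can be made concrete with a small sketch. It assumes the standard definition of perplexity as the exponential of the average negative log-probability per token, and it reads the 25% figure as a relative reduction; both are assumptions, since the evaluation protocol is not given here. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def perplexity(token_probs):
    """Perplexity of a test sequence, given the probability a model
    assigned to each token: exp(-(1/N) * sum(log p_i))."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


def relative_reduction(baseline_pp, new_pp):
    """Fraction by which new_pp is lower than baseline_pp."""
    return (baseline_pp - new_pp) / baseline_pp


if __name__ == "__main__":
    # Hypothetical per-token probabilities, invented for illustration only.
    without_clustering = [0.02, 0.05, 0.01, 0.04]
    with_clustering = [0.03, 0.06, 0.015, 0.05]
    pp_a = perplexity(without_clustering)
    pp_b = perplexity(with_clustering)
    print(f"baseline perplexity:  {pp_a:.2f}")
    print(f"clustered perplexity: {pp_b:.2f}")
    print(f"relative reduction:   {relative_reduction(pp_a, pp_b):.1%}")
```

A model that spreads probability uniformly over four symbols has perplexity 4 under this definition, which is a convenient sanity check for the function.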

Editor information

Vasant Honavar Giora Slutzki

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dupont, P., Chase, L. (1998). Using symbol clustering to improve probabilistic automaton inference. In: Honavar, V., Slutzki, G. (eds) Grammatical Inference. ICGI 1998. Lecture Notes in Computer Science, vol 1433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054079

  • DOI: https://doi.org/10.1007/BFb0054079

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64776-8

  • Online ISBN: 978-3-540-68707-8

  • eBook Packages: Springer Book Archive
