Abstract
This paper describes LE-SR (Learning English Syllabification Rules), the first machine learning program that learns English syllabification rules, i.e., rules that specify how to divide English words into syllables for pronunciation. LE-SR uses a unique knowledge representation called C-S-CL-SS, which effectively generalizes English graphemes. Given a 20,000-word on-line pronouncing dictionary, LE-SR learned 423 syllabification rules from 90% of the instances. These rules have a predictive accuracy of 90.35% on the unseen 10% of the instances.
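The evaluation protocol mentioned above (learn from 90% of the instances, measure accuracy on the held-out 10%) can be illustrated with a minimal sketch. This is not LE-SR and does not use the C-S-CL-SS representation. The function names, the toy word list with its hyphenated syllabifications, and the placeholder syllabifier are all assumptions made only for illustration.

```python
import random


def split_holdout(instances, train_fraction=0.9, seed=0):
    """Shuffle the instances and split them into (train, test) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def predictive_accuracy(syllabify, test_set):
    """Fraction of held-out words whose predicted syllabification
    matches the dictionary's syllabification exactly."""
    if not test_set:
        raise ValueError("test set is empty")
    correct = sum(1 for word, gold in test_set if syllabify(word) == gold)
    return correct / len(test_set)


# Hypothetical toy data: (word, syllabification) pairs, NOT the paper's dictionary.
toy_dictionary = [
    ("cat", "cat"), ("dog", "dog"), ("window", "win-dow"),
    ("paper", "pa-per"), ("garden", "gar-den"), ("tree", "tree"),
    ("basket", "bas-ket"), ("lemon", "lem-on"), ("sun", "sun"),
    ("river", "riv-er"),
]


def placeholder_syllabify(word):
    """Stand-in for a learned rule set (assumption): never splits a word."""
    return word


train, test = split_holdout(toy_dictionary)
# A real learner would induce rules from `train` here; the placeholder ignores it.
accuracy = predictive_accuracy(placeholder_syllabify, test)
print(f"{len(train)} training instances, {len(test)} test instances, "
      f"accuracy on unseen instances: {accuracy:.2%}")
```

Scoring by exact match on the whole word is one possible choice; the abstract does not say whether LE-SR counts accuracy per word or per syllable boundary.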
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, J., Hamilton, H.J. (1998). Learning English syllabification rules. In: Mercer, R.E., Neufeld, E. (eds) Advances in Artificial Intelligence. Canadian AI 1998. Lecture Notes in Computer Science, vol 1418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64575-6_55
DOI: https://doi.org/10.1007/3-540-64575-6_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64575-7
Online ISBN: 978-3-540-69349-9
eBook Packages: Springer Book Archive