Abstract
This paper raises an important question: can a small language model be practically accurate enough? It then analyzes the purpose of a language model, the problems a language model faces, and the factors that affect its performance. Finally, a novel language model compression method is proposed that makes a large language model usable for applications in handheld devices such as mobile phones, smart phones, personal digital assistants (PDAs), and handheld personal computers (HPCs). The proposed compression method comprises three aspects. First, the language model parameters are analyzed, and a criterion based on an importance measure of n-grams is used to determine which n-grams should be kept and which removed. Second, a piecewise linear warping method is proposed to compress the uni-gram count values in the full language model. Third, a rank-based quantization method is adopted to quantize the bi-gram probability values. Experiments show that with this compression method the language model can be reduced dramatically to only about 1M bytes while performance hardly decreases. This provides good evidence that a language model compressed by a well-designed compression technique can be practically accurate enough, making language models usable in handheld devices.
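The rank-based quantization mentioned above can be illustrated with a minimal sketch. The idea, shown here under assumptions (the paper's exact bucket boundaries and codebook construction may differ), is to sort the bi-gram probability values, partition the ranked list into 2^b equal-size buckets, and then store only a b-bit code per bi-gram plus a tiny codebook of per-bucket representative values:

```python
import numpy as np

def rank_quantize(probs, bits=4):
    """Illustrative rank-based quantization sketch (not the paper's exact
    scheme): sort values by rank, split the ranked list into 2**bits
    equal-size buckets, and represent each value by the mean of its bucket.
    Returns (codes, codebook); only the small integer codes plus the
    codebook need to be stored."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)                     # indices, low to high
    n_levels = 2 ** bits
    # assign each rank position to one of n_levels equal-size buckets
    bucket_of_rank = (np.arange(len(probs)) * n_levels) // len(probs)
    codes = np.empty(len(probs), dtype=np.uint8)
    codes[order] = bucket_of_rank
    # codebook entry = mean of the original values mapped to that code
    codebook = np.array([probs[codes == c].mean() if np.any(codes == c) else 0.0
                         for c in range(n_levels)])
    return codes, codebook

probs = np.array([0.001, 0.02, 0.5, 0.03, 0.0005, 0.1, 0.2, 0.004])
codes, codebook = rank_quantize(probs, bits=2)
restored = codebook[codes]  # dequantized approximation of probs
```

Replacing a 32-bit float per bi-gram with a 2- or 4-bit code is what yields most of the size reduction; the codebook itself is negligibly small.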
Author information
Additional information
The author is currently also with Beijing d-Ear Technologies Co., Ltd., fzeng@d-Ear.com.
WU GenQing is currently a Ph.D. candidate at the Center of Speech Technology, the State Key Laboratory of Intelligent Technology and Systems, Tsinghua University. He received his B.S. degree in computer science and technology from the Department of Computer Science and Technology, Tsinghua University, in 1999. He now focuses on language modeling for speech recognition. His current research interests include language modeling, language model adaptation, and language model compression techniques.
ZHENG Fang is currently an associate professor at Tsinghua University. He is the Director of the Center of Speech Technology, the State Key Laboratory of Intelligent Technology and Systems. Dr. Zheng graduated from the Department of Computer Science and Technology of Tsinghua University and received his B.S., M.S. and Ph.D. degrees from Tsinghua University in 1990, 1992 and 1997, respectively. He has been working on speech recognition and understanding at the Department of Computer Science and Technology, Tsinghua University, since 1988, and is now with the State Key Laboratory of Intelligent Technology and Systems. He has published over 110 technical papers on acoustic/language modeling, isolated/continuous speech recognition, keyword spotting, dictation, language understanding, and related topics. He is an IEEE member, an ISCA member, a member of the Artificial Intelligence and Pattern Recognition Technical Commission of the China Computer Federation, a member of the Editorial Committee of the Journal of Chinese Information Processing, and a key member of Oriental-COCOSDA. He serves as a reviewer for several domestic and international journals. Recently, he has been the General Chair of Oriental-COCOSDA'2003, a member of the Scientific Committee of the ISCA Tutorial and Research Workshop (ITR-Workshop) on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology 2002, and a member of the Technical Committee and International Advisory Committee of the Joint International Conference of the Fifth Symposium on Natural Language Processing (CNLP) and '2002 Oriental COCOSDA Workshop (SNLP-O-COCOSDA).
Cite this article
Wu, G., Zheng, F. A method to build a super small but practically accurate language model for handheld devices. J. Comput. Sci. & Technol. 18, 747–755 (2003). https://doi.org/10.1007/BF02945463