Abstract
A system that records daily conversations is one of the most useful types of lifelog. It is, however, not widely used because speech recognizers have low precision when applied to conversational speech. To address this problem, we propose a method that uses a topic model to reduce the number of incorrectly recognized words. Specifically, we measure the relevancy between each term and the other words in the conversation and remove the terms whose relevancy falls below a threshold. We implemented an audio lifelog search system using this method. Experiments showed that the method is effective in compensating for the recognition errors of speech recognizers: both precision and recall increased. The results indicate that the method can reduce errors in the index of a lifelog search system.
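The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the relevancy formula, so everything below is an assumption made for illustration: the toy topic-word probabilities in `TOPIC_WORD`, the uniform topic prior, the smoothing constant `EPS`, the definition of relevancy as the probability of a term under the topic posterior inferred from the other words, and the function names. It is not the authors' implementation.

```python
import math

# Hypothetical toy topic model: P(word | topic) for two hand-made topics.
# A real system would estimate these from a corpus (for example with LDA).
TOPIC_WORD = {
    "cooking": {"recipe": 0.4, "oven": 0.3, "salt": 0.25, "engine": 0.05},
    "cars": {"recipe": 0.05, "oven": 0.05, "salt": 0.1, "engine": 0.8},
}
EPS = 1e-6  # assumed smoothing value for words missing from a topic


def topic_posterior(context):
    """Return P(topic | context) under a uniform prior (an assumption)."""
    log_scores = {
        topic: sum(math.log(dist.get(word, EPS)) for word in context)
        for topic, dist in TOPIC_WORD.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_scores.values())
    unnormalized = {t: math.exp(s - peak) for t, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {t: v / total for t, v in unnormalized.items()}


def relevancy(term, context):
    """Assumed relevancy: P(term | topics inferred from the other words)."""
    posterior = topic_posterior(context)
    return sum(posterior[t] * TOPIC_WORD[t].get(term, EPS) for t in TOPIC_WORD)


def filter_terms(words, threshold):
    """Keep only the words whose relevancy to the remaining words meets the threshold."""
    kept = []
    for i, word in enumerate(words):
        others = words[:i] + words[i + 1:]
        if relevancy(word, others) >= threshold:
            kept.append(word)
    return kept


if __name__ == "__main__":
    recognized = ["recipe", "oven", "engine", "salt"]
    print(filter_terms(recognized, threshold=0.1))
```

With these toy numbers, "engine" scores about 0.06 against the cooking-dominated context and is dropped, while the other three words score above 0.1 and are kept. The threshold of 0.1 is arbitrary; in practice it would be tuned against precision and recall on held-out transcripts.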
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tezuka, T., Maeda, A. (2011). Audio Lifelog Search System Using a Topic Model for Reducing Recognition Errors. In: Yu, J.X., Kim, M.H., Unland, R. (eds) Database Systems for Advanced Applications. DASFAA 2011. Lecture Notes in Computer Science, vol 6588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20152-3_6
DOI: https://doi.org/10.1007/978-3-642-20152-3_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20151-6
Online ISBN: 978-3-642-20152-3
eBook Packages: Computer Science (R0)