Abstract
In many domains, automated speech recognition (ASR) demands highly robust and accurate recognition software. In such domains even a recognizer that is 99% accurate is inadequate, so other methods for increasing the reliability and performance of ASR must be considered. One possible solution is post-recognition error detection, which flags likely recognition errors so that proofreading becomes more efficient. To this end, we have developed a multi-heuristic algorithm that uses natural language processing to detect recognition errors. As a proof of concept, we have applied this algorithm to the radiology domain. The results are encouraging: a 22% increase in recall and a 6% increase in precision over the best individual technique.
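To make the recall and precision figures concrete, the following minimal sketch shows how the two measures could be computed for an error detector that flags word positions in a transcript. The function names, the example positions, and the union-based combination are all assumptions made for illustration; the abstract does not specify how the individual heuristics are combined, and this is not the paper's algorithm.

```python
def precision_recall(flagged, actual_errors):
    """Return (precision, recall) for a set of flagged word positions.

    flagged:       positions a detector marked as suspected errors
    actual_errors: positions that really are recognition errors
    """
    flagged, actual_errors = set(flagged), set(actual_errors)
    true_positives = len(flagged & actual_errors)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual_errors) if actual_errors else 0.0
    return precision, recall


def combine_by_union(*detector_outputs):
    """Hypothetical combination rule: flag a word if any detector flags it."""
    return set().union(*detector_outputs)


# Made-up example data: word positions in one transcript.
actual = {2, 5, 9, 12}
detector_a = {2, 5, 7}        # hypothetical heuristic A
detector_b = {9, 12, 14}      # hypothetical heuristic B

print(precision_recall(detector_a, actual))
print(precision_recall(combine_by_union(detector_a, detector_b), actual))
```

A plain union like this can only raise recall and will often lower precision, so a hybrid that improves both measures, as reported above, must combine its heuristics more selectively than this sketch does.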
© 2007 Springer Berlin Heidelberg
Cite this paper
Voll, K. (2007). A Hybrid Approach to Improving Automatic Speech Recognition Via NLP. In: Kobti, Z., Wu, D. (eds) Advances in Artificial Intelligence. Canadian AI 2007. Lecture Notes in Computer Science(), vol 4509. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72665-4_44
DOI: https://doi.org/10.1007/978-3-540-72665-4_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72664-7
Online ISBN: 978-3-540-72665-4
eBook Packages: Computer Science, Computer Science (R0)