Radiological reporting based on voice recognition
Speech recognition has proved to be a natural interaction modality and an effective technology for medical reporting, particularly in the speciality of radiology. The high volume of text to be produced and the complex structure of these texts make voice technologies useful. By using speech, professionals in the field can generate reports at a speed approaching that of traditional dictation methods.
However, integrating speech recognition into a user interface creates new problems: speech recognizers may introduce errors, and they must also adapt to variations in spoken language.
This paper describes a radiological reporting system and the motivations for using the speech modality. A preliminary evaluation of the system has shown that, even though text-recall functions and keyword shortcuts are available, on average more than two thirds of a radiological report is generated by dictation.
Keywords: Recognition Rate, Speech Recognition, Language Model, Automatic Speech Recognition, Radiological Report