Abstract
The need for large databases for both training and testing automatic speech recognition and understanding systems is well known. This paper presents the results of a first collection of task-oriented spontaneous speech corpora carried out within the MAIA project, under development at IRST. About 2000 sentences were acquired from 50 subjects in two scenarios of human-machine spoken interaction: a telecontrol station for a mobile robot and an information query system. Both systems were simulated by means of the well-known “Wizard of Oz” technique. The paper focuses on the methodological issues of this approach, highlighting important points that must be considered in the design of simulations, together with the solutions adopted. A first evaluation of the collected data concludes the exposition.
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Corazza, A., Federico, M., Gretter, R., Lazzari, G. (1993). Design and acquisition of a task-oriented spontaneous-speech database. In: Roberto, V. (ed.) Intelligent Perceptual Systems. Lecture Notes in Computer Science, vol 745. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57379-8_12
DOI: https://doi.org/10.1007/3-540-57379-8_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57379-1
Online ISBN: 978-3-540-48103-4
eBook Packages: Springer Book Archive