Performance Evaluation of a Speech Interface for Motorcycle Environment

Mporas, Iosif; Ganchev, Todor; Kocsis, Otilia; Fakotakis, Nikos

doi:10.1007/978-1-4419-0221-4_31

Part of the book series: IFIP International Federation for Information Processing (IFIPAICT, volume 296)

Included in the following conference series:

IFIP International Conference on Artificial Intelligence Applications and Innovations

Abstract

In this work we investigate how a number of traditional and recent speech enhancement algorithms perform under the adverse, non-stationary noise conditions characteristic of a motorcycle on the move. The algorithms are ranked by the improvement they bring to the speech recognition rate relative to the baseline result, i.e. recognition without speech enhancement. Experiments on the MoveOn motorcycle speech and noise database suggest that ranking the algorithms by human perception of speech quality is not equivalent to ranking them by speech recognition performance. The multi-band spectral subtraction method was observed to yield the highest speech recognition performance.
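
The ranking criterion described in the abstract can be illustrated with a minimal sketch. It assumes word accuracy, computed as (N - errors) / N with errors counted by word-level edit distance, as the recognition-rate measure; the abstract does not state the exact metric. All function names, method labels and transcripts below are hypothetical and are not taken from the MoveOn database.

```python
from typing import Dict, List, Tuple


def word_errors(reference: List[str], hypothesis: List[str]) -> int:
    """Word-level edit distance; substitutions, insertions and deletions cost 1."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(pairs: List[Tuple[str, str]]) -> float:
    """Percent word accuracy over (reference, hypothesis) transcript pairs."""
    total_words = sum(len(ref.split()) for ref, _ in pairs)
    total_errors = sum(word_errors(ref.split(), hyp.split()) for ref, hyp in pairs)
    return 100.0 * (total_words - total_errors) / total_words


def rank_by_gain(baseline: float, scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort enhancement methods by accuracy gain over the baseline, best first."""
    gains = [(name, accuracy - baseline) for name, accuracy in scores.items()]
    return sorted(gains, key=lambda item: item[1], reverse=True)


# Made-up transcripts for illustration only (not experimental data).
REFERENCES = ["turn left ahead", "call the station"]
BASELINE_HYPS = ["turn left", "all the station"]        # no enhancement
ENHANCED_HYPS = {
    "method_a": ["turn left ahead", "call station"],    # hypothetical method
    "method_b": ["turn ahead", "all station"],          # hypothetical method
}

baseline_accuracy = word_accuracy(list(zip(REFERENCES, BASELINE_HYPS)))
method_accuracy = {
    name: word_accuracy(list(zip(REFERENCES, hyps)))
    for name, hyps in ENHANCED_HYPS.items()
}
ranking = rank_by_gain(baseline_accuracy, method_accuracy)
```

A negative gain in `ranking` marks a method that lowers recognition accuracy relative to the unprocessed baseline, which is why the comparison is made against the baseline rather than between methods alone.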

Author information

Authors and Affiliations

Artificial Intelligence Group, Wire Communications Laboratory, Dept. of Electrical and Computer Engineering, University of Patras, Rion, 26500, Greece
Iosif Mporas, Todor Ganchev, Otilia Kocsis & Nikos Fakotakis

Editor information

Editors and Affiliations

Democritus University of Thrace, Greece
Iliadis
Aristotle University of Thessaloniki, Greece
Vlahavas
University of Portsmouth, United Kingdom
Bramer
University of Central Greece, Greece
Maglogiannis
Aristotle University of Thessaloniki, Greece
Tsoumakas

Cite this paper

Mporas, I., Ganchev, T., Kocsis, O., Fakotakis, N. (2009). Performance Evaluation of a Speech Interface for Motorcycle Environment. In: Iliadis, Maglogiannis, Tsoumakas, Vlahavas, Bramer (eds) Artificial Intelligence Applications and Innovations III. AIAI 2009. IFIP International Federation for Information Processing, vol 296. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-0221-4_31

DOI: https://doi.org/10.1007/978-1-4419-0221-4_31
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-0220-7
Online ISBN: 978-1-4419-0221-4
eBook Packages: Computer Science, Computer Science (R0)
