Abstract
This paper describes the integration of a speech recognizer into an information retrieval (IR) system to retrieve text documents relevant to given spoken queries. Our aim is to improve the speech recognizer, since it has proven crucial as the front end of a spoken-query IR system. When speech is used as the source material for indexing and retrieval, the effect of transcription errors on retrieval effectiveness must be considered. We therefore propose a dynamic weights connection strategy that combines two artificial intelligence (AI) learning methods, genetic algorithms (GA) and neural networks (NN), to improve the speech recognizer. The two algorithms are separate modules and are used to find the optimum weights for the hidden and output layers of a feed-forward artificial neural network (ANN) model. A mutated GA technique is proposed and compared with the standard GA technique. One hundred experiments were conducted using 50 selected words from spontaneous speech. To evaluate speech recognition performance we used the standard word error rate (WER), and to evaluate retrieval performance we used precision and recall with respect to manual transcriptions. The proposed method yielded a recognition performance of 95.39% on spoken query input, reducing the error rate to 4.61%. For retrieval, our mutated GA+ANN model achieved 91% precision and 83% recall. Notably, the degradation in precision and recall matches the degradation in the recognition performance of the speech recognition engine. These results indicate that combining GA with ANN offers advantages while maintaining sufficient accuracy.
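The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `word_error_rate` and `precision_recall` are hypothetical, WER is computed in the usual way as word-level edit distance divided by reference length, and the set of relevant documents is assumed to be supplied by the caller (for instance, the documents retrieved for the manually transcribed query, which is one reading of "with respect to manual transcriptions").

```python
from typing import Sequence, Set, Tuple


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i] + [0] * len(hypothesis)
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(reference)


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant (0.0 if a set is empty)."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative inputs only; they are not data from the paper.
    print(word_error_rate("a b c d".split(), "a x c".split()))
    print(precision_recall({"d1", "d2", "d3"}, {"d1", "d2", "d4", "d5"}))
```

Under this definition, recognition accuracy and error rate are complementary, which agrees with the reported figures (95.39% + 4.61% = 100%).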
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Seman, N., Abu Bakar, Z., Jamil, N. (2013). Improving Speech Recognizer Using Neuro-genetic Weights Connection Strategy for Spoken Query Information Retrieval. In: Banchs, R.E., Silvestri, F., Liu, TY., Zhang, M., Gao, S., Lang, J. (eds) Information Retrieval Technology. AIRS 2013. Lecture Notes in Computer Science, vol 8281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45068-6_45
DOI: https://doi.org/10.1007/978-3-642-45068-6_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45067-9
Online ISBN: 978-3-642-45068-6
eBook Packages: Computer Science, Computer Science (R0)