Abstract
This paper proposes a speech document indexing system and a similarity-based document retrieval method. A k-d tree serves as the index structure, and codebooks derived from the speech documents in the database are used to retrieve the desired document. Each document is represented as a sequence of codebook indices, and a longest common subsequence based approach is proposed for retrieving documents. The retrieval method is evaluated on a 3-hour speech database recorded by a male speaker, with speech queries from 5 male and 5 female speakers. Retrieval accuracy is about 88% for queries given by male speakers.
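To illustrate the retrieval idea, the following is a minimal Python sketch of longest common subsequence matching over sequences of codebook indices. It is not the authors' implementation: the function names `lcs_length` and `rank_documents`, the toy index sequences, and the choice to normalize the score by the query length are all assumptions made for illustration, since the abstract does not specify how scores are computed or how the k-d tree is used to obtain the indices.

```python
from typing import Dict, List, Sequence, Tuple


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programme that keeps only two rows of the table.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rank_documents(
    query: Sequence[int], documents: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Rank documents by LCS length with the query, best match first.

    Assumption: the score is the LCS length divided by the query length,
    so a score of 1.0 means every query index was matched in order.
    """
    if not query:
        raise ValueError("query must contain at least one codebook index")
    scores = [
        (doc_id, lcs_length(query, seq) / len(query))
        for doc_id, seq in documents.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical codebook-index sequences standing in for indexed documents.
documents = {
    "doc_a": [3, 7, 7, 1, 4, 9],
    "doc_b": [2, 2, 5, 8],
    "doc_c": [3, 1, 4],
}
query = [3, 7, 1, 9]
ranking = rank_documents(query, documents)
```

In this toy example `doc_a` contains all four query indices in order and ranks first. Because a subsequence need not be contiguous, the match tolerates extra or repeated indices in the document, which is one plausible reason to prefer this measure for queries spoken by a different speaker.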
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Singh, P.K.P., Manjunath, K.E., Ravi Kiran, R., Yadav, J., Sreenivasa Rao, K. (2014). Indexing and Retrieval of Speech Documents. In: Kumar Kundu, M., Mohapatra, D., Konar, A., Chakraborty, A. (eds) Advanced Computing, Networking and Informatics- Volume 1. Smart Innovation, Systems and Technologies, vol 27. Springer, Cham. https://doi.org/10.1007/978-3-319-07353-8_3
DOI: https://doi.org/10.1007/978-3-319-07353-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07352-1
Online ISBN: 978-3-319-07353-8
eBook Packages: Engineering, Engineering (R0)