Indexing and Retrieval of Speech Documents

Singh, Piyush Kumar P.; Manjunath, K. E.; Ravi Kiran, R.; Yadav, Jainath; Sreenivasa Rao, K.

doi:10.1007/978-3-319-07353-8_3

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 27)

Abstract

In this paper, a speech document indexing system and a similarity-based document retrieval method are proposed. A k-d tree is used as the index structure, and codebooks derived from the speech documents in the database are used during retrieval of the desired document. Each document is represented as a sequence of codebook indices. A longest-common-subsequence-based approach is proposed for retrieving the documents. The proposed retrieval method is evaluated using a 3-hour speech database recorded by a male speaker, with speech queries from 5 male and 5 female speakers. The retrieval accuracy is found to be about 88% for queries given by male speakers.
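
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the feature type, how the codebook is built, or how scores are normalized. The sketch therefore assumes a codebook that is already given as a list of vectors, uses a small hand-written k-d tree to map each feature frame to its nearest codeword index, and ranks documents by the longest-common-subsequence (LCS) length divided by the query length. All function names here (`build_kdtree`, `nearest`, `quantize`, `lcs_length`, `retrieve`) are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Tuple[float, ...]


class KDNode:
    """One node of a k-d tree storing a codeword and its codebook index."""

    def __init__(self, point: Vector, index: int, axis: int,
                 left: Optional["KDNode"], right: Optional["KDNode"]):
        self.point, self.index, self.axis = point, index, axis
        self.left, self.right = left, right


def build_kdtree(points: Sequence[Tuple[Vector, int]], depth: int = 0) -> Optional[KDNode]:
    """Build a k-d tree by splitting on the median along a cycling axis."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    pts = sorted(points, key=lambda p: p[0][axis])
    mid = len(pts) // 2
    return KDNode(pts[mid][0], pts[mid][1], axis,
                  build_kdtree(pts[:mid], depth + 1),
                  build_kdtree(pts[mid + 1:], depth + 1))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[KDNode], target: Vector,
            best: Optional[Tuple[float, int]] = None) -> Optional[Tuple[float, int]]:
    """Return (squared distance, codebook index) of the nearest codeword."""
    if node is None:
        return best
    d = sq_dist(node.point, target)
    if best is None or d < best[0]:
        best = (d, node.index)
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the other side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, target, best)
    return best


def quantize(tree: KDNode, frames: Sequence[Vector]) -> List[int]:
    """Represent a document (or query) as a sequence of codebook indices."""
    return [nearest(tree, f)[1] for f in frames]


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence, using two DP rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def retrieve(query: Sequence[int], documents: Dict[str, Sequence[int]]) -> List[Tuple[float, str]]:
    """Rank documents by LCS length normalized by query length (an assumed score)."""
    scores = [(lcs_length(query, doc) / max(len(query), 1), name)
              for name, doc in documents.items()]
    return sorted(scores, reverse=True)


# Toy usage with a 4-entry, 2-dimensional codebook (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
tree = build_kdtree([(c, i) for i, c in enumerate(codebook)])
query_indices = quantize(tree, [(0.1, 0.1), (0.9, 0.2), (0.8, 0.9)])
ranking = retrieve(query_indices, {"doc_a": [2, 0, 1, 3], "doc_b": [2, 2, 2, 2]})
```

An LCS score tolerates insertions and deletions between the query and the stored index sequence, which is one plausible reason to prefer it over exact substring matching for spoken queries. The two-row table keeps memory linear in the length of the shorter input.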

Author information

Authors and Affiliations

Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, 721302, India
Piyush Kumar P. Singh, K. E. Manjunath, R. Ravi Kiran, Jainath Yadav & K. Sreenivasa Rao

Corresponding author

Correspondence to Piyush Kumar P. Singh.

Editor information

Editors and Affiliations

Indian Statistical Institute, Machine Intelligence Unit, Kolkata, India
Malay Kumar Kundu
Dept. of Computer Science and Engineering, National Institute of Technology Rourkela, Rourkela, India
Durga Prasad Mohapatra
Dept. of Electronics and Tele-Communication Engineering, Jadavpur University Artificial Intelligence Laboratory, Kolkata, India
Amit Konar
Dept. of Computer Science and Engineering, St. Thomas' College of Engineering & Technology, Kidderpore, West Bengal, India
Aruna Chakraborty

About this paper

Cite this paper

Singh, P.K.P., Manjunath, K.E., Ravi Kiran, R., Yadav, J., Sreenivasa Rao, K. (2014). Indexing and Retrieval of Speech Documents. In: Kumar Kundu, M., Mohapatra, D., Konar, A., Chakraborty, A. (eds) Advanced Computing, Networking and Informatics- Volume 1. Smart Innovation, Systems and Technologies, vol 27. Springer, Cham. https://doi.org/10.1007/978-3-319-07353-8_3

DOI: https://doi.org/10.1007/978-3-319-07353-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07352-1
Online ISBN: 978-3-319-07353-8
eBook Packages: Engineering, Engineering (R0)
