Indexing and Retrieval of Speech Documents

  • Conference paper
Advanced Computing, Networking and Informatics- Volume 1

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 27)

Abstract

This paper proposes a speech document indexing system and a similarity-based document retrieval method. A k-d tree is used as the index structure, and codebooks derived from the speech documents in the database are used during retrieval of the desired document. Each document is represented as a sequence of codebook indices, and a longest common subsequence (LCS) based approach is proposed for retrieving documents. The proposed retrieval method is evaluated on a 3-hour speech database recorded by a male speaker, with speech queries from 5 male and 5 female speakers. The retrieval accuracy is about 88% for queries given by male speakers.
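The retrieval idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`quantize`, `lcs_length`, `rank_documents`), the squared Euclidean distance, and the normalization of the score by query length are all assumptions made for illustration. The nearest-codeword search is done by brute force here, whereas the paper uses a k-d tree as the index structure.

```python
from typing import Dict, List, Sequence, Tuple


def quantize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each feature vector to the index of its nearest codeword.

    Assumption: squared Euclidean distance, brute-force search.
    A k-d tree over the codewords would replace the inner search.
    """
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(codebook)), key=lambda k: sqdist(f, codebook[k]))
            for f in frames]


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence, using two DP rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rank_documents(query: Sequence[int],
                   documents: Dict[str, Sequence[int]]) -> List[Tuple[str, float]]:
    """Rank documents by LCS length with the query.

    Assumption: the score is normalized by the query length.
    """
    denom = max(len(query), 1)
    scores = [(doc_id, lcs_length(query, seq) / denom)
              for doc_id, seq in documents.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

In this sketch, a query utterance would first be converted to a sequence of codebook indices with `quantize`, and the resulting sequence passed to `rank_documents` together with the stored index sequences of the database documents.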

Author information

Corresponding author

Correspondence to Piyush Kumar P. Singh.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Singh, P.K.P., Manjunath, K.E., Ravi Kiran, R., Yadav, J., Sreenivasa Rao, K. (2014). Indexing and Retrieval of Speech Documents. In: Kumar Kundu, M., Mohapatra, D., Konar, A., Chakraborty, A. (eds) Advanced Computing, Networking and Informatics- Volume 1. Smart Innovation, Systems and Technologies, vol 27. Springer, Cham. https://doi.org/10.1007/978-3-319-07353-8_3

  • DOI: https://doi.org/10.1007/978-3-319-07353-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07352-1

  • Online ISBN: 978-3-319-07353-8

  • eBook Packages: Engineering (R0)
