Audio Search Techniques

  • Chapter

Part of the book series: SpringerBriefs in Speech Technology (BRIEFSSPEECHTECH)

Abstract

With the advancement of communication technologies and the Internet, the use of multimedia has increased exponentially. Much of the population has switched to smartphones and consumes multimedia data constantly. A large volume of multimedia resources is now freely available, but a resource is useful only if it can be handled efficiently. Large audio archives are available on many websites, including music databases, news bulletins, story databases, extempore speeches, audio lectures, and audio interviews. Such resources can be used effectively only if a specific file can be retrieved efficiently from a huge archive. Audio search refers to the search and retrieval of a particular audio file from an audio database with the help of an appropriate query. This chapter introduces the task of audio search, followed by its classification, applications, evolution, major milestones, databases, and benchmarking platforms.

Copyright information

© 2019 The Author(s), under exclusive licence to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Mary, L., G, D. (2019). Audio Search Techniques. In: Searching Speech Databases. SpringerBriefs in Speech Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-97761-4_1

  • DOI: https://doi.org/10.1007/978-3-319-97761-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-97760-7

  • Online ISBN: 978-3-319-97761-4

  • eBook Packages: Engineering, Engineering (R0)
