Audio Search Techniques

Mary, Leena; G, Deekshitha

doi:10.1007/978-3-319-97761-4_1

Chapter
First Online: 26 September 2018

Part of the book series: SpringerBriefs in Speech Technology (BRIEFSSPEECHTECH)

Abstract

With the advancement of communication technologies and the Internet, the use of multimedia has increased exponentially. Much of the population has switched to smartphones and uses multimedia data constantly. A large volume of multimedia resources is now freely available, but a resource is useful only if it can be handled efficiently. Many websites host huge audio archives, including music databases, news bulletins, story databases, extempore speeches, audio lectures, and audio interviews. Such resources can be used effectively only if the exact file can be retrieved efficiently from these archives. Audio search refers to the search and retrieval of a particular audio file from an audio database with the help of an appropriate query. This chapter introduces the task of audio search, followed by its classifications, applications, evolution, major milestones, databases, and benchmarking platforms.

Author information

Authors and Affiliations

Department of Electronics & Communication Engineering, Government Engineering College, Idukki, Kerala, India
Leena Mary
Department of Electronics and Communication Engineering, Rajiv Gandhi Institute of Technology, Kottayam, Kerala, India
Deekshitha G

About this chapter

Cite this chapter

Mary, L., G, D. (2019). Audio Search Techniques. In: Searching Speech Databases. SpringerBriefs in Speech Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-97761-4_1

DOI: https://doi.org/10.1007/978-3-319-97761-4_1
Published: 26 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97760-7
Online ISBN: 978-3-319-97761-4
eBook Packages: Engineering, Engineering (R0)
