Abstract
With advances in communication technologies and the Internet, the use of multimedia has increased sharply. Much of the population now uses smartphones and consumes multimedia data constantly. A large amount of multimedia data is freely available today, but a resource is useful only if it can be handled efficiently. Large audio archives are available on many websites, including music databases, news bulletins, story databases, extempore speeches, audio lectures, and audio interviews. These resources can be used effectively only if a specific file can be retrieved efficiently from a large archive. Audio search refers to the search and retrieval of a particular audio file from an audio database with the help of an appropriate query. This chapter introduces the task of audio search, followed by details of its classifications, applications, evolution, major milestones, different databases, and different benchmarking platforms.
Copyright information
© 2019 The Author(s), under exclusive licence to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Mary, L., G, D. (2019). Audio Search Techniques. In: Searching Speech Databases. SpringerBriefs in Speech Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-97761-4_1
DOI: https://doi.org/10.1007/978-3-319-97761-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97760-7
Online ISBN: 978-3-319-97761-4
eBook Packages: Engineering, Engineering (R0)