Arabic Multimedia Search Platform

Raad, Mohamad; Bayan, Majida; Dalloul, Yihya; Ghareeb, Majd; Haj-Ali, Amin

doi:10.1007/978-3-319-89914-5_14

Abstract

The ability to locate relevant information in cyberspace has been the great value-add of modern search engines. The same approach has also been applied, to great effect, to files held in local storage. As the volume of information keeps growing, the need for effective search engines will only increase.

Most search engines focus on locating textual information stored in accessible file formats. Search over other modes of information representation, such as images and video, has also gained prominence recently, with varying degrees of success. However, one form of stored information is still not readily accessible through search engines: multimedia files that store audio and video content.

Examples of information stored in such files include lectures, news broadcasts and voice notes. The information in such files is often exactly what people are looking for. People who rely on voice notes would benefit from a platform that lets them search quickly through their notes and follow hyperlinks to the relevant audio segments. Students looking for specific material across many video lectures would benefit from being able to quickly search, link to and watch the relevant segments. Similarly, researchers and reporters looking for references to specific topics in news broadcasts would benefit from a search engine that quickly identifies the relevant content to watch or listen to.

The usefulness of such a platform varies depending on the cultural context of the content being searched. Arabic cultures are highly oral, with much of the information sought after by researchers and students being contained in speeches, lectures and interviews. As such, the usefulness of a search engine as described above is quite high when it comes to organizing Arabic information.

This chapter describes the design, development and testing of a web-based platform that allows for searching through Arabic multimedia content. The platform allows for searching through a corpus of video or audio files for keywords and provides hyperlinks to the relevant audio segments where those keywords are mentioned. As such, the platform relies on an Arabic language transcriber. The approach taken in the development of this platform is that of system integration. Whilst this approach has the advantage of enabling the use of existing resources, it has highlighted the need for more accurate and more easily accessible Arabic language transcribers.
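The keyword-to-segment lookup described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the `Segment` structure, the `normalize` and `search` helpers, the example URL and the use of temporal media fragments (`#t=start,end`) for the hyperlinks are all assumptions made for illustration, and plain substring matching stands in for whatever indexing the platform actually uses. The sketch assumes a transcriber has already produced timestamped text for each segment.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcribed stretch of a media file (hypothetical structure)."""
    media_url: str  # where the audio or video file is served
    start: float    # segment start, in seconds
    end: float      # segment end, in seconds
    text: str       # transcriber output for this segment


def normalize(text: str) -> str:
    """Drop combining marks (such as Arabic short-vowel diacritics) and the tatweel."""
    stripped = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return stripped.replace("\u0640", "")


def search(segments: list[Segment], keyword: str) -> list[str]:
    """Return one link per segment whose transcript contains the keyword."""
    needle = normalize(keyword)
    if not needle:
        return []
    links = []
    for seg in segments:
        if needle in normalize(seg.text):
            # Assumption: the player honours temporal media fragments.
            links.append(f"{seg.media_url}#t={seg.start:g},{seg.end:g}")
    return links


if __name__ == "__main__":
    corpus = [
        Segment("https://example.org/lecture.mp4", 0.0, 12.5, "مَرْحَبًا بكم"),
        Segment("https://example.org/lecture.mp4", 12.5, 30.0, "موضوع اليوم هو البحث"),
    ]
    for link in search(corpus, "البحث"):
        print(link)
```

Stripping diacritics before matching means a query typed without vowel marks can still match a vowelled transcript. A sketch like this returns whatever the transcriber produced, so the quality of the links is bounded by the accuracy of the transcription.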

Acknowledgement

The work presented in this chapter was kindly supported by a CNRS-L “Grant Research Programme 2016” (Lebanon) fund.

Author information

Authors and Affiliations

School of Engineering, Lebanese International University, Beirut, Lebanon
Mohamad Raad, Majida Bayan, Yihya Dalloul, Majd Ghareeb & Amin Haj-Ali

Editor information

Editors and Affiliations

Computer Science and Engineering, Qatar University, Doha, Qatar
Jihad Mohamad Alja’am
Faculty of Engineering, University of Ottawa, Ottawa, ON, Canada
Abdulmotaleb El Saddik
Electronic and Computer Engineering, Brunel University London, Uxbridge, UK
Abdul Hamid Sadka

About this paper

Cite this paper

Raad, M., Bayan, M., Dalloul, Y., Ghareeb, M., Haj-Ali, A. (2018). Arabic Multimedia Search Platform. In: Alja’am, J., El Saddik, A., Sadka, A. (eds) Recent Trends in Computer Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-89914-5_14

DOI: https://doi.org/10.1007/978-3-319-89914-5_14
Published: 20 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89913-8
Online ISBN: 978-3-319-89914-5
eBook Packages: Computer Science, Computer Science (R0)
