Arabic Multimedia Search Platform

  • Conference paper
Recent Trends in Computer Applications

Abstract

The ability to locate relevant information online has been the great value-add of modern search engines. The same approach has also been applied, to great effect, to files held in local storage. As the volume of information keeps growing, the need for effective search engines will only increase.

Most search engines focus on locating textual information stored in accessible file formats. Search over other modes of information representation, such as image and video, has also been gaining prominence recently, with varying degrees of success. However, one format of information storage is still not prominently accessible through search engines, namely multimedia files that store audio and video content.

Examples of information stored in such files include lectures, news broadcasts and voice notes. Such files often hold precisely the kind of information that people look for. For example, people who rely on voice notes would benefit from a platform that allowed them to search quickly through their notes and link directly to the relevant audio segments. Students searching for specific material across many video lectures would benefit from being able to quickly search, link to and watch the relevant segments. Similarly, researchers and reporters looking for references to specific topics in news broadcasts would benefit from such a search engine to quickly identify the relevant content to watch or listen to.

The usefulness of such a platform varies depending on the cultural context of the content being searched. Arabic cultures are highly oral, with much of the information sought after by researchers and students being contained in speeches, lectures and interviews. As such, the usefulness of a search engine as described above is quite high when it comes to organizing Arabic information.

This chapter describes the design, development and testing of a web-based platform for searching Arabic multimedia content. The platform searches a corpus of video or audio files for keywords and provides hyperlinks to the audio segments in which those keywords are mentioned. To do so, it relies on an Arabic language transcriber. The platform was developed through system integration. Whilst this approach has the advantage of enabling the use of existing resources, it has highlighted the need for more accurate and more easily accessible Arabic language transcribers.
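
As a rough illustration of keyword search with hyperlinks into transcribed media, the sketch below indexes timestamped transcript segments and returns one time-addressed link per matching segment. It is not the chapter's implementation: the `Segment` type, the example URLs, the sample transcripts and the use of a `#t=start,end` temporal fragment on the link are all assumptions made for this example. Matching is on exact whitespace-separated tokens only, so Arabic-specific normalization (diacritics, attached particles) is deliberately left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    media_url: str  # hypothetical location of the audio or video file
    start: float    # segment start time in seconds
    end: float      # segment end time in seconds
    text: str       # transcript text produced for this segment


def build_index(segments):
    """Map each whitespace-separated token to the segments containing it."""
    index = defaultdict(list)
    for seg in segments:
        for token in set(seg.text.split()):
            index[token].append(seg)
    return index


def search(index, keyword):
    """Return one time-addressed link per segment that mentions the keyword."""
    hits = sorted(index.get(keyword, []), key=lambda s: (s.media_url, s.start))
    return [f"{s.media_url}#t={s.start:g},{s.end:g}" for s in hits]


# Hypothetical transcript segments, as a transcriber might return them.
segments = [
    Segment("https://example.org/lecture1.mp4", 0.0, 12.5, "مرحبا بكم في المحاضرة"),
    Segment("https://example.org/lecture1.mp4", 12.5, 30.0, "موضوع المحاضرة هو البحث"),
    Segment("https://example.org/news.mp3", 4.0, 9.0, "نشرة الأخبار"),
]
index = build_index(segments)
links = search(index, "المحاضرة")
for link in links:
    print(link)
```

In this sketch the quality of the results depends entirely on the transcript text, which is consistent with the abstract's point that transcriber accuracy is the limiting factor.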



Acknowledgement

The work presented in this chapter was kindly supported by a CNRS-L “Grant Research Programme 2016” (Lebanon) fund.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Raad, M., Bayan, M., Dalloul, Y., Ghareeb, M., Haj-Ali, A. (2018). Arabic Multimedia Search Platform. In: Alja’am, J., El Saddik, A., Sadka, A. (eds) Recent Trends in Computer Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-89914-5_14

  • DOI: https://doi.org/10.1007/978-3-319-89914-5_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-89913-8

  • Online ISBN: 978-3-319-89914-5

  • eBook Packages: Computer Science, Computer Science (R0)
