Massive Semantic Video Annotation in High-End Customer Service

Example in Airline Service Value Assessment

  • Conference paper
  • In: HCI in Business, Government and Organizations (HCII 2020)

Abstract

In high-end hospitality industries such as airline lounges, high-star hotels, and high-class restaurants, employee service skills play an important role as an element of the brand identity. However, it is very difficult to train an intermediate employee into an expert who can provide higher-value services that exceed customers’ expectations. To hire and develop employees who embody the value of the brand, it is necessary to communicate that value to them clearly. In the video analysis domain, especially when analyzing human behaviors, an important task is the understanding and representation of human activities such as conversations, physical actions, and their connections over time. This paper addresses the problem of massively annotating video content, such as multimedia training materials, which can then be processed by human-interaction training support systems (such as VR training systems) as resources for content generation. In this paper, we propose a proof-of-concept (POC) system for a service skill assessment platform, built as a knowledge graph (KG) of high-end service provision videos massively annotated with human-interaction semantics.
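As a purely illustrative sketch (not the authors' implementation or ontology), the snippet below shows how a human-interaction annotation on a video segment could be expressed as RDF triples in a knowledge graph and retrieved with a SPARQL query. It uses the rdflib Python library; the ex: namespace and all class and property names (AnnotatedSegment, interactionType, skillLevel, and so on) are hypothetical placeholders introduced only for this example.

    # Minimal, hypothetical sketch: storing a video annotation of a service
    # interaction as RDF triples and querying it with SPARQL.
    # The "ex:" namespace and every class/property name below are
    # illustrative placeholders, not the ontology used in the paper.
    from rdflib import Graph, Literal, Namespace, RDF
    from rdflib.namespace import XSD

    EX = Namespace("http://example.org/service-annotation#")

    g = Graph()
    g.bind("ex", EX)

    # One annotated segment: an employee greeting a customer at 12.0-18.5 s.
    seg = EX["video01_segment03"]
    g.add((seg, RDF.type, EX.AnnotatedSegment))
    g.add((seg, EX.sourceVideo, EX["video01"]))
    g.add((seg, EX.startTime, Literal(12.0, datatype=XSD.decimal)))
    g.add((seg, EX.endTime, Literal(18.5, datatype=XSD.decimal)))
    g.add((seg, EX.actor, EX["employee_A"]))
    g.add((seg, EX.interactionType, Literal("greeting")))
    g.add((seg, EX.skillLevel, Literal("expert")))

    # Retrieve all segments annotated as expert-level interactions.
    query = """
    PREFIX ex: <http://example.org/service-annotation#>
    SELECT ?segment ?type ?start ?end
    WHERE {
      ?segment a ex:AnnotatedSegment ;
               ex:interactionType ?type ;
               ex:startTime ?start ;
               ex:endTime ?end ;
               ex:skillLevel "expert" .
    }
    """
    for row in g.query(query):
        print(row.segment, row.type, row.start, row.end)

In a platform like the one proposed, queries of this kind could, for example, retrieve every segment annotated as an expert-level interaction for reuse in training content generation.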



Acknowledgments

Part of this work was supported by the Council for Science, Technology and Innovation, “Cross-ministerial Strategic Innovation Promotion Program (SIP), Big-data and AI-enabled Cyberspace Technologies” (funding agency: NEDO).

Author information

Corresponding author

Correspondence to Ken Fukuda.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Fukuda, K., Vizcarra, J., Nishimura, S. (2020). Massive Semantic Video Annotation in High-End Customer Service. In: Nah, FH., Siau, K. (eds) HCI in Business, Government and Organizations. HCII 2020. Lecture Notes in Computer Science, vol 12204. Springer, Cham. https://doi.org/10.1007/978-3-030-50341-3_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-50341-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-50340-6

  • Online ISBN: 978-3-030-50341-3

  • eBook Packages: Computer Science, Computer Science (R0)
