Abstract
In high-end hospitality settings such as airline lounges, luxury hotels, and fine-dining restaurants, employees' service skills are a key element of brand identity. However, it is very difficult to train an intermediate employee into an expert who can provide high-value services that exceed customers' expectations. To hire and develop employees who embody the brand's values, a company must communicate those values to its employees clearly. In video analysis, particularly the analysis of human behavior, an important task is understanding and representing human activities, such as conversation and physical actions, together with their temporal relations. This paper addresses the problem of massively annotating video content, such as multimedia training materials, so that it can be processed by human-interaction training support systems (e.g., VR training systems) as resources for content generation. We propose a proof-of-concept (POC) service skill assessment platform: a knowledge graph (KG) of high-end service provision videos massively annotated with human-interaction semantics.
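To make the idea of a knowledge graph of annotated service videos concrete, the sketch below models video segments as time-stamped interaction annotations and runs a simple temporal query over them. This is a minimal illustration only: the class and property names (`Annotation`, `GreetCustomer`, `ServeDrink`, the attendant/customer identifiers) are hypothetical and are not the schema of the paper's POC system, which in practice would use an RDF store queried with SPARQL.

```python
# Toy model of video-interaction annotations in a KG-like triple style,
# plus a temporal-overlap query ("which interactions co-occur in a window?").
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    segment: str   # video segment identifier
    actor: str     # who performs the activity
    action: str    # activity label, e.g. "ServeDrink"
    start: float   # seconds from video start
    end: float     # seconds from video start

# A miniature "knowledge graph" of one annotated service video.
kg = [
    Annotation("video01#seg1", "attendant_1", "GreetCustomer", 0.0, 4.2),
    Annotation("video01#seg2", "attendant_1", "ServeDrink", 10.5, 18.0),
    Annotation("video01#seg3", "customer_3", "AskQuestion", 12.0, 14.5),
]

def overlapping(annotations, t0, t1):
    """Return annotations whose interval overlaps [t0, t1]."""
    return [a for a in annotations if a.start < t1 and a.end > t0]

# Which interactions co-occur between seconds 11 and 15?
hits = overlapping(kg, 11.0, 15.0)
print([a.action for a in hits])  # -> ['ServeDrink', 'AskQuestion']
```

In a full system the same overlap condition would be expressed as a SPARQL filter over start/end time literals, letting the training-content generator retrieve all interactions that co-occur with a given expert action.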
Acknowledgments
Part of this work was supported by the Council for Science, Technology and Innovation, "Cross-ministerial Strategic Innovation Promotion Program (SIP), Big-data and AI-enabled Cyberspace Technologies" (funding agency: NEDO).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Fukuda, K., Vizcarra, J., Nishimura, S. (2020). Massive Semantic Video Annotation in High-End Customer Service. In: Nah, FH., Siau, K. (eds) HCI in Business, Government and Organizations. HCII 2020. Lecture Notes in Computer Science(), vol 12204. Springer, Cham. https://doi.org/10.1007/978-3-030-50341-3_4
Print ISBN: 978-3-030-50340-6
Online ISBN: 978-3-030-50341-3