Automatic Labanotation Generation, Semi-automatic Semantic Annotation and Retrieval of Recorded Videos

  • Conference paper
  • In: Maturity and Innovation in Digital Libraries (ICADL 2018)
  • Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11279)

Abstract

Over the last decade, the volume of unannotated user-generated web content has skyrocketed, but manually annotating data is costly in terms of time and resources. We leverage advances in machine learning to reduce these costs and create a semantically searchable dance database with automatic annotation and retrieval. We use a pose estimation module to extract body pose and generate Labanotation from recorded videos. Though the pipeline is generic, it addresses an essential application given the large amount of dance video available online. The generated Labanotation can be further exploited to build an ontology and is also highly relevant for the preservation and digitization of such resources. We also propose a semi-automatic annotation model that generates semantic annotations over any video archive using only 2–4 manually annotated clips. We experiment on two publicly available ballet datasets. High-level concepts such as ballet poses and steps are used to build the semantic library; these also act as descriptive meta-tags, making the videos retrievable using a semantic text or video query.
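
As a rough illustration of the pipeline described above, the Python sketch below maps per-frame 2D pose keypoints (such as those produced by an off-the-shelf pose estimator) to a coarse set of Labanotation-style direction symbols, collapses them into descriptive meta-tags, and ranks clips against a tag query. The joint list, symbol vocabulary, and tagging scheme are simplified placeholders for exposition, not the authors' actual implementation.

```python
# Illustrative sketch only: pose keypoints per frame -> coarse Labanotation-style
# direction symbols -> per-clip meta-tags usable for semantic text retrieval.
# Joint names, the symbol set, and the tag scheme are hypothetical simplifications.
import numpy as np

# Hypothetical joint order for 2D keypoints from an off-the-shelf pose estimator.
JOINTS = ["right_wrist", "left_wrist", "right_ankle", "left_ankle"]

def direction_symbol(displacement, still_thresh=0.01):
    """Map a 2D joint displacement to a coarse Labanotation-like direction symbol."""
    dx, dy = displacement
    if np.hypot(dx, dy) < still_thresh:
        return "place"                                  # no significant movement
    angle = np.degrees(np.arctan2(-dy, dx)) % 360       # image y-axis points down
    bins = ["right", "right-forward", "forward", "left-forward",
            "left", "left-backward", "backward", "right-backward"]
    return bins[int(((angle + 22.5) % 360) // 45)]

def labanotation_track(keypoints):
    """keypoints: (frames, joints, 2) array -> per-joint list of direction symbols."""
    disp = np.diff(keypoints, axis=0)                    # frame-to-frame displacement
    return {joint: [direction_symbol(d) for d in disp[:, j]]
            for j, joint in enumerate(JOINTS)}

def clip_tags(track):
    """Collapse a symbol track into a set of descriptive meta-tags for retrieval."""
    return {f"{joint}:{sym}" for joint, syms in track.items() for sym in syms}

def retrieve(query_tags, library):
    """Rank clips in `library` (name -> tag set) by overlap with the query tags."""
    scores = {name: len(query_tags & tags) for name, tags in library.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy usage: two short synthetic clips and a text-style tag query.
rng = np.random.default_rng(0)
library = {}
for name in ["clip_arabesque", "clip_plie"]:
    kp = np.cumsum(rng.normal(0, 0.02, size=(30, len(JOINTS), 2)), axis=0)
    library[name] = clip_tags(labanotation_track(kp))
print(retrieve({"right_wrist:forward", "left_ankle:place"}, library))
```

In the system described in the paper, the synthetic keypoints would come from the pose estimation module, and the toy tag vocabulary would be replaced by the high-level ballet concepts learned from the 2–4 manually annotated clips.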


Notes

  1. https://shubhamagarwalwork.wixsite.com/dancelib. At present, this web page is a prototype and does not support video queries; it will soon be extended with more data.

  2. https://drive.google.com/open?id=1lG0j0td7pD6QBAcUxd9CKPIAllFbw0Rp.


Author information


Corresponding author

Correspondence to Swati Dewan.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Dewan, S., Agarwal, S., Singh, N. (2018). Automatic Labanotation Generation, Semi-automatic Semantic Annotation and Retrieval of Recorded Videos. In: Dobreva, M., Hinze, A., Žumer, M. (eds) Maturity and Innovation in Digital Libraries. ICADL 2018. Lecture Notes in Computer Science, vol 11279. Springer, Cham. https://doi.org/10.1007/978-3-030-04257-8_5

  • DOI: https://doi.org/10.1007/978-3-030-04257-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04256-1

  • Online ISBN: 978-3-030-04257-8

  • eBook Packages: Computer Science, Computer Science (R0)
