Multimodal Web Based Video Annotator with Real-Time Human Pose Estimation

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2019 (IDEAL 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11872)

Abstract

This paper presents a multi-platform, Web-based video annotator that supports multimodal annotation and can be applied to several working areas, such as dance rehearsals. The CultureMoves’ “Motion-Notes” Annotator was designed to assist the creative and exploratory processes of both professional and amateur users who work with a digital device for personal annotations. The prototype is being developed for any device capable of running a modern Web browser. It is a real-time multimodal video annotator based on keyboard, touch and voice inputs. Five ways of adding annotations have already been implemented: voice, draw, text, web URL, and mark annotations. The pose estimation functionality uses machine learning techniques to identify a person's skeleton in the video frames, which gives the user another resource for identifying possible annotations.
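To make the abstract's two main ideas concrete, the following is a minimal TypeScript sketch of how time-anchored annotations of the five listed kinds and an estimated skeleton might be represented. It is not taken from the paper: every name here (`Annotation`, `Keypoint`, `SKELETON_EDGES`, `visibleEdges`, `annotationsNear`), the body-part labels, and the 0.5 confidence threshold are assumptions chosen for illustration, and no particular pose estimation library is implied.

```typescript
// Hypothetical data model; none of these names come from the paper.
type AnnotationKind = "voice" | "draw" | "text" | "url" | "mark";

interface Annotation {
  kind: AnnotationKind;
  videoTime: number; // seconds into the video where the annotation is anchored
  payload: string;   // e.g. text body, URL, or a reference to stored audio or strokes
}

interface Keypoint {
  part: string;  // body part label, e.g. "leftShoulder" (assumed naming)
  x: number;     // pixel coordinates within the video frame
  y: number;
  score: number; // estimator confidence in [0, 1]
}

// Pairs of parts joined by a line when drawing the skeleton (illustrative subset).
const SKELETON_EDGES: [string, string][] = [
  ["leftShoulder", "rightShoulder"],
  ["leftShoulder", "leftElbow"],
  ["leftElbow", "leftWrist"],
  ["rightShoulder", "rightElbow"],
  ["rightElbow", "rightWrist"],
];

// Return the edges whose two endpoints were both detected with enough confidence.
function visibleEdges(keypoints: Keypoint[], minScore = 0.5): [Keypoint, Keypoint][] {
  const byPart = new Map<string, Keypoint>();
  for (const kp of keypoints) {
    if (kp.score >= minScore) byPart.set(kp.part, kp);
  }
  const edges: [Keypoint, Keypoint][] = [];
  for (const [a, b] of SKELETON_EDGES) {
    const ka = byPart.get(a);
    const kb = byPart.get(b);
    if (ka && kb) edges.push([ka, kb]);
  }
  return edges;
}

// Annotations anchored within `windowSec` seconds of the current playback time.
function annotationsNear(all: Annotation[], time: number, windowSec = 1): Annotation[] {
  return all.filter((a) => Math.abs(a.videoTime - time) <= windowSec);
}
```

In a sketch like this, a renderer could call `visibleEdges` on each frame's keypoints to draw the skeleton overlay, and `annotationsNear` with the player's current time to decide which annotations to show. Filtering by confidence before drawing is one simple way to avoid rendering limbs the estimator is unsure about.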

Acknowledgements

This work was supported by the project CultureMoves, Grant Agreement Number: INEA/CEF/ICT/A2017/1568369, Action No: 2017-EU-tA-0171.

Author information

Corresponding author

Correspondence to Rui Rodrigues.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Rodrigues, R., Madeira, R.N., Correia, N., Fernandes, C., Ribeiro, S. (2019). Multimodal Web Based Video Annotator with Real-Time Human Pose Estimation. In: Yin, H., Camacho, D., Tino, P., Tallón-Ballesteros, A., Menezes, R., Allmendinger, R. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2019. IDEAL 2019. Lecture Notes in Computer Science, vol 11872. Springer, Cham. https://doi.org/10.1007/978-3-030-33617-2_3

  • DOI: https://doi.org/10.1007/978-3-030-33617-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33616-5

  • Online ISBN: 978-3-030-33617-2

  • eBook Packages: Computer Science, Computer Science (R0)