Human action annotation, modeling and analysis based on implicit user interaction

Ntalianis, Klimis S.; Doulamis, Anastasios D.; Tsapatsoulis, Nicolas; Doulamis, Nikolaos

doi:10.1007/s11042-009-0369-6

Published: 09 October 2009

Volume 50, pages 199–225, (2010)

Multimedia Tools and Applications

Klimis S. Ntalianis¹,
Anastasios D. Doulamis²,
Nicolas Tsapatsoulis³ &
Nikolaos Doulamis¹

Abstract

This paper proposes an integrated framework for analyzing human actions in video streams. Unlike most current approaches, which rely solely on automatic spatiotemporal analysis of sequences, the proposed method introduces the implicit user-in-the-loop concept for dynamically mining semantics and annotating video streams. This work sets a new and ambitious goal: to recognize, model and properly use the “average user’s” selections, preferences and perception in order to dynamically extract content semantics. The proposed approach is expected to add significant value to the hundreds of billions of non-annotated or inadequately annotated video streams that exist on the Web, on file servers, in databases, etc. Furthermore, expert annotators can gain important knowledge about user preferences, selections, searching styles and perception.

Acknowledgments

This research was performed in the framework of the PSIFIORIKSI project (Audiovisual Content Digitisation and Multimedia Metadata Extraction, Authoring and Storing based on MPEG-7), funded by the Research Promotion Foundation (RPF) of the Republic of Cyprus.

Author information

Authors and Affiliations

Electrical and Computer Engineering Department, National Technical University of Athens, 9, Heroon Polytechniou str., Zografou, 15773, Athens, Greece
Klimis S. Ntalianis & Nikolaos Doulamis
Department of Production Engineering and Management, Technical University of Crete, 73100, Chania, Greece
Anastasios D. Doulamis
Department of Communication and Internet Studies, Cyprus University of Technology, 3603, Limassol, Cyprus
Nicolas Tsapatsoulis

Corresponding author

Correspondence to Klimis S. Ntalianis.

About this article

Cite this article

Ntalianis, K.S., Doulamis, A.D., Tsapatsoulis, N. et al. Human action annotation, modeling and analysis based on implicit user interaction. Multimed Tools Appl 50, 199–225 (2010). https://doi.org/10.1007/s11042-009-0369-6

Published: 09 October 2009
Issue Date: October 2010
DOI: https://doi.org/10.1007/s11042-009-0369-6
