Computer Vision for Natural Interfaces

Del Bimbo, Alberto; Ferracani, Andrea; Pezzatini, Daniele; Seidenari, Lorenzo

doi:10.1007/978-3-319-61036-8_3

Part of the book series: Human–Computer Interaction Series (BRIEFSHUMAN)

Abstract

Depth cameras simplify many computer vision tasks, such as background modeling, 3D reconstruction, articulated object tracking, and gesture analysis, and they are a valuable tool for real-time analysis of human behavior. In this chapter, we cover two important problems in natural interaction that can be addressed with computer vision. First, we show how to recognize coarse hand poses at a distance, allowing a user to perform common gestures such as picking, dragging, and clicking without any remote control. Second, we deal with the challenging task of long-term re-identification. Person re-identification typically relies on appearance, which rules out any application in which a person may change clothes between acquisitions; home patient monitoring is one such scenario. Unfortunately, the quality of face and skeleton data is not always sufficient for correct recognition from depth data, since both features are affected by the pose of the subject and the distance from the camera. We propose a model that combines a robust skeleton representation with a highly discriminative face feature, weighting samples by their quality. (Parts of this chapter previously appeared in Bagdanov et al. (Real-time hand status recognition from RGB-D imagery, pp. 2456–2459, 2012 [1]) and in Bondi et al. (Long term person re-identification from depth cameras using facial and skeleton data, 2016 [2]), with permission of Springer.)

Notes

1. Part of this chapter previously appeared in [4], © 2014 Association for Computing Machinery, Inc. Reprinted by permission.
2. The Florence 3D Re-Id dataset is released for public use at the following link: http://www.micc.unifi.it/.....
3. http://vimeo.com/38687694.
4. http://vimeo.com/38687794.

References

Bagdanov AD, Del Bimbo A, Seidenari L, Usai L (2012) Real-time hand status recognition from RGB-D imagery. In: Proceedings of ICPR, pp 2456–2459
Bondi E, Pala P, Seidenari L, Berretti S, Del Bimbo A (2016) Long term person re-identification from depth cameras using facial and skeleton data. In: Proceedings of UHA3DS workshop in conjunction with ICPR
Alon J, Athitsos V, Yuan Q, Sclaroff S (2009) A unified framework for gesture recognition and spatiotemporal gesture segmentation. IEEE Trans Pattern Anal Mach Intell 31(9):1685–1699
Ferracani A, Pezzatini D, Del Bimbo A (2014) A natural and immersive virtual interface for the surgical safety checklist training. In: Proceedings of the 2014 ACM international workshop on serious games. 2014 ACM Inc. pp 27–32. https://doi.org/10.1145/2656719.2656725
Oikonomidis I, Kyriazis N, Argyros A (2011) Efficient model-based 3D tracking of hand articulations using kinect. In: Proceedings of the 22nd British machine vision conference
Suryanarayan P, Subramanian A, Mandalapu D (2010) Dynamic hand pose recognition using depth data. In: Proceedings of ICPR
Ren Z, Yuan J, Zhang Z (2011) Robust hand gesture recognition based on finger-earth mover’s distance with a commodity depth camera. In: Proceedings of ACM MM
Zheng WS, Gong S, Xiang T (2011) Person re-identification by probabilistic relative distance comparison. In: IEEE conference on computer vision and pattern recognition (CVPR). Colorado Springs, CO, USA, pp 649–656
Lisanti G, Masi I, Bagdanov A, Del Bimbo A (2015) Person re-identification by iterative re-weighted sparse ranking. IEEE Trans Pattern Anal Mach Intell 37(8):1629–1642
Lisanti G, Masi I, Bagdanov A, Del Bimbo A (2015) Person re-identification by iterative re-weighted sparse ranking. IEEE Trans Pattern Anal Mach Intell 37(8):1629–1642
Barbosa IB, Cristani M, Del Bue A, Bazzani L, Murino V (2012) Re-identification with RGB-D sensors. In: International workshop on re-identification in European conference on computer vision (ECCV) workshops and demonstrators, vol 7583, LNCS. Springer, Florence, Italy, pp 433–442
Pala F, Satta R, Fumera G, Roli F (2016) Multimodal person re-identification using RGB-D cameras. IEEE Trans Circuits Syst Video Technol 26(4):788–799
Satta R, Pala F, Fumera G, Roli F (2013) Real-time appearance-based person re-identification over multiple Kinect™ cameras. In: International conference on computer vision theory and applications (VISAPP), pp 407–410
Baltieri D, Vezzani R, Cucchiara R (2014) Mapping appearance descriptors on 3D body models for people re-identification. Int J Comput Vis 111(3):345–364
Munaro M, Basso A, Fossati A, Gool LV, Menegatti E (2014) 3D Reconstruction of freely moving persons for re-identification with a depth sensor. In: IEEE international conference on robotics and automation (ICRA). Hong-Kong, pp 4512–4519
Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, Kipman A, Blake A (2011) Real-time human pose recognition in parts from a single depth image. In: Proceedings of CVPR
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
Hu MK (1962) Visual pattern recognition by moment invariants. IRE Trans Inf Theory 8:179–187
Bay H, Ess A, Tuytelaars T, Van Gool L (2008) SURF: Speeded up robust features. Comput Vis Image Underst 110(3):346–359
Berretti S, Pala P, Del Bimbo A (2014) Face recognition by super-resolved 3D models from consumer depth cameras. IEEE Trans Inf Forensics Secur 9(9):1436–1449
Rusinkiewicz S, Levoy M (2001) Efficient variants of the ICP algorithm. In: Proceedings of international conference on 3D digital imaging and modeling (3DIM). Quebec City, Canada, pp 145–152
Ferrari C, Lisanti G, Berretti S, Del Bimbo A (2015) Dictionary learning based 3D morphable model construction for face recognition with varying expression and pose. In: International Conference on 3D Vision (3DV). Lyon, France, pp 509–517
Berretti S, Del Bimbo A, Pala P (2013) Sparse matching of salient facial curves for recognition of 3D faces with missing parts. IEEE Trans Inf Forensics Secur 8(2):374–389
Shotton J, Girshick R, Fitzgibbon A, Sharp T, Cook M, Finocchio M, Moore R, Kohli P, Criminisi A, Kipman A et al (2013) Efficient human pose estimation from single depth images. IEEE Trans Pattern Anal Mach Intell 35(12):2821–2840
Rafi U, Gall J, Leibe B (2015) A semantic occlusion model for human pose estimation from a single depth image. In: IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp 67–74

Author information

Authors and Affiliations

Faculty of Engineering, University of Florence, Florence, Italy
Alberto Del Bimbo
Media Integration and Communication Center, University of Florence, Florence, Italy
Andrea Ferracani, Daniele Pezzatini & Lorenzo Seidenari

Corresponding author

Correspondence to Alberto Del Bimbo.

About this chapter

Cite this chapter

Del Bimbo, A., Ferracani, A., Pezzatini, D., Seidenari, L. (2017). Computer Vision for Natural Interfaces. In: Natural Interaction in Medical Training. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-61036-8_3

DOI: https://doi.org/10.1007/978-3-319-61036-8_3
Published: 29 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61035-1
Online ISBN: 978-3-319-61036-8
eBook Packages: Computer Science, Computer Science (R0)