Journal of Intelligent & Robotic Systems, Volume 95, Issue 1, pp 99–117

A Delay-Free and Robust Object Tracking Approach for Robotics Applications

  • Wilma Pairo
  • Patricio Loncomilla
  • Javier Ruiz del Solar


Many robotic applications require detecting and tracking moving and/or static objects while the robot moves, in order to interact with them. High-quality detection methods require considerable computational time when the number of objects to be detected is high, or when operating within dynamic, real-world environments. Consequently, by the time an object detection result becomes available, it refers to a previous frame rather than the current one. This article introduces a method for obtaining delay-free detections. It consists of projecting a delayed detection onto the current frame by using a set of feature tracks generated with the KLT (Kanade-Lucas-Tomasi) tracker. The proposed method is shown to improve detection accuracy when the tracked object is moving with respect to the camera. In addition, the method is able to detect and manage false detections and occlusions using statistical classifiers (Support Vector Machines) and the Viterbi algorithm (Viterbi, IEEE Trans. Inf. Theory 13(2), 260–269, 1967). The method is validated in a person-following task and compared against a part-based HOG person detector and four high-performing tracking methods (Meanshift, Compressive Tracking, Tracking-by-detection with Kernels, and Kernelized Correlation Filter). The method is further validated in two additional tasks: face tracking and car tracking. In all reported experiments, the proposed method obtains the best performance among all compared methods.
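
The projection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which motion model is used, so the code below assumes a simple translation plus uniform scale, estimated robustly with medians over matched feature positions. The function name `project_box` and its arguments are hypothetical, and the feature tracks are taken as given (in practice they could come from a KLT tracker).

```python
from math import hypot
from statistics import median


def project_box(box, pts_old, pts_new):
    """Project a delayed bounding box onto the current frame.

    box:      (x, y, w, h) of the detection in the delayed frame.
    pts_old:  feature positions [(x, y), ...] in the delayed frame.
    pts_new:  the same features' positions in the current frame.

    Assumption of this sketch (not taken from the article): the object
    motion is modelled as a translation plus a uniform scale change.
    """
    if len(pts_old) != len(pts_new) or len(pts_old) < 2:
        raise ValueError("need at least two matched feature tracks")

    # Scale: median ratio of pairwise distances between tracked features.
    ratios = []
    for i in range(len(pts_old)):
        for j in range(i + 1, len(pts_old)):
            d_old = hypot(pts_old[i][0] - pts_old[j][0],
                          pts_old[i][1] - pts_old[j][1])
            d_new = hypot(pts_new[i][0] - pts_new[j][0],
                          pts_new[i][1] - pts_new[j][1])
            if d_old > 1e-9:
                ratios.append(d_new / d_old)
    scale = median(ratios) if ratios else 1.0

    # Centre: each track votes for a new box centre; the median vote
    # limits the influence of tracks that drifted or left the object.
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    votes_x = [pn[0] + scale * (cx - po[0]) for po, pn in zip(pts_old, pts_new)]
    votes_y = [pn[1] + scale * (cy - po[1]) for po, pn in zip(pts_old, pts_new)]
    new_cx, new_cy = median(votes_x), median(votes_y)

    new_w, new_h = scale * w, scale * h
    return (new_cx - new_w / 2.0, new_cy - new_h / 2.0, new_w, new_h)
```

For example, if four features at the corners of the box (10, 10, 10, 20) are each mapped by `p -> 2 * p + (5, -3)`, the projected box is approximately (25, 17, 20, 40). Medians are used here so that a single bad track does not move the result, which is one possible design choice among several.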


Keywords: Delay-free detections; Human tracking; Human detection; Person following by robot; “Follow-me” behavior





This work was partially funded by FONDECYT Projects 1130153 and 1161500.


  1. Felzenszwalb, P., McAllester, D., Ramanan, D.: A Discriminatively Trained, Multiscale, Deformable Part Model. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008, pp. 1–8 (2008)
  2. Tomasi, C., Kanade, T.: Detection and Tracking of Point Features, School of Computer Science. Carnegie Mellon University, Pittsburgh (1991)
  3. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)
  4. Zhang, K., Zhang, L., Yang, M.H.: Real-Time Compressive Tracking. In: Computer Vision–ECCV 2012, pp. 864–877 (2012)
  5. Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: Exploiting the Circulant Structure of Tracking-By-Detection with Kernels. In: Computer Vision–ECCV 2012, pp. 702–715 (2012)
  6. Comaniciu, D., Ramesh, V., Meer, P.: Real-time tracking of non-rigid objects using mean shift. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 142–149 (2000)
  7. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: An efficient alternative to SIFT or SURF. In: Proceedings of the 2011 International Conference on Computer Vision, pp. 2564–2571 (2011)
  8. Alahi, A., Ortiz, O., Vandergheynst, P.: Freak: Fast Retina Keypoint. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 510–517 (2012)
  9. Ballard, D.H.: Generalizing the Hough transform to detect arbitrary shapes. Pattern Recogn. 13(2), 111–122 (1981)
  10. Ruiz-Del-Solar, J., Loncomilla, P.: Robot head pose detection and gaze direction determination using local invariant features. Adv. Robot. 23(3), 305–328 (2009)
  11. Loncomilla, P., Ruiz-del-Solar, J., Martínez, L.: Object recognition using local invariant features for robotic applications: A survey. Pattern Recogn. 60, 499–514 (2016)
  12. Tuytelaars, T., Mikolajczyk, K.: Local invariant feature detectors: a survey. Found. Trends Comput. Graph. Vis. 3(3), 177–280 (2008)
  13. Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27(10), 1615–1630 (2005)
  14. Guo, Z., Zhang, D.: A completed modeling of local binary pattern operator for texture classification. IEEE Trans. Image Process. 19(6), 1657–1663 (2010)
  15. Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 1, pp. 886–893 (2005)
  16. Chu, C.T., Hwang, J.N., Pai, H.I., Lan, K.M.: Tracking human under occlusion based on adaptive multiple kernels with projected gradients. IEEE Trans. Multimed. 15(7), 1602–1615 (2013)
  17. Zhou, X., Li, Y., He, B.: Tracking Humans in Mutual Occlusion based on Game Theory (2013)
  18. Jeong, J.M., Yoon, T.S., Park, J.B.: Kalman filter based multiple objects detection-tracking algorithm robust to occlusion. In: 2014 Proceedings of the SICE Annual Conference (SICE), pp. 941–946 (2014)
  19. Rahmatian, S., Safabakhsh, R.: Online Multiple People Tracking-By-Detection in Crowded Scenes. In: 2014 7th International Symposium on Telecommunications (IST), pp. 337–342 (2014)
  20. Suresh, S., Chitra, K., Deepack, P.: Patch Based Frame Work for Occlusion Detection in Multi Human Tracking. In: Circuits, Power and Computing Technologies (ICCPCT), pp. 1194–1196 (2013)
  21. Li, Z., Tang, Q.L., Sang, N.: Improved mean shift algorithm for occlusion pedestrian tracking. Electron. Lett. 44(10), 622–623 (2008)
  22. Yan, J., Ling, Q., Zhang, Y., Li, F., Zhao, F.: A Novel Occlusion-Adaptive Multi-Object Tracking Method for Road Surveillance Applications. In: 2013 32nd Chinese Control Conference (CCC), pp. 3547–3551 (2013)
  23. Tang, S., Andriluka, M., Milan, A., Schindler, K., Roth, S., Schiele, B.: Learning People Detectors for Tracking in Crowded Scenes. In: 2013 IEEE International Conference on Computer Vision (ICCV), pp. 1049–1056 (2013)
  24. Tang, S., Andriluka, M., Schiele, B.: Detection and tracking of occluded people. Int. J. Comput. Vis. 110(1), 58–69 (2014)
  25. Guan, Y., Chen, X., Yang, D., Wu, Y.: Multi-Person Tracking-By-Detection with Local Particle Filtering and Global Occlusion Handling. In: 2014 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2014)
  26. Viterbi, A.J.: Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inf. Theory 13(2), 260–269 (1967)
  27. Pairo, W., Ruiz-del-Solar, J., Verschae, R., Correa, M., Loncomilla, P.: Person Following by Mobile Robots: Analysis of Visual and Range Tracking Methods and Technologies. In: Robocup 2013: Robot World Cup XVII, pp. 231–243 (2014)
  28. Ruiz-del-Solar, J., Correa, M., Verschae, R., Bernuy, F., Loncomilla, P., Mascaró, M., Riquelme, R., Smith, F.: Bender – A general-purpose social robot with human-robot interaction abilities. J. Hum.–Robot Interact. 1(2), 54–75 (2012)
  29. Uijlings, R., van de Sande, A., Gevers, T., Smeulders, M.: Selective search for object recognition. Int. J. Comput. Vis. 104(2), 154 (2013)
  30. Zitnick, C.L., Dollár, P.: Edge Boxes: Locating object proposals from edges. In: ECCV 2014, Lecture Notes in Computer Science of Computer Vision, vol. 8639, pp. 391–405 (2014)
  31. Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Visual object detection with deformable part models. Commun. ACM 56(9), 97–105 (2013)
  32. Dollár, P., Belongie, S.J., Perona, P.: The fastest pedestrian detector in the West. BMVC 2(3), 68.1–68.11 (2010)
  33. Dollár, P., Appel, R., Belongie, S., Perona, P.: Fast feature pyramids for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 36(8), 1532–1545 (2014)
  34. Kalal, Z., Mikolajczyk, K., Matas, J.: Tracking-learning-detection. IEEE Trans. Pattern Anal. Mach. Intell. 34(7), 1409–1422 (2012)
  35. Pernici, F., Del Bimbo, A.: Object tracking by oversampling local features. IEEE Trans. Pattern Anal. Mach. Intell. 36(12), 2538–2551 (2014)
  36. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet Classification with Deep Convolutional Neural Networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
  37. Girshick, R.: Fast R-CNN. arXiv:1504.08083 [cs.CV] (2015)
  38. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In: Advances in Neural Information Processing Systems (NIPS), vol. 28 (2015)
  39. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
  40. Chen, L., Zhou, F., Shen, Y., Tian, X., Ling, H., Chen, Y.: Illumination insensitive efficient Second-Order minimization for planar object tracking. ICRA (2017)
  41. Tan, D.J., Ilic, S.: Multi-forest tracker: a chameleon in tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1202–1209 (2014)
  42. Tan, D.J., Tombari, F., Ilic, S., Navab, N.: A versatile learning-based 3d temporal tracker: Scalable, robust, online. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 693–701 (2015)
  43. Tan, D.J., Navab, N., Tombari, F.: Looking beyond the Simple Scenarios: Combining Learners and Optimizers in 3D Temporal Tracking. In: IEEE Transactions on Visualization and Computer Graphics (2017)
  44. Redmon, J., Farhadi, A.: YOLO9000: Better, Faster, Stronger. arXiv:1612.08242 (2016)
  45. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process. Lett. 23(10), 1499–1503 (2016)
  46. Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-speed tracking with Kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 37, 583–596 (2015)
  47. Hare, S., Golodetz, S., Saffari, A., Vineet, V., Cheng, M.M., Hicks, S.L., Torr, P.H.S.: Struck: Structured output tracking with Kernels. IEEE Trans. Pattern Anal. Mach. Intell. 38(10), 2096–2109 (2016)

Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  1. Advanced Mining Technology Center, Universidad de Chile, Santiago, Chile
  2. Department of Electrical Engineering, Universidad de Chile, Santiago, Chile
