
International Journal of Computer Vision, Volume 117, Issue 1, pp 48–69

Online Approximate Model Representation Based on Scale-Normalized and Fronto-Parallel Appearance

  • Kiho Kwak
  • Jun-Sik Kim
  • Daniel F. Huber
  • Takeo Kanade

Abstract

Various object representations have been widely used for tasks such as object detection, recognition, and tracking. Most of them require an intensive training process on a large database collected in advance, and it is hard to add a model of a previously unobserved object that is not in the database. In this paper, we investigate how to create a representation of a new, unknown object online, and how to apply it to practical applications such as object detection and tracking. To make this viable, we use a sensor fusion approach combining a camera and a single-line scan LIDAR. The proposed representation consists of an approximate geometry model and a viewpoint- and scale-invariant appearance model, which makes it extremely simple to match the model against observations. This property makes it possible to model a new object online, and provides robustness to viewpoint variation and occlusion. The representation has the benefits of both an implicit model (referred to as a view-based model) and an explicit model (referred to as a shape-based model). Extensive experiments on synthetic and real data demonstrate the viability of the proposed object representation for both modeling and detecting/tracking objects.
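The core geometric idea behind a scale-normalized, fronto-parallel appearance model can be illustrated with a short sketch. The snippet below is not the authors' implementation; it is a minimal illustration of two standard ingredients such a model can build on: (1) a scale-normalization factor that uses a LIDAR range measurement to resample an image patch at a fixed pixels-per-meter resolution, and (2) the pure-rotation homography H = K Rᵀ K⁻¹ that rewarps the image into a virtual camera whose optical axis is aligned with the surface normal, so a planar surface appears fronto-parallel. The function names and parameters are hypothetical.

```python
import numpy as np

def scale_normalization_factor(depth_m, focal_px, px_per_meter):
    # A span of w pixels at LIDAR depth d covers w*d/f meters, so
    # resampling by s = px_per_meter * d / f yields a patch with a
    # fixed metric resolution regardless of the object's distance.
    return px_per_meter * depth_m / focal_px

def fronto_parallel_homography(K, normal):
    # Build a rotation R with R @ [0,0,1] = n (Rodrigues form), i.e. a
    # virtual camera whose optical axis is the unit surface normal n.
    # For a purely rotated camera, original image points map into the
    # virtual (fronto-parallel) view by the homography H = K R^T K^-1.
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(z, n)          # rotation axis (unnormalized)
    c = float(z @ n)            # cosine of rotation angle
    if np.linalg.norm(v) < 1e-12:
        R = np.eye(3)           # normal already along the optical axis
    else:
        vx = np.array([[0.0, -v[2], v[1]],
                       [v[2], 0.0, -v[0]],
                       [-v[1], v[0], 0.0]])
        R = np.eye(3) + vx + vx @ vx * (1.0 / (1.0 + c))
    return K @ R.T @ np.linalg.inv(K)
```

As a sanity check, a scene point lying along the normal direction projects, after warping by H, to the principal point of the virtual camera, and `scale_normalization_factor(10.0, 800.0, 80.0)` is exactly 1.0 (an 800-pixel-focal-length camera at 10 m already delivers 80 pixels per meter).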

Keywords

Object modeling · Approximate model representation · Sensor fusion · Object detection and tracking

Supplementary material

Supplementary material 1 (wmv 6346 KB)


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Kiho Kwak (1)
  • Jun-Sik Kim (2)
  • Daniel F. Huber (3)
  • Takeo Kanade (3)
  1. Agency for Defense Development, Daejeon, South Korea
  2. Korea Institute of Science and Technology, Seoul, South Korea
  3. Carnegie Mellon University, Pittsburgh, USA
