Journal of Real-Time Image Processing, Volume 14, Issue 3, pp 685–699

Toward a smart camera for fast high-level structure extraction

  • Roberto de Lima
  • Jose Martinez-Carranza
  • Alicia Morales-Reyes
  • Walterio Mayol-Cuevas
Special Issue Paper


This paper presents an initial framework for extracting high-level structures from man-made environments, using a novel methodology that combines stereo vision, binary descriptors and parallel processing implemented on a GPU. High-level structures such as planes, spheres and cubes provide vital information about the world, which is essential for applications in robotics, augmented reality and computer vision. However, their extraction poses several computational challenges, especially because these applications must meet real-time and operating-environment constraints. Stereo vision-based approaches have been proposed, but they do not achieve real-time performance because they require a rectification stage that runs on every frame, which increases the computational burden. In contrast to typical stereo algorithms, the proposed methodology is built on a semi-calibrated stereo rig, which means that the rectification stage is avoided. The computation saved can then be invested in critical stages, and the whole process achieves a frame rate of up to 50 fps.
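The abstract names binary descriptors as a building block but does not say how they are compared between the two stereo views. As a rough illustration only, binary descriptors are commonly matched by Hamming distance (the number of differing bits). The sketch below assumes descriptors stored as Python integers, brute-force search and a mutual-consistency check; the function names `hamming` and `match_descriptors` and the `max_distance` threshold are hypothetical and are not taken from the paper, which runs its pipeline on a GPU.

```python
from typing import List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


def match_descriptors(
    left: List[int], right: List[int], max_distance: int
) -> List[Tuple[int, int, int]]:
    """Brute-force matching of binary descriptors (illustrative assumption).

    Returns (left_index, right_index, distance) triples for pairs that are
    mutual nearest neighbours and no further apart than max_distance bits.
    """

    def nearest(query: int, candidates: List[int]) -> Optional[int]:
        # Index of the candidate with the smallest Hamming distance to query.
        if not candidates:
            return None
        return min(range(len(candidates)),
                   key=lambda j: hamming(query, candidates[j]))

    matches = []
    for i, descriptor in enumerate(left):
        j = nearest(descriptor, right)
        if j is None:
            continue
        # Cross-check: the right descriptor must also pick i as its best match.
        if nearest(right[j], left) != i:
            continue
        distance = hamming(descriptor, right[j])
        if distance <= max_distance:
            matches.append((i, j, distance))
    return matches


if __name__ == "__main__":
    # Toy 8-bit descriptors; real binary descriptors are typically 256 bits or more.
    left_view = [0b10110010, 0b01001101, 0b11110000]
    right_view = [0b01001111, 0b10110011, 0b00001111]
    print(match_descriptors(left_view, right_view, max_distance=2))
```

Each left descriptor is compared against every right descriptor independently, so this kind of search parallelises naturally, which is one reason binary descriptors suit GPU implementations.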


Keywords: Real-time plane extraction, Smart camera, GPU, Binary descriptors, Semi-calibrated stereo



The first author is supported by the Mexican National Council for Science and Technology (CONACyT) studentship number 627047. The second author is thankful for the support received through his Royal Society-Newton Advanced Fellowship with reference NA140454.



Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Optics and Electronics, National Institute of Astrophysics, Puebla, Mexico
  2. University of Bristol, Bristol, UK
