Abstract
Direct camera tracking is a popular tool for motion estimation. It promises more precise estimates, enhanced robustness, and efficient, denser reconstruction. However, most direct tracking algorithms rely on the brightness constancy assumption, which is seldom satisfied in the real world. As a result, direct tracking is unsuitable under sudden and arbitrary illumination changes. In this work, we propose a non-parametric approach to address illumination variations in direct tracking. Instead of modeling illumination, or relying on difficult-to-optimize robust similarity metrics, we propose to directly minimize the squared distance between densely evaluated local feature descriptors. Our approach is shown to perform well in terms of robustness and runtime. The algorithm is evaluated on two direct tracking problems, template tracking and direct visual odometry, using a variety of feature descriptors proposed in the literature.
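To make the central idea concrete, the following is a minimal sketch of the cost being described: a sum of squared differences taken over dense descriptor channels rather than raw intensities. The descriptor used here (one binary channel per 3x3 neighbour comparison, in the style of a census transform) is an assumption chosen for illustration, as are the names `dense_descriptor` and `ssd`; the abstract does not fix a particular descriptor, and the sketch only evaluates the cost, it does not perform the tracking optimization itself.

```python
import numpy as np

# Offsets (row, col) of the 8 neighbours in a 3x3 window.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def dense_descriptor(image):
    """Illustrative dense descriptor (an assumption, not the paper's definition).

    Returns an (8, H, W) array: channel k is 1.0 where the k-th neighbour
    is strictly brighter than the centre pixel, else 0.0.
    """
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")  # replicate border pixels
    channels = []
    for dr, dc in NEIGHBOURS:
        neighbour = padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
        channels.append((neighbour > img).astype(np.float64))
    return np.stack(channels, axis=0)


def ssd(a, b):
    """Sum of squared differences between two equally shaped arrays."""
    return float(np.sum((np.asarray(a, dtype=np.float64) - b) ** 2))


# Synthetic integer-valued image so that all arithmetic below is exact.
rng = np.random.default_rng(0)
reference = rng.integers(0, 101, size=(32, 32)).astype(np.float64)

# Same scene under a global gain and bias change (violates brightness constancy).
relit = 2.0 * reference + 40.0

# Same illumination, but shifted by one pixel (a genuine misalignment).
shifted = np.roll(reference, 1, axis=1)

raw_cost_relit = ssd(reference, relit)
desc_cost_relit = ssd(dense_descriptor(reference), dense_descriptor(relit))
desc_cost_shifted = ssd(dense_descriptor(reference), dense_descriptor(shifted))

print("raw SSD, relit image:         ", raw_cost_relit)
print("descriptor SSD, relit image:  ", desc_cost_relit)
print("descriptor SSD, shifted image:", desc_cost_shifted)
```

In this sketch the raw intensity cost is large for the relit image even though nothing has moved, while the descriptor cost ignores the gain and bias change and still responds to the one-pixel shift. A tracker built on this idea would minimize the descriptor cost over the warp or pose parameters.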
Acknowledgement
We thank the anonymous reviewers for their valuable comments.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Alismail, H., Browning, B., Lucey, S. (2017). Enhancing Direct Camera Tracking with Dense Feature Descriptors. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_33
DOI: https://doi.org/10.1007/978-3-319-54190-7_33
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54189-1
Online ISBN: 978-3-319-54190-7
eBook Packages: Computer Science, Computer Science (R0)