Multimedia Tools and Applications, Volume 78, Issue 14, pp 19457–19472

Low-dimensional superpixel descriptor and its application in visual correspondence estimation

  • Songlin Du
  • Takeshi Ikenaga


Establishing local visual correspondence between video frames is an important and challenging problem in many vision-based applications. A typical approach is pixel-level matching based on local keypoint detection and description. Unlike traditional methods built on local keypoint descriptors, this paper proposes a comprehensive yet low-dimensional local feature descriptor based on superpixels generated by over-segmentation. The proposed descriptor extracts a shape feature, a texture feature, and a color feature from each superpixel using the orientated center-boundary distance (OCBD), the gray-level co-occurrence matrix (GLCM), and the saturation histogram (SHIST), respectively. It therefore covers more types of features than existing descriptors, which extract only one specific kind of feature. Experimental results on the widely used Middlebury optical flow dataset show that the proposed superpixel descriptor achieves three times the accuracy of the state-of-the-art ORB descriptor, which has the same feature dimension. In addition, because its dimension is low, the proposed descriptor is convenient for matching and memory-efficient for hardware implementation.
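To make the texture and color components more concrete, the following is a minimal sketch of a per-region descriptor built from a GLCM and a saturation histogram, followed by nearest-neighbor matching. It is an illustration under stated assumptions, not the authors' implementation: the function names, the number of gray levels and histogram bins, the choice of GLCM statistics (contrast, energy, homogeneity), the pixel offset, and the Euclidean matching rule are all hypothetical. The superpixel mask is assumed to come from any over-segmentation method. The shape component (OCBD) is omitted because the abstract does not define it.

```python
import numpy as np

def glcm_features(gray, mask, levels=8, offset=(0, 1)):
    """Contrast, energy and homogeneity of a GLCM restricted to `mask`.

    `levels` and `offset` are illustrative choices, not values from the paper.
    """
    # Quantize 0..255 intensities to `levels` gray levels.
    q = np.clip((gray.astype(np.float64) / 256.0 * levels).astype(np.int64),
                0, levels - 1)
    dy, dx = offset
    h, w = q.shape
    # Index ranges for which both a pixel and its offset neighbor are in the image.
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    a = q[y0:y1, x0:x1]
    b = q[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    # Count only pairs whose two pixels both belong to the region.
    valid = mask[y0:y1, x0:x1] & mask[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (a[valid], b[valid]), 1.0)
    glcm = glcm + glcm.T  # make the matrix symmetric
    total = glcm.sum()
    if total == 0:
        return np.zeros(3)
    p = glcm / total
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    energy = np.sum(p ** 2)
    homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
    return np.array([contrast, energy, homogeneity])

def saturation_histogram(rgb, mask, bins=8):
    """Normalized histogram of HSV saturation over the pixels in `mask`."""
    c = rgb.astype(np.float64) / 255.0
    mx, mn = c.max(axis=2), c.min(axis=2)
    # HSV saturation: (max - min) / max, defined as 0 for black pixels.
    sat = np.where(mx > 0, (mx - mn) / np.where(mx > 0, mx, 1.0), 0.0)
    hist, _ = np.histogram(sat[mask], bins=bins, range=(0.0, 1.0))
    n = hist.sum()
    return hist / n if n > 0 else hist.astype(np.float64)

def describe(rgb, mask):
    """Concatenate texture and color features into one short vector."""
    gray = rgb.astype(np.float64).mean(axis=2)
    return np.concatenate([glcm_features(gray, mask),
                           saturation_histogram(rgb, mask)])

def match(desc_a, desc_b):
    """For each row of `desc_a`, index of the nearest row of `desc_b`."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    return d.argmin(axis=1)
```

With these defaults each region is summarized by 11 numbers (3 texture statistics plus 8 histogram bins), so a brute-force nearest-neighbor search between the regions of two frames stays cheap in both time and memory.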


  • Superpixel descriptor
  • Low-dimensional feature
  • Visual correspondence estimation



This work was supported by KAKENHI (16K13006) and Waseda University Grant for Special Research Projects (2017B-261).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Graduate School of Information, Production and Systems, Waseda University, Kitakyushu, Japan
