Low-dimensional superpixel descriptor and its application in visual correspondence estimation
Abstract
Establishing local visual correspondence between video frames is an important and challenging problem in many vision-based applications. A typical approach is pixel-level matching based on local keypoint detection and description. Unlike traditional keypoint-descriptor methods, this paper proposes a comprehensive yet low-dimensional local feature descriptor based on superpixels generated by over-segmentation. The proposed descriptor extracts shape, texture, and color features from each superpixel using the orientated center-boundary distance (OCBD), the gray-level co-occurrence matrix (GLCM), and the saturation histogram (SHIST), respectively. It therefore covers more feature types than existing descriptors, which extract only one specific kind of feature. Experimental results on the widely used Middlebury optical flow dataset show that the proposed superpixel descriptor achieves three times the accuracy of the state-of-the-art ORB descriptor, which has the same feature dimension. In addition, because its dimension is low, the proposed descriptor is convenient for matching and memory-efficient for hardware implementation.
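As a rough illustration of the three feature types named in the abstract, the following Python sketch computes a simplified shape, texture, and color feature for a single superpixel. It is not the authors' implementation: the function names, bin counts, number of gray levels, ray directions, and the choice of contrast and energy as GLCM statistics are all assumptions made for illustration, and the shape feature is a plain center-to-boundary ray distance that omits whatever orientation normalization OCBD applies.

```python
import colorsys
import math


def saturation_histogram(pixels, bins=8):
    """Normalized histogram of HSV saturation over RGB pixels (0-255).

    Hypothetical stand-in for the color feature (SHIST); bin count is assumed.
    """
    hist = [0.0] * bins
    for r, g, b in pixels:
        _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(s * bins), bins - 1)] += 1.0
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def glcm_features(gray, mask, levels=8, offset=(0, 1)):
    """Contrast and energy of a symmetric GLCM restricted to the superpixel.

    gray: 2D list of 0-255 intensities; mask: 2D list of booleans marking
    the superpixel. Only pixel pairs lying fully inside the mask are counted.
    """
    dy, dx = offset
    h, w = len(gray), len(gray[0])
    glcm = [[0.0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w and mask[y][x] and mask[y2][x2]:
                i = gray[y][x] * levels // 256
                j = gray[y2][x2] * levels // 256
                glcm[i][j] += 1.0
                glcm[j][i] += 1.0  # count both orders to keep it symmetric
    total = sum(map(sum, glcm))
    if total == 0:
        return 0.0, 0.0
    contrast = sum((i - j) ** 2 * glcm[i][j] / total
                   for i in range(levels) for j in range(levels))
    energy = sum((glcm[i][j] / total) ** 2
                 for i in range(levels) for j in range(levels))
    return contrast, energy


def center_boundary_distances(mask, directions=8):
    """Steps from the centroid to the superpixel boundary along evenly spaced rays.

    Simplified stand-in for the shape feature: no orientation alignment is
    applied, and a centroid outside the mask yields distance 0 on every ray.
    """
    coords = [(y, x) for y, row in enumerate(mask)
              for x, inside in enumerate(row) if inside]
    if not coords:
        return [0.0] * directions
    h, w = len(mask), len(mask[0])
    cy = sum(y for y, _ in coords) / len(coords)
    cx = sum(x for _, x in coords) / len(coords)
    dists = []
    for k in range(directions):
        angle = 2.0 * math.pi * k / directions
        step = 0
        while True:
            y = round(cy + step * math.sin(angle))
            x = round(cx + step * math.cos(angle))
            if not (0 <= y < h and 0 <= x < w and mask[y][x]):
                break
            step += 1
        dists.append(float(step))
    return dists
```

Concatenating the three outputs gives one short vector per superpixel, which could then be compared across frames with an ordinary distance measure; how the paper weights or quantizes the components is not stated in the abstract.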
Keywords
Superpixel descriptor, Low-dimensional feature, Visual correspondence estimation
Acknowledgements
This work was supported by KAKENHI (16K13006) and Waseda University Grant for Special Research Projects (2017B-261).