Advertisement

Multimedia Tools and Applications

, Volume 73, Issue 3, pp 1247–1268 | Cite as

Semantic superpixel extraction via a discriminative sparse representation

  • Yurui XieEmail author
  • Chao Huang
  • Linfeng Xu
Article

Abstract

The superpixel extraction algorithm is becoming increasingly significant for pattern recognition applications. Different superpixel generation methods have different properties and lead to various over-segmentation results. In this paper, we treat the over-segmentation as an image decomposition problem, and propose a novel discriminative sparse coding (DSC) algorithm to effectively extract the semantic superpixels. Specifically, the DSC algorithm incorporates a new discriminative regularization term in the traditional sparse representation model. Then the new regularization term is combined with the reconstruction error and sparse constraint to form a unified objective function. The extracted superpixels not only respect the local image boundaries, but also are dissimilar between each other. Meanwhile, the quantity of segments is sparse. These properties benefit for the semantic superpixel extraction. The final refined superpixels are generated based on an effective Bayesian-classification criterion in a post-processing step. Experimental results show that the over-segmentation quality of DSC algorithm outperforms the state of the art methods.

Keywords

Superpixel Sparse representation Multi-scale 

Notes

Acknowledgements

This work was partially supported by NSFC (No. 61179060, and 61101091), National High Technology Research and Development Program of China (863 Program, No. 2012AA011503), and Fundamental Research Funds for the Central Universities (ZYGX2012J019).

References

  1. 1.
    Amri S, Barhoumi W, Zagrouba E (2010) A robust framework for joint background/foreground segmentation of complex video scenes filmed with freely moving camera. Multimed Tools Appl 46:175–205CrossRefGoogle Scholar
  2. 2.
    Ayvaci A, Soatto S (2009) Motion segmentation with occlusions on the superpixel graph. In: IEEE International Conference on Computer Vision, ICCV 2009, pp 727–734Google Scholar
  3. 3.
    Chen Y, Chan A, Wang G (2012) Adaptive figure-ground classification. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 2012, pp 654–661Google Scholar
  4. 4.
    Dong W, Zhang D, Shi G (2011) Centralized sparse representation for image restoration. In: IEEE International Conference on Computer Vision, ICCV 2011, pp 1259–1266Google Scholar
  5. 5.
    Donoho DL (2004) For most large underdetermined systems of equations, the minimal l1-norm near-solution approximates the sparsest near-solution. In: Communications on pure and applied mathematics, pp 907–934Google Scholar
  6. 6.
    Elad M (2010) Sparse and redundant representations: from theory to appplications in signal and image processing. SpringerGoogle Scholar
  7. 7.
    Elad M, Figueiredo MAT, Ma Y (2010) On the role of sparse and redundant representations in image processing. Proc IEEE 98:972–982CrossRefGoogle Scholar
  8. 8.
    Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comput Vis 59:167–181CrossRefGoogle Scholar
  9. 9.
    Fragkiadaki K, Zhang G, Shi J (2012) Video segmentation by tracing discontinuities in a trajectory embedding. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 2012, pp 1846–1853Google Scholar
  10. 10.
    Fulkerson B, Vedaldi A, Soatto S (2009) Class segmentation and object localization with superpixel neighborhoods. In: IEEE International Conference on Computer Vision, ICCV 2009, pp 670–677Google Scholar
  11. 11.
    Gkalelis N, Mezaris V, Kompatsiaris I, Stathaki T (2013) Mixture subclass discriminant analysis link to restricted gaussian model and other generalizations. IEEE Trans Neural Netw Learn Syst 24:8–21CrossRefGoogle Scholar
  12. 12.
    Golub GH, Hansen PC, O’Leary DP (1999) Tikhonov regularization and total least squares. SIAM J Matrix Anal Appl 21:185–194CrossRefzbMATHMathSciNetGoogle Scholar
  13. 13.
    Huang K, Aviyente S (2006) Sparse representation for signal classiffication. In: Adv. NIPS, pp 609–616Google Scholar
  14. 14.
    Huang S, Lee Y, Bell G, Ou Z (2010) An efficient segmentation algorithm for captchas with line cluttering and character warping. Multimed Tools Appl 48:267–289CrossRefGoogle Scholar
  15. 15.
    Jing G, Shi Y, Kong D, Ding W, Yin B (2012) Image super-resolution based on multi-space sparse representation. Multimed Tools Appl. doi: 10.1007/s11042-011-0953-4
  16. 16.
    Kalantidis Y, Tolias G, Avrithis Y, Phinikettos M, Spyrou E, Mylonas P, Kollias S (2011) Viral: visual image retrieval and localization. Multimed Tools Appl 51:555–592CrossRefGoogle Scholar
  17. 17.
    Kawulok M (2010) Energy-based blob analysis for improving precision of skin segmentation. Multimed Tools Appl 49:463–481CrossRefGoogle Scholar
  18. 18.
    Lee H, Battle A, Raina R, Ng AY (2006) Efficient sparse coding algorithms. In: NIPS, pp 801–808Google Scholar
  19. 19.
    Levinshtein A, Dickinson S, Sminchisescu C (2009) Multiscale symmetric part detection and grouping. In: IEEE International Conference on Computer Vision, ICCV 2009, pp 2162–2169Google Scholar
  20. 20.
    Levinshtein A, Stere A, Kutulakos K, Fleet D, Dickinson S, Siddiqi K (2009) Turbopixels: fast superpixels using geometric flows. IEEE Trans Pattern Anal Mach Intell 31:2290–2297CrossRefGoogle Scholar
  21. 21.
    Li H, Ngan KN (2007) Unsupervised video segmentation with low depth of field. IEEE Trans Circuits Syst Video Technol 17:1742–1751CrossRefGoogle Scholar
  22. 22.
    Li H, Ngan K (2008) Saliency model based face segmentation in head-and-shoulder video sequences. J Vis Commun Image Represent 19:320–333CrossRefGoogle Scholar
  23. 23.
    Li H, Ngan K, Liu Q (2009) Faceseg: automatic face segmentation for real-time video. IEEE Trans Multimedia 11:77–88CrossRefGoogle Scholar
  24. 24.
    Li H, Ngan KN (2011) A co-saliency model of image pairs. IEEE Trans Image Process 20:3365–3375CrossRefMathSciNetGoogle Scholar
  25. 25.
    Li H, Ngan K (2011) Learning to extract focused objects from low dof images. IEEE Trans Circuits Syst Video Technol 21:1571–1580CrossRefzbMATHGoogle Scholar
  26. 26.
    Li Z, Wu X, Chang S (2012) Segmentation using superpixels: a bipartite graph partitioning approach. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 2012, pp 789–796Google Scholar
  27. 27.
    Liu Q, Han T, Sun Y, Chu Z, Shen B (2012) A two step salient objects extraction framework based on image segmentation and saliency detection. Multimed Tools Appl 67(1):231–247Google Scholar
  28. 28.
    Mairal J, Bach F, Ponce J, Sapiro G, Zisserman A (2009) Non-local sparse models for image restoration. In: IEEE International Conference on Computer Vision, ICCV 2009, pp 2272–2279Google Scholar
  29. 29.
    Martin D, Fowlkes C, Tal D, Malik J (2001) A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: IEEE International Conference on Computer Vision, ICCV 2001, pp 416–423Google Scholar
  30. 30.
    Meng F, Li H, Liu G, Ngan KN (2012) Object co-segmentation based on shortest path algorithm and saliency model. IEEE Trans Multimedia 14:1429–1441CrossRefGoogle Scholar
  31. 31.
    Nowozin S, Gehler PV, Lampert CH (2010) On parameter learning in crf-based approaches to object class image segmentation. In: European Conference on Computer Vision, ECCV 2010, pp 98–111Google Scholar
  32. 32.
    Olshausen BA, Fieldt DJ (1997) Sparse coding with an overcomplete basis set: a strategy employed by v1? Vis Res 37:3311–3325CrossRefGoogle Scholar
  33. 33.
    Pati YC, Rezaiifar R, Rezaiifar YCPR, Krishnaprasad PS (2012) Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of the 27th annual asilomar conference on signals, systems, and computers, pp. 40–44Google Scholar
  34. 34.
    Radhakrishna A, Appu S, Kevin S, Aurelien L, Pascal F, Sabine S (2010) SLIC Superpixels, EPFL Technical Report no. 149300Google Scholar
  35. 35.
    Ren X, Malik J (2003) Learning a classification model for segmentation. In: IEEE International Conference on Computer Vision, ICCV 2003, pp 10–17Google Scholar
  36. 36.
    Shi J, Malik J (1997) Normalized cuts and image segmentation. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 1997, pp 731–737Google Scholar
  37. 37.
    Shotton J, Winn J, Rother C, Criminisi A (2009) Textonboost for image understanding: multi-class object recognition and segmentation by jointly modeling texture, layout, and context. Int J Comput Vis 81:2–23CrossRefGoogle Scholar
  38. 38.
    Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58:267–288zbMATHMathSciNetGoogle Scholar
  39. 39.
    Tighe J, Lazebnik S (2010) Superparsing: scalable nonparametric image parsing with superpixels. In: European Conference on Computer Vision, ECCV 2010, pp 352–365Google Scholar
  40. 40.
    Tropp J, Wright S (2010) Computational methods for sparse solution of linear inverse problems. Proc IEEE 98:948–958Google Scholar
  41. 41.
    Vazquez-Reina A, Avidan S, Pfister H, Miller E (2010) Multiple hypothesis video segmentation from superpixel flows. In: European Conference on Computer Vision, ECCV 2010, pp 268–281Google Scholar
  42. 42.
    Vedaldi A, Fulkerson B (2008) VLFeat: an open and portable library of computer vision algorithms. http://www.vlfeat.org/
  43. 43.
    Vedaldi A, Soatto S (2008) Quick shift and kernel methods for mode seeking. In: European Conference on Computer Vision, ECCV 2008, pp 705–718Google Scholar
  44. 44.
    Vieux R, BenoisPineau J, Domenger J, Braquelaire A (2012) Segmentation-based multi-class semantic object detection. Multimed Tools Appl 60:305–326CrossRefGoogle Scholar
  45. 45.
    Wright J, Yang A, Ganesh A, Sastry S, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31:210–227CrossRefGoogle Scholar
  46. 46.
    Wright J, Ma Y, Mairal J, Sapiro G, Huang T, Yan S (2010) Sparse representation for computer vision and pattern recognition. Proc IEEE 98:1031–1044CrossRefGoogle Scholar
  47. 47.
    Xu C, Corso J (2012) Evaluation of super-voxel methods for early video processing. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 2012, pp 1202–1209Google Scholar
  48. 48.
    Yang J, Wright J, Huang T, Ma Y (2008) Image super-resolution as sparse representation of raw image patches. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 2008, pp 1–8Google Scholar
  49. 49.
    Yang J, Yu K, Gong Y, Huang T (2009) Linear spatial pyramid matching using sparse coding for image classiffication. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 2009, pp 1794–1801Google Scholar
  50. 50.
    Yang M, Zhang D, Zhang D, Wang S (2012) Relaxed collaborative representation for pattern classiffication. In: IEEE conference on Computer Vision and Pattern Recognition, CVPR 2012, pp 2224–2231Google Scholar
  51. 51.
    Zhang H, Yang J, Zhang Y, Nasrabadi N, Huang T (2011) Close the loop: joint blind image restoration and recognition with sparse representation prior. In: IEEE international Conference on Computer Vision, ICCV 2011, pp 770–777Google Scholar
  52. 52.
    Zhang D, Zhu P, Hu Q, Zhang D (2011) A linear subspace learning approach via sparse coding. In: IEEE international Conference on Computer Vision, ICCV 2011, pp 755–761Google Scholar
  53. 53.
    Zhang D, Yang M, Feng X (2011) Sparse representation or collaborative representation: which helps face recognition? In: IEEE International Conference on Computer Vision (ICCV), ICCV 2011, pp 471–478Google Scholar
  54. 54.
    Zhao J, Ching S, Cheung S (2012) Human segmentation by geometrically fusing visible-light and thermal imageries Multimed Tools Appl. doi: 10.1007/s11042-012-1299-2
  55. 55.
    Zhu Y, Papademetris X, Sinusas A, Duncan J (2010) Segmentation of the left ventricle from cardiac mr images using a subject-specific dynamical model. IEEE Trans Med Imaging 29:669–687CrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. 1.School of Electronic EngineeringUniversity of Electronic Science and Technology of China ChengduChengduChina

Personalised recommendations