
Scale-Insensitive MSER Features: A Promising Tool for Meaningful Segmentation of Images


Part of the book series: Intelligent Systems Reference Library (ISRL, volume 145)

Abstract

Automatic annotation of image contents can be performed more efficiently if it is supported by reliable segmentation algorithms which extract, as accurately as possible, areas with a certain level of semantic uniformity, on top of the default pictorial uniformity of the extracted regions. Obviously, the results should be insensitive to noise, textures, and other effects typically distorting such uniformities. This chapter discusses a segmentation technique based on SIMSER (scale-insensitive maximally stable extremal regions) features, which are a generalization of the popular MSER features. Promising conformity (at least in selected applications) of such segmentation results with semantic image interpretation is shown. Additionally, the approach has a relatively low computational complexity (\(O(n\log n)\) or \(O(n\log n\times\log(\log n))\), where n is the image resolution), which makes it prospectively instrumental in real-time applications and/or in low-cost mobile devices. First, the chapter presents the fundamentals of the SIMSER detector (and of the original MSER detector) in gray-level images. Then, the relations between semantics-based image annotation and SIMSER features are investigated and illustrated by extensive experiments (including color images, which are the main area of interest).


Notes

  1. http://www.robots.ox.ac.uk/~vgg/research/affine/.

References

  1. Hanbury, A.: A survey of methods for image annotation. J. Vis. Lang. Comput. 19, 617–627 (2008)

  2. Liu, Y., Zhang, D., Lu, G., Ma, W.-Y.: A survey of content-based image retrieval with high-level semantics. Pattern Recogn. 40, 262–282 (2007)

  3. Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57, 137–154 (2004)

  4. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Ghahramani, Z., et al. (eds.) Advances in Neural Information Processing Systems 27 (NIPS 2014), pp. 487–495. Curran Associates, Inc. (2014)

  5. Pal, N.R., Pal, S.K.: A review on image segmentation techniques. Pattern Recogn. 26, 1277–1294 (1993)

  6. Zaitoun, N.M., Aqel, M.J.: Survey on image segmentation techniques. Procedia Comput. Sci. 65, 797–806 (2015)

  7. Śluzek, A.: Local Detection and Identification of Visual Data: Selected Techniques and Applications. LAP, Saarbrücken (2013)

  8. Belaid, L.J., Mourou, W.: Image segmentation: a watershed transformation algorithm. Image Anal. Stereol. 28(2), 93–102 (2009)

  9. Thoma, M.: A survey of semantic segmentation. https://arxiv.org/pdf/1602.06541. Accessed 27 Apr 2017

  10. Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide baseline stereo from maximally stable extremal regions. In: Proceedings of the British Machine Vision Conference BMVC 2002, pp. 384–393 (2002)

  11. Wu, Z., Ke, Q., Isard, M., Sun, J.: Bundling features for large scale partial-duplicate web image search. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition CVPR 2009, vol. 1, pp. 25–32 (2009)

  12. Donoser, M., Bischof, H.: Efficient maximally stable extremal region (MSER) tracking. In: Proceedings of the 2006 IEEE Conference on Computer Vision and Pattern Recognition CVPR 2006, vol. 1, pp. 553–560 (2006)

  13. Gómez, L., Karatzas, D.: MSER-based real-time text detection and tracking. In: Proceedings of the 22nd International Conference on Pattern Recognition ICPR 2014, pp. 3110–3115 (2014)

  14. Nistér, D., Stewénius, H.: Linear time maximally stable extremal regions. In: Proceedings of the 10th European Conference on Computer Vision ECCV 2008, vol. 2, pp. 183–196 (2008)

  15. Salahat, E., Saleh, H., Sluzek, A., Al-Qutayri, M., Mohammad, B., Elnaggar, M.: Architecture and method for real-time parallel detection and extraction of maximally stable extremal regions (MSERs). US Patent 9,311,555, 12 Apr 2016

  16. Salahat, E., Saleh, H., Sluzek, A., Al-Qutayri, M., Mohammad, B., Elnaggar, M.: Hardware architecture for real-time extraction of maximally stable extremal regions (MSERs). US Patent 9,489,578, 8 Nov 2016

  17. Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1615–1630 (2005)

  18. Forssén, P.-E., Lowe, D.G.: Shape descriptors for maximally stable extremal regions. In: Proceedings of the 11th IEEE International Conference on Computer Vision ICCV 2007, pp. 1–8 (2007)

  19. Kimmel, R., Zhang, C., Bronstein, A.M., Bronstein, M.M.: Are MSER features really interesting? IEEE Trans. Pattern Anal. Mach. Intell. 33, 2316–2320 (2011)

  20. Martins, P., Carvalho, P., Gatta, C.: On the completeness of feature-driven maximally stable extremal regions. Pattern Recogn. Lett. 74, 9–16 (2016)

  21. Śluzek, A.: Improving performances of MSER features in matching and retrieval tasks. In: Proceedings of the 14th European Conference on Computer Vision ECCV 2016, LNCS 9915, pp. 759–770 (2016)

  22. Śluzek, A., Saleh, H.: Algorithmic foundations for hardware implementation of scale-insensitive MSER features. In: Proceedings of the 59th International Midwest Symposium on Circuits and Systems MWSCAS 2016, pp. 1–4 (2016)

  23. Donoser, M., Bischof, H., Wiltsche, M.: Color blob segmentation by MSER analysis. In: Proceedings of the IEEE International Conference on Image Processing ICIP 2006, pp. 757–760 (2006)

  24. Gui, Y., Zhang, X., Shang, Y.: SAR image segmentation using MSER and improved spectral clustering. EURASIP J. Adv. Sig. Process. 83 (2012)

  25. Oh, I.S., Lee, J., Majumder, A.: Multi-scale image segmentation using MSER. In: Proceedings of the 15th International Conference CAIP 2013, vol. II, pp. 201–208 (2013)

  26. Wang, G., Gao, K., Zhang, Y., Li, J.: Efficient perceptual region detector based on object boundary. In: Proceedings of the 22nd International Conference on Multimedia Modeling MMM 2016, vol. II, pp. 68–78 (2016)

  27. Li, H., Cai, J., Nguyen, T.N.A., Zheng, J.: A benchmark for semantic image segmentation. In: Proceedings of the IEEE International Conference on Multimedia and Expo ICME 2013 (2013)

  28. Lindeberg, T.: Feature detection with automatic scale selection. Int. J. Comput. Vis. 30, 77–116 (1998)

  29. Śluzek, A.: MSER and SIMSER regions: a link between local features and image segmentation. In: Proceedings of the International Conference on Computer Graphics and Digital Image Processing CGDIP 2017, Article 15 (2017)


Acknowledgements

Some results presented in this chapter have been supported by the ATIC-SRC Center within the Energy Efficient Electronic Systems contract 2013-HJ-2440 for the task "A Low-Power System-on-Chip Detector and Descriptor of Visual Keypoints for Video Surveillance Applications".

Author information

Corresponding author

Correspondence to Andrzej Śluzek.


Appendix

The Appendix contains details of the computational steps in SIMSER detection, focusing on prospective hardware or hardware-supported implementations. Such details, however, cannot be fully explained without an insight into the detection of MSER features. Thus, the included information (a summary of the results presented in [22]) covers the most important facts on architectures used in both MSER and SIMSER detection, as well as on architectures specifically proposed for SIMSER detection only.

Detection of local minima in the threshold space

At each threshold level, the binary image of \(M \times N\) size is represented by three data structures:

  • Seed matrix of regions SM (of the same size as the image), with the initial content \(SM_{i,j}= M\times (i-1)+j\), i.e. each pixel is a seed for itself. After processing, \(SM_{i,j}=K\), where K indicates the initial pixel (seed) of the region to which pixel (i, j) belongs.

  • Region Size matrix RS (of the same size), specifying the size of the region to which each pixel (i, j) belongs. Initially, \(RS_{i,j}=1\), i.e. each pixel is a separate region of unit size.

  • Map-of-regions array, which for each image region lists its seed, the binary color and the size.

A small binary image and the final contents of its SM and RS matrices are shown in Fig. 3.8, while its Map-of-regions is given in Table 3.2.

Fig. 3.8 A small binary region and its final SM and RS matrices

Table 3.2 Map-of-regions for Fig. 3.8 image
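As a minimal software sketch of these structures (the hardware architectures of [22] are organized differently), the following Python function builds SM, RS and the map-of-regions for a binary image with a union-find pass over 4-connected pixels of equal color; the seed of each region is taken here, for simplicity, as the smallest row-major pixel index it contains.

```python
import numpy as np

def region_maps(binary):
    """Build the seed matrix SM, the region-size matrix RS and the
    map-of-regions for a binary image, grouping 4-connected pixels of
    equal color.  A software analogue of the structures described above."""
    M, N = binary.shape
    parent = np.arange(M * N)              # union-find forest, one node per pixel

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # keep the smaller index as the root, so the root is the region seed
            if ra < rb:
                parent[rb] = ra
            else:
                parent[ra] = rb

    for i in range(M):
        for j in range(N):
            idx = i * N + j
            if j + 1 < N and binary[i, j] == binary[i, j + 1]:
                union(idx, idx + 1)
            if i + 1 < M and binary[i, j] == binary[i + 1, j]:
                union(idx, idx + N)

    SM = np.empty((M, N), dtype=int)
    RS = np.zeros((M, N), dtype=int)
    sizes = {}
    for i in range(M):
        for j in range(N):
            seed = int(find(i * N + j))
            SM[i, j] = seed
            sizes[seed] = sizes.get(seed, 0) + 1
    for i in range(M):
        for j in range(N):
            RS[i, j] = sizes[SM[i, j]]

    # map-of-regions: seed -> (binary color, region size)
    regions = {seed: (int(binary[seed // N, seed % N]), size)
               for seed, size in sizes.items()}
    return SM, RS, regions
```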

Given such representations for the sequence of binary regions over three neighboring threshold levels (note that such regions are always nested), the local minima of the \(q_Q\) (see Eq. 3.2) and \(qt_Q\) (see Eq. 3.4) growth-rate functions can be straightforwardly identified. In other words, MSER regions can be detected, or SIMSER candidates (i.e. the regions which satisfy the local-minimum criterion in the threshold space) can be pre-selected.
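Since Eqs. 3.2 and 3.4 are not reproduced in this appendix, the short sketch below assumes the classic MSER stability measure \(q_Q(t) = (|Q_{t+\Delta}| - |Q_{t-\Delta}|)/|Q_t|\); it only illustrates how the local minima over the threshold levels would be picked out once the nested region sizes are read from the RS matrices.

```python
def threshold_space_minima(sizes, delta=1):
    """Given the sizes |Q_t| of one nested region tracked over consecutive
    threshold levels t, return the levels at which the growth rate attains
    a local minimum (MSER detections / SIMSER pre-selections).
    Assumes the classic MSER measure q(t) = (|Q_{t+d}| - |Q_{t-d}|) / |Q_t|;
    the chapter's Eqs. 3.2 and 3.4 may differ in detail."""
    q = {t: (sizes[t + delta] - sizes[t - delta]) / sizes[t]
         for t in range(delta, len(sizes) - delta)}
    return [t for t in q
            if t - 1 in q and t + 1 in q
            and q[t] < q[t - 1] and q[t] < q[t + 1]]

# Example: a region whose size stabilizes around the middle thresholds
# print(threshold_space_minima([10, 12, 13, 14, 20, 35]))   # -> [2]
```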

Detection of local minima in the scale space

To identify SIMSER blobs, the regions pre-selected as local minima in the threshold space should also be confirmed as local minima in the scale space, i.e. as minima of the second growth-rate function \(qs_Q\) (see Eq. 3.5). To verify this, two operations are needed:

  • The original input image should be repetitively processed by a smoothing filter. This is just a convolution with the filter kernel, i.e. an operation which can be straightforwardly mapped into hardware (a simple software sketch of one smoothing pass is given after this list). Its computational complexity is O(n).

  • The correspondences between binary regions in the neighboring scales should be established and, based on them, the values of the \(qs_Q\) growth-rate function evaluated. This is not a straightforward operation because binary regions over a sequence of scales often do not nest (a simple example is shown in Fig. 3.9).
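The following Python sketch shows one such smoothing pass; repeating it yields the consecutive scale levels at a fixed O(n) cost per level. The 3×3 binomial kernel is an assumption of this sketch, as the chapter does not prescribe a particular filter.

```python
import numpy as np

def smooth_once(image, kernel=None):
    """One pass of the repetitive smoothing that produces the scale space.
    A fixed small kernel keeps the cost O(n) in the number of pixels and
    the operation hardware-friendly.  The 3x3 binomial kernel used by
    default is an assumption, not taken from the chapter."""
    if kernel is None:
        kernel = np.array([[1, 2, 1],
                           [2, 4, 2],
                           [1, 2, 1]], dtype=float) / 16.0
    padded = np.pad(image.astype(float), 1, mode='edge')
    out = np.zeros(image.shape, dtype=float)
    # accumulate the shifted, weighted copies of the image (plain convolution)
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + image.shape[0],
                                           dj:dj + image.shape[1]]
    return out
```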

Fig. 3.9 An example of not nested (overlapping) black and white regions over two neighboring scales (smoothing removes sharp extremes, both dark and bright)

To solve this problem, the following pseudocode is proposed (a less efficient variant, which nevertheless clearly indicates the O(n) complexity of the algorithm, was given in [21]):

Figure a: Evaluation of the \(qs_Q\) growth-rate function (pseudocode listing)

The scheme takes two binary images (at the same threshold but at the neighboring scales), their SM and RS matrices, and their maps-of-regions (see above). For each binary region at the current scale, the identifier of the corresponding next-scale region is found, and the value of the growth-rate function \(qs_Q\) is evaluated. Therefore, the changes of \(qs_Q\) can be tracked over the scales, and the local minima can be easily found.
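The listing itself is not reproduced here, but a rough software sketch of the same idea follows: each region of the current scale is matched to the next-scale region with the largest pixel overlap (the MX intersection sizes of Table 3.3). Since Eq. 3.5 is not restated in this appendix, the relative size of the symmetric difference is used below merely as a stand-in for \(qs_Q\).

```python
import numpy as np

def scale_correspondences(SM_curr, SM_next):
    """Match each region of the current scale (identified by its seed in SM)
    to the next-scale region with the largest pixel overlap, and evaluate a
    growth-rate value for the pair.  The exact form of qs_Q (Eq. 3.5) is not
    given in this appendix, so the relative size of the symmetric difference
    is used here as a placeholder."""
    result = {}
    for seed in np.unique(SM_curr):
        mask = (SM_curr == seed)
        # MX: intersection sizes with every next-scale region overlapping this one
        next_seeds, mx = np.unique(SM_next[mask], return_counts=True)
        best = int(next_seeds[np.argmax(mx)])
        size_curr = int(mask.sum())
        size_next = int((SM_next == best).sum())
        qs = (size_curr + size_next - 2 * int(mx.max())) / size_curr
        result[int(seed)] = {'next_region': best,
                             'intersection': int(mx.max()),
                             'qs': qs}
    return result
```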

In this way, all operations needed to identify SIMSER features are completed.

As an example, a pair of binary images from two neighboring scales is shown in Fig. 3.10, and the corresponding results of the above operations are included in Table 3.3. In this example, Region 4 has the best chance of being a local minimum (it has the smallest value of \(qs_Q\)). To confirm this, however, the similarly computed values of \(qs_Q\) for Region C (which is the correspondent of Region 4 in the next scale) and for the corresponding region in the previous scale should be larger (Fig. 3.11).

Fig. 3.10 Computing the \(qs_Q\) growth-rate function in the scale space; the left image is at the current scale, the right one at the next scale

Table 3.3 \(qs_Q\) processing for Fig. 3.10 images. MX values are the region intersection sizes

Altogether, it can be concluded that the SIMSER detection architecture is a relatively simple extension of the MSER detection architecture, so that hardware implementation of the SIMSER detector is a feasible task.

Fig. 3.11 Original images (a, b, c) and the SIMSER-based segmentation results obtained from: grey-level copies (d, e, f), red (g, h, i), green (j, k, l) and blue (m, n, o) channels


Copyright information

© 2018 Springer International Publishing AG

About this chapter


Cite this chapter

Śluzek, A. (2018). Scale-Insensitive MSER Features: A Promising Tool for Meaningful Segmentation of Images. In: Kwaśnicka, H., Jain, L. (eds) Bridging the Semantic Gap in Image and Video Analysis. Intelligent Systems Reference Library, vol 145. Springer, Cham. https://doi.org/10.1007/978-3-319-73891-8_3


  • DOI: https://doi.org/10.1007/978-3-319-73891-8_3


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73890-1

  • Online ISBN: 978-3-319-73891-8

  • eBook Packages: Engineering, Engineering (R0)
