Abstract

As early as the nineteenth century, it was observed that humans can effortlessly identify the number of items in the range of 1–4 at a glance.

Notes

  1. http://www.cs.bu.edu/groups/ivc/Subitizing/.

  2. We use the subset of ImageNet images with bounding box annotations.

  3. The F-score is computed as \(\frac{2RP}{R+P}\), where R and P denote recall and precision respectively.

  4. When evaluated on the test set used by [200], our best method GoogleNet_Syn_FT achieves a mAP score of 85.0%.

  5. https://stock.adobe.com.
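
The F-score formula given in the notes, 2RP / (R + P), can be illustrated with a minimal Python sketch. The function name `f_score` and the convention of returning 0 when both recall and precision are 0 are assumptions made for this illustration; the notes do not specify either.

```python
def f_score(recall: float, precision: float) -> float:
    """Return 2RP / (R + P), the harmonic mean of recall R and precision P."""
    if recall + precision == 0:
        # Assumed convention: the notes do not define the R = P = 0 case.
        return 0.0
    return 2 * recall * precision / (recall + precision)


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the formula.
    print(f_score(recall=1.0, precision=0.5))
```

Because the F-score is a harmonic mean, it is pulled toward the smaller of the two values: perfect recall with a precision of 0.5 gives roughly 0.67, not 0.75.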

References

  1. Anoraganingrum, D. Cell segmentation with median filter and mathematical morphology operation. In International Conference on Image Analysis and Processing (1999).

  2. Arteta, C., Lempitsky, V., Noble, J. A., and Zisserman, A. Interactive object counting. In European Conference on Computer Vision (ECCV) (2014).

  3. Atkinson, J., Campbell, F. W., and Francis, M. R. The magic number 4±0: A new look at visual numerosity judgements. Perception 5, 3 (1976), 327–34.

  4. Berg, T. L., and Berg, A. C. Finding iconic images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2009).

  5. Borji, A., Sihite, D. N., and Itti, L. Salient object detection: A benchmark. In European Conference on Computer Vision (ECCV) (2012).

  6. Boysen, S. T., and Capaldi, E. J. The development of numerical competence: Animal and human models. Psychology Press, 2014.

  7. Chan, A. B., Liang, Z.-S., and Vasconcelos, N. Privacy preserving crowd monitoring: Counting people without people models or tracking. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008).

  8. Chan, A. B., and Vasconcelos, N. Bayesian Poisson regression for crowd counting. In IEEE International Conference on Computer Vision (ICCV) (2009).

  9. Chatfield, K., Lempitsky, V., Vedaldi, A., and Zisserman, A. The devil is in the details: an evaluation of recent feature encoding methods. In British Machine Vision Conference (BMVC) (2011).

  10. Cheng, M.-M., Mitra, N. J., Huang, X., Torr, P. H. S., and Hu, S.-M. Global contrast based salient region detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 37, 3 (2015), 569–582.

  11. Choi, J., Jung, C., Lee, J., and Kim, C. Determining the existence of objects in an image and its application to image thumbnailing. Signal Processing Letters 21, 8 (2014), 957–961.

  12. Chua, T.-S., Tang, J., Hong, R., Li, H., Luo, Z., and Zheng, Y. NUS-WIDE: A real-world web image database from National University of Singapore. In ACM International Conference on Image and Video Retrieval (2009).

  13. Clements, D. H. Subitizing: What is it? why teach it? Teaching children mathematics 5 (1999), 400–405.

  14. Davis, H., and Pérusse, R. Numerical competence in animals: Definitional issues, current evidence, and a new research agenda. Behavioral and Brain Sciences 11, 04 (1988), 561–579.

  15. Dehaene, S. The number sense: How the mind creates mathematics. Oxford University Press, 2011.

  16. Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., and Zisserman, A. The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results. http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html.

  17. Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and Ramanan, D. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 32, 9 (2010), 1627–1645.

  18. Girshick, R., Donahue, J., Darrell, T., and Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014).

  19. Gross, H. J. The magical number four: A biological, historical and mythological enigma. Communicative & integrative biology 5, 1 (2012), 1–2.

  20. Gross, H. J., Pahl, M., Si, A., Zhu, H., Tautz, J., and Zhang, S. Number-based visual generalisation in the honeybee. PLoS One 4, 1 (2009), e4263.

  21. Gurari, D., and Grauman, K. Visual question: Predicting if a crowd will agree on the answer. arXiv preprint arXiv:1608.08188 (2016).

  22. Heo, J.-P., Lin, Z., and Yoon, S.-E. Distance encoded product quantization. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014).

  23. Jaderberg, M., Simonyan, K., Vedaldi, A., and Zisserman, A. Synthetic data and artificial neural networks for natural scene text recognition. In Advances in Neural Information Processing Systems (NIPS) Workshop (2014).

  24. Jansen, B. R., Hofman, A. D., Straatemeier, M., Bers, B. M., Raijmakers, M. E., and Maas, H. L. The role of pattern recognition in children’s exact enumeration of small numbers. British Journal of Developmental Psychology 32, 2 (2014), 178–194.

  25. Jevons, W. S. The power of numerical discrimination. Nature 3, 67 (1871), 281–282.

  26. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. Caffe: Convolutional architecture for fast feature embedding. In ACM International Conference on Multimedia (2014).

  27. Kaufman, E., Lord, M., Reese, T., and Volkmann, J. The discrimination of visual number. The American Journal of Psychology (1949), 498–525.

  28. Kazemzadeh, S., Ordonez, V., Matten, M., and Berg, T. L. Referitgame: Referring to objects in photographs of natural scenes. In Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014).

  29. Krizhevsky, A., Sutskever, I., and Hinton, G. E. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (NIPS) (2012).

  30. Lee, Y. J., Ghosh, J., and Grauman, K. Discovering important people and objects for egocentric video summarization. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012).

  31. Lempitsky, V., and Zisserman, A. Learning to count objects in images. In Advances in Neural Information Processing Systems (NIPS) (2010).

  32. Li, X., Uricchio, T., Ballan, L., Bertini, M., Snoek, C. G. M., and Bimbo, A. D. Socializing the semantic gap: A comparative survey on image tag assignment, refinement, and retrieval. ACM Computing Surveys 49, 1 (June 2016), 14:1–14:39.

  33. Li, Y., Hou, X., Koch, C., Rehg, J. M., and Yuille, A. L. The secrets of salient object segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014).

  34. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. Microsoft COCO: Common objects in context. In European Conference on Computer Vision (ECCV) (2014).

  35. Liu, T., Yuan, Z., Sun, J., Wang, J., Zheng, N., Tang, X., and Shum, H.-Y. Learning to detect a salient object. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 33, 2 (2011), 353–367.

  36. Mandler, G., and Shebo, B. J. Subitizing: an analysis of its component processes. Journal of Experimental Psychology: General 111, 1 (1982), 1.

  37. Nath, S. K., Palaniappan, K., and Bunyak, F. Cell segmentation using coupled level sets and graph-vertex coloring. In Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2006).

  38. Pahl, M., Si, A., and Zhang, S. Numerical cognition in bees and other insects. Frontiers in psychology 4 (2013).

  39. Peng, X., Sun, B., Ali, K., and Saenko, K. Learning deep object detectors from 3d models. In IEEE International Conference on Computer Vision (ICCV) (2015).

  40. Piazza, M., and Dehaene, S. From number neurons to mental arithmetic: The cognitive neuroscience of number sense. The cognitive neurosciences, 3rd edition (2004), 865–77.

  41. Pinheiro, P. O., Lin, T.-Y., Collobert, R., and Dollár, P. Learning to refine object segments. In European Conference on Computer Vision (ECCV) (2016).

  42. Pont-Tuset, J., Arbelaez, P., Barron, J. T., Marques, F., and Malik, J. Multiscale combinatorial grouping for image segmentation and object proposal generation. IEEE transactions on pattern analysis and machine intelligence 39, 1 (2017), 128–140.

  43. Razavian, A. S., Azizpour, H., Sullivan, J., and Carlsson, S. CNN features off-the-shelf: an astounding baseline for recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), DeepVision Workshop (2014).

  44. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. Imagenet large scale visual recognition challenge, 2014.

  45. Scharfenberger, C., Waslander, S. L., Zelek, J. S., and Clausi, D. A. Existence detection of objects in images for robot vision using saliency histogram features. In International Conference on Computer and Robot Vision (2013).

  46. Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., and LeCun, Y. Overfeat: Integrated recognition, localization and detection using convolutional networks. In International Conference on Learning Representations (ICLR) (2014).

  47. Shin, D., He, S., Lee, G. M., Whinston, A. B., Cetintas, S., and Lee, K.-C. Content complexity, similarity, and consistency in social media: A deep learning approach. https://ssrn.com/abstract=2830377, 2016.

  48. Simonyan, K., and Zisserman, A. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR) (2015).

  49. Stark, M., Goesele, M., and Schiele, B. Back to the future: Learning shape models from 3D CAD data. In British Machine Vision Conference (BMVC) (2010).

  50. Stoianov, I., and Zorzi, M. Emergence of a visual number sense in hierarchical generative models. Nature neuroscience 15, 2 (2012), 194–196.

  51. Subburaman, V. B., Descamps, A., and Carincotte, C. Counting people in the crowd using a generic head detector. In IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS) (2012).

  52. Sun, B., and Saenko, K. From virtual to reality: Fast adaptation of virtual object detectors to real domains. In British Machine Vision Conference (BMVC) (2014).

  53. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. Going deeper with convolutions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015).

  54. Torralba, A., Murphy, K. P., Freeman, W. T., and Rubin, M. A. Context-based vision system for place and object recognition. In IEEE International Conference on Computer Vision (ICCV) (2003).

  55. Trick, L. M., and Pylyshyn, Z. W. Why are small and large numbers enumerated differently? A limited-capacity preattentive stage in vision. Psychological review 101, 1 (1994), 80.

  56. Vedaldi, A., and Fulkerson, B. VLFeat: An open and portable library of computer vision algorithms. http://www.vlfeat.org/, 2008.

  57. Vuilleumier, P. O., and Rafal, R. D. A systematic study of visual extinction between- and within-field deficits of attention in hemispatial neglect. Brain 123, 6 (2000), 1263–1279.

  58. Wang, P., Wang, J., Zeng, G., Feng, J., Zha, H., and Li, S. Salient object detection for searched web images via global saliency. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012).

  59. Xiao, J., Hays, J., Ehinger, K. A., Oliva, A., and Torralba, A. Sun database: Large-scale scene recognition from abbey to zoo. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010).

  60. Xiong, B., and Grauman, K. Detecting snap points in egocentric video with a web photo prior. In European Conference on Computer Vision (ECCV) (2014).

  61. Xu, K., Ba, J., Kiros, R., Courville, A., Salakhutdinov, R., Zemel, R., and Bengio, Y. Show, attend and tell: Neural image caption generation with visual attention. arXiv preprint arXiv:1502.03044 (2015).

  62. Zhang, J., Ma, S., Sameki, M., Sclaroff, S., Betke, M., Lin, Z., Shen, X., Price, B., and Měch, R. Salient object subitizing. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015).

  63. Zhang, J., Sclaroff, S., Lin, Z., Shen, X., Price, B., and Měch, R. Unconstrained salient object detection via proposal subset optimization. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016).

  64. Zhao, R., Ouyang, W., Li, H., and Wang, X. Saliency detection by multi-context deep learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015).

  65. Zou, W. Y., and McClelland, J. L. Progressive development of the number sense in a deep neural network. In Annual Conference of the Cognitive Science Society (CogSci) (2013).

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zhang, J., Malmberg, F., Sclaroff, S. (2019). Salient Object Subitizing. In: Visual Saliency: From Pixel-Level to Object-Level Analysis. Springer, Cham. https://doi.org/10.1007/978-3-030-04831-0_5

  • DOI: https://doi.org/10.1007/978-3-030-04831-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04830-3

  • Online ISBN: 978-3-030-04831-0

  • eBook Packages: Computer Science (R0)
