Salient Object Subitizing

  • Jianming Zhang
  • Filip Malmberg
  • Stan Sclaroff
Chapter

Abstract

As early as the nineteenth century, it was observed that humans can effortlessly identify the number of items in the range of 1–4 at a glance.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jianming Zhang (1)
  • Filip Malmberg (2)
  • Stan Sclaroff (3)

  1. Adobe Inc., San Jose, USA
  2. Centre for Image Analysis, Uppsala University, Uppsala, Sweden
  3. Department of Computer Science, Boston University, Boston, USA
