Class-Agnostic Counting

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11363)

Abstract

Nearly all existing counting methods are designed for a specific object class. Our work, however, aims to create a counting model able to count any class of object. To achieve this goal, we formulate counting as a matching problem, enabling us to exploit the image self-similarity property that naturally exists in object counting problems.
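To make the matching view concrete, here is a minimal sketch that counts repeated objects by comparing one exemplar patch against every location in an image. It is not the GMN: the learned network is swapped for classical zero-mean normalized cross-correlation, and the toy image, the `paste` helper, and the 0.9 threshold are all assumptions chosen for illustration.

```python
from math import sqrt


def ncc(patch, window):
    """Zero-mean normalized cross-correlation of two equal-sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in window for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:  # a flat window carries no pattern to match
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def similarity_map(image, exemplar):
    """Slide the exemplar over the image and score every placement."""
    h, w = len(exemplar), len(exemplar[0])
    rows, cols = len(image), len(image[0])
    return [[ncc(exemplar, [row[x:x + w] for row in image[y:y + h]])
             for x in range(cols - w + 1)]
            for y in range(rows - h + 1)]


def count_matches(sim, threshold=0.9):
    """Count strict local maxima (8-neighbourhood) at or above the threshold."""
    count = 0
    for y, row in enumerate(sim):
        for x, value in enumerate(row):
            if value < threshold:
                continue
            neighbours = [sim[j][i]
                          for j in range(max(0, y - 1), min(len(sim), y + 2))
                          for i in range(max(0, x - 1), min(len(row), x + 2))
                          if (j, i) != (y, x)]
            if all(value > n for n in neighbours):
                count += 1
    return count


def paste(image, patch, top, left, gain=1):
    """Hypothetical helper: write a (scaled) copy of the patch into the image."""
    for dy, row in enumerate(patch):
        for dx, v in enumerate(row):
            image[top + dy][left + dx] = gain * v


# Toy data (assumed): three copies of one blob, one of them brighter.
exemplar = [[0, 1, 0],
            [1, 2, 1],
            [0, 1, 0]]
image = [[0] * 12 for _ in range(12)]
paste(image, exemplar, 1, 1)
paste(image, exemplar, 1, 7, gain=3)
paste(image, exemplar, 7, 4)

count = count_matches(similarity_map(image, exemplar))
print(count)
```

On this toy image the sketch reports three matches, including the brighter copy, because normalized correlation ignores intensity scaling. A hand-crafted similarity like this breaks down under changes in appearance, scale, or viewpoint, which is the gap a learned matching function is meant to close.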

We make the following three contributions. First, we propose a Generic Matching Network (GMN) architecture that can potentially count any object in a class-agnostic manner. Second, by reformulating the counting problem as one of matching objects, we can take advantage of the abundance of video data labeled for tracking, which contains natural repetitions suitable for training a counting model; such data enables us to train the GMN. Third, to customize the GMN to different user requirements, an adapter module is used to specialize the model with minimal effort, i.e., using a few labeled examples and adapting only a small fraction of the trained parameters. This is a form of few-shot learning, which is practical for domains where labels are limited because annotation requires expert knowledge (e.g., microbiology).

We demonstrate the flexibility of our method on a diverse set of existing counting benchmarks: specifically cells, cars, and human crowds. The model achieves competitive performance on cell and crowd counting datasets, and surpasses the state of the art on the car dataset using only three training images. When training on the entire dataset, the proposed method outperforms all previous methods by a large margin.

Acknowledgements

Funding for this research is provided by the Oxford-Google DeepMind Graduate Scholarship, and by the EPSRC Programme Grant Seebibyte EP/M013774/1.

Corresponding author

Correspondence to Weidi Xie.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Lu, E., Xie, W., Zisserman, A. (2019). Class-Agnostic Counting. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_42

  • DOI: https://doi.org/10.1007/978-3-030-20893-6_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20892-9

  • Online ISBN: 978-3-030-20893-6

  • eBook Packages: Computer Science, Computer Science (R0)
