Parcel Tracking by Detection in Large Camera Networks

  • Sascha Clausen
  • Claudius Zelenka
  • Tobias Schwede
  • Reinhard Koch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)

Abstract

Inside parcel distribution hubs, several tens of the up to 100 000 parcels processed each day get lost. Human operators have to tediously recover these parcels by searching through large amounts of video footage from the installed large-scale camera network. We want to assist these operators and work towards an automatic solution. The challenge lies both in the size of the hub, with its high number of cameras, and in the adverse conditions. We describe and evaluate an industry-scale tracking framework based on state-of-the-art methods such as Mask R-CNN. Moreover, we adapt a Siamese-network-inspired feature vector matching with a novel feature improver network, which increases tracking performance. Our calibration method exploits a calibration parcel and is suitable for both overlapping and non-overlapping camera views. It requires little manual effort and needs only a single drive-by of the calibration parcel for each conveyor belt. With these methods, most parcels can be tracked start-to-end.
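
To illustrate the general idea of associating detections with existing tracks by comparing feature vectors, the following is a minimal sketch and not the framework described above. It assumes that each parcel is already represented by a fixed-length feature vector, uses cosine distance, and pairs tracks with detections greedily. The function names, the `max_distance` threshold, and the example vectors are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def match_detections(track_features, detection_features, max_distance=0.3):
    """Greedily pair tracks with detections in order of ascending distance.

    track_features: dict mapping a track id to its feature vector.
    detection_features: list of feature vectors from the current frame.
    max_distance: hypothetical gating threshold; pairs above it are ignored.
    Returns (matches, unmatched), where matches maps track id to detection
    index and unmatched lists the detection indices left without a track.
    """
    candidates = []
    for track_id, track_feat in track_features.items():
        for det_idx, det_feat in enumerate(detection_features):
            dist = cosine_distance(track_feat, det_feat)
            if dist <= max_distance:
                candidates.append((dist, track_id, det_idx))
    candidates.sort(key=lambda c: c[0])

    matches = {}
    used = set()
    for _, track_id, det_idx in candidates:
        if track_id in matches or det_idx in used:
            continue
        matches[track_id] = det_idx
        used.add(det_idx)

    unmatched = [i for i in range(len(detection_features)) if i not in used]
    return matches, unmatched


# Illustrative data only: two tracks and three detections.
tracks = {"parcel_a": [1.0, 0.0], "parcel_b": [0.0, 1.0]}
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches, unmatched = match_detections(tracks, detections)
```

Unmatched detections could start new tracks in such a scheme. Greedy pairing is chosen here only for brevity; an optimal assignment solver could be substituted without changing the interface.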

Keywords

Multi-object tracking, Tracking by detection, Instance segmentation, Camera network calibration

Notes

Acknowledgments

This work was supported by the Central Innovation Programme for SMEs of the Federal Ministry for Economic Affairs and Energy of Germany under grant agreement number 16KN044302.

Supplementary material

Supplementary material 1 (mp4 6877 KB)

Supplementary material 2 (mp4 11745 KB)

Supplementary material 3 (txt 1 KB)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science, Kiel University, Kiel, Germany