Real-Time Unsupervised Object Localization on the Edge for Airport Video Surveillance

  • Conference paper
  • Pattern Recognition and Image Analysis (IbPRIA 2023)

Abstract

Object localization is a vital step in computer vision for solving object detection and classification problems. Typically, this task is performed on expensive GPU devices, but edge computing is gaining importance in real-time applications. In this work, we propose a real-time implementation of unsupervised object localization on a low-power device for airport video surveillance. We automatically find object regions in video using a region proposal network (RPN) together with an optical flow region proposal (OFRP) based on optical flow maps between frames. In addition, we study the deployment of our solution on an embedded architecture, a Jetson AGX Xavier, using the CPU, GPU and dedicated hardware accelerators simultaneously. Three data representations (FP32, FP16 and INT8) are employed for the RPN. The obtained results show that our optimizations improve energy consumption by up to 4.1× and execution time by up to 2.2× while maintaining good accuracy with respect to the baseline model.

Supported by the Junta de Andalucía of Spain (P18-FR-3130 and UMA20-FEDERJA-059), the Ministry of Education of Spain (PID2019-105396RB-I00) and the University of Málaga (B1-2022_04).
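The following is a minimal sketch of how an optical flow region proposal (OFRP) stage of the kind described in the abstract could be implemented with OpenCV's dense Farneback flow: moving pixels are thresholded by flow magnitude and grouped into bounding-box proposals via contour extraction. The function name, thresholds, morphology step and Farneback parameters are illustrative assumptions, not the authors' actual settings.

    # Sketch of an optical-flow region proposal (OFRP) stage using OpenCV.
    # Thresholds and Farneback parameters are placeholder values.
    import cv2
    import numpy as np

    def ofrp(prev_frame, curr_frame, mag_thresh=1.0, min_area=100):
        """Propose bounding boxes around moving regions between two frames."""
        prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
        curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)

        # Dense optical flow map between consecutive frames (Farneback).
        flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])

        # Binarize the motion magnitude and close small gaps in the mask.
        mask = (mag > mag_thresh).astype(np.uint8) * 255
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))

        # Border following turns motion blobs into box proposals.
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return [cv2.boundingRect(c) for c in contours
                if cv2.contourArea(c) >= min_area]  # list of (x, y, w, h)

In the full pipeline these proposals would be combined with the RPN output; that fusion step is not shown here.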


Notes

  1. http://www.airport.gdansk.pl/airport/kamery-internetowe.
  2. https://github.com/WongKinYiu/ScaledYOLOv4.
  3. NVIDIA Nsight Compute documentation can be consulted: https://developer.nvidia.com/nsight-compute.
  4. NVIDIA TensorRT documentation can be consulted to find mapping incompatibilities: https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-841/developer-guide/index.html.
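Since the paper evaluates FP32, FP16 and INT8 representations of the RPN on a Jetson AGX Xavier, the sketch below shows one way such engines could be built with the TensorRT 8.x Python API referenced in note 4. The ONNX file name and the calibrator are placeholders, not the authors' configuration.

    # Sketch: building FP16 / INT8 TensorRT engines from an ONNX export of the RPN.
    # "rpn.onnx" and the calibrator are placeholders, not the authors' setup.
    import tensorrt as trt

    def build_engine(onnx_path, precision="fp16", calibrator=None):
        logger = trt.Logger(trt.Logger.WARNING)
        builder = trt.Builder(logger)
        network = builder.create_network(
            1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        parser = trt.OnnxParser(network, logger)
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                raise RuntimeError(parser.get_error(0))

        config = builder.create_builder_config()
        if precision == "fp16":
            config.set_flag(trt.BuilderFlag.FP16)
        elif precision == "int8":
            config.set_flag(trt.BuilderFlag.INT8)
            config.int8_calibrator = calibrator  # calibration over sample frames
        # FP32 is the default precision and needs no flag.

        return builder.build_serialized_network(network, config)

    # Example: engine_bytes = build_engine("rpn.onnx", precision="int8",
    #                                      calibrator=my_calibrator)

Layers that TensorRT cannot execute at the requested precision fall back to higher-precision kernels; the documentation linked in note 4 covers these mapping incompatibilities.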


Author information

Correspondence to Paula Ruiz-Barroso.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ruiz-Barroso, P., Castro, F.M., Guil, N. (2023). Real-Time Unsupervised Object Localization on the Edge for Airport Video Surveillance. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_37

  • DOI: https://doi.org/10.1007/978-3-031-36616-1_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36615-4

  • Online ISBN: 978-3-031-36616-1

  • eBook Packages: Computer Science, Computer Science (R0)
