Abstract
Object localization is a key step in computer vision for solving object detection or classification problems. Typically, this task is performed on expensive GPU devices, but edge computing is gaining importance in real-time applications. In this work, we propose a real-time implementation of unsupervised object localization on a low-power device for airport video surveillance. We automatically find object regions in video using a region proposal network (RPN) together with an optical flow region proposal (OFRP) based on optical flow maps between frames. In addition, we study the deployment of our solution on an embedded architecture, i.e. a Jetson AGX Xavier, using the CPU, GPU and specific hardware accelerators simultaneously. Three different data representations (FP32, FP16 and INT8) are also employed for the RPN. The results show that these optimizations reduce energy consumption by up to 4.1\(\times\) and execution time by up to 2.2\(\times\) while maintaining good accuracy with respect to the baseline model.
Supported by the Junta de Andalucía of Spain (P18-FR-3130 and UMA20-FEDERJA-059), the Ministry of Education of Spain (PID2019-105396RB-I00) and the University of Málaga (B1-2022_04).
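As an illustration of the idea behind an optical flow region proposal, the sketch below turns a dense flow map into candidate bounding boxes. The abstract only states that the OFRP is based on optical flow maps between frames; the magnitude threshold, the 4-connected grouping and the names `propose_regions`, `threshold` and `min_area` are assumptions made for this example and are not taken from the paper.

```python
from collections import deque


def propose_regions(flow, threshold=1.0, min_area=2):
    """Return boxes (x_min, y_min, x_max, y_max) around moving pixel groups.

    flow is a 2-D grid (list of rows) of (dx, dy) displacement vectors.
    threshold and min_area are hypothetical tuning parameters.
    """
    height, width = len(flow), len(flow[0])
    # Binary motion mask: a pixel counts as moving if its flow magnitude
    # exceeds the threshold.
    moving = [
        [(dx * dx + dy * dy) ** 0.5 > threshold for dx, dy in row]
        for row in flow
    ]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not moving[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over 4-connected moving pixels,
            # tracking the extent of the group as it grows.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                neighbours = ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1))
                for nx, ny in neighbours:
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and moving[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Discard tiny groups, which are more likely noise than objects.
            if area >= min_area:
                boxes.append((x0, y0, x1, y1))
    return boxes


# Synthetic 6x5 flow map: one 3x2 moving patch plus one isolated noisy pixel.
still = (0.0, 0.0)
flow = [[still] * 6 for _ in range(5)]
for row in (1, 2):
    for col in (2, 3, 4):
        flow[row][col] = (3.0, 0.5)
flow[4][0] = (2.0, 2.0)

print(propose_regions(flow))
```

In a real pipeline the flow map would come from a dense optical flow estimator and the resulting boxes would be combined with the RPN proposals; both steps are outside the scope of this sketch.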
Notes
- 3. NVIDIA Nsight Compute documentation can be consulted: https://developer.nvidia.com/nsight-compute.
- 4. NVIDIA TensorRT documentation can be consulted to find mapping incompatibilities: https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-841/developer-guide/index.html.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ruiz-Barroso, P., Castro, F.M., Guil, N. (2023). Real-Time Unsupervised Object Localization on the Edge for Airport Video Surveillance. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_37
DOI: https://doi.org/10.1007/978-3-031-36616-1_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36615-4
Online ISBN: 978-3-031-36616-1
eBook Packages: Computer Science, Computer Science (R0)