JU-VNT: a multi-spectral dataset of indoor object recognition using visible, near-infrared and thermal spectrum

Abstract

Detecting objects in natural scenes can be a very challenging task. In several real-life scenarios the visible spectrum is not ideal for typical computer vision tasks. Going beyond the visible range, into the near-infrared or thermal spectrum, allows us to capture many unique properties of objects that are normally not captured with a standard camera. In this work we propose two multi-spectral datasets covering three different spectra, namely the visible, near-infrared and thermal spectra. The first is a single-object dataset of common desk objects from 25 different categories, made of various materials. The second comprises all possible pairwise combinations of these 25 objects. The objects are captured from 8 different angles using the three different cameras. The images are registered, cropped, and provided along with classification and localization ground truths. Additionally, classification benchmarks are provided using the ResNet, InceptionNet and DenseNet architectures on both datasets. The dataset is publicly available from https://github.com/DVLP-CMATERJU/JU-VNT.
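As a rough sanity check on the scale implied by the abstract, the image counts can be derived directly from the stated design. This is a sketch under the assumption that every single object and every unordered pair is photographed from all 8 angles in all 3 spectra; the actual released file counts may differ.

```python
import math

CATEGORIES = 25   # single-object classes described in the abstract
ANGLES = 8        # capture angles per scene (45-degree steps)
SPECTRA = 3       # visible, near-infrared, thermal

# Single-object dataset: one object per scene.
single_images = CATEGORIES * ANGLES * SPECTRA

# Multi-object dataset: all unordered pairs of the 25 objects.
pairs = math.comb(CATEGORIES, 2)      # 25 choose 2 = 300 combinations
pair_images = pairs * ANGLES * SPECTRA

print(single_images)  # 600
print(pairs)          # 300
print(pair_images)    # 7200
```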




Acknowledgment

We would like to thank the entire team of people who made the collection of this dataset possible.

- Image Capturing: Subham Jana, Puspendu Khan

- Data Preprocessing: Tathagata Bandyopadhyay, Anwesha Sen, Soupik Chowdhury

- Data Annotation: Osman Goni, Priyam Sarkar, Rounak Dutta, Supriyo Das, Debarati Roy, Ujjwal Misra

- Benchmarks: Somenath Kuiry, Bodhisatwa Mondal

This work is supported by the project sponsored by SERB (Government of India, order no. SB/S3/EECE/054/2016, dated 25/11/2016), and was carried out at the Centre for Microprocessor Application for Training Education and Research, CSE Department, Jadavpur University. Special thanks to Somenath Kuiry for his contribution during revision.

Author information

Corresponding author

Correspondence to Nibaran Das.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Some more samples corresponding to the various angles at which images were captured are shown in Fig. 9, and samples from the 25 different object categories and 3 different spectra are demonstrated in Fig. 10.
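The capture protocol illustrated in Fig. 9 (eight views per scene at 45° intervals) can be sketched as follows; the assumption that the rotation starts at 0° is ours, for illustration only.

```python
# A minimal sketch of the capture geometry: eight views of each scene,
# covering one full rotation in 45-degree steps.
ANGLE_STEP = 45                # rotation interval in degrees
NUM_VIEWS = 360 // ANGLE_STEP  # = 8 capture angles per scene
angles = [i * ANGLE_STEP for i in range(NUM_VIEWS)]
print(angles)  # [0, 45, 90, 135, 180, 225, 270, 315]
```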

Fig. 9

Images of each object (single objects) or combination of objects (multiple objects) are captured from 8 different angles, rotated at intervals of 45°

Fig. 10

Samples from 25 object classes and 3 spectra

About this article

Cite this article

Ghosh, S., Das, N., Sarkar, P. et al. JU-VNT: a multi-spectral dataset of indoor object recognition using visible, near-infrared and thermal spectrum. Multimed Tools Appl (2021). https://doi.org/10.1007/s11042-020-10302-z

Keywords

  • Multispectral image processing
  • Thermal image
  • Near infrared image
  • Deep learning