Abstract
Detecting objects in natural scenes can be a very challenging task. In several real-life scenarios the visible spectrum is not ideal for typical computer vision tasks. Going beyond the range of visible light, into the near-infrared or thermal spectrum, allows us to capture many unique properties of objects that are normally not captured with a standard camera. In this work we propose two multi-spectral datasets spanning three different spectra, namely the visible, near-infrared and thermal spectrum. The first is a single-object dataset containing common desk objects of 25 different categories made of various materials. The second comprises all possible combinations of these 25 objects taken a pair at a time. The objects are captured from 8 different angles using the three cameras. The images are registered, cropped and provided along with classification and localization ground truths. Additionally, classification benchmarks are provided on both datasets using the ResNet, InceptionNet and DenseNet architectures. The datasets will be publicly available at https://github.com/DVLP-CMATERJU/JU-VNT.
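As a rough illustration of how the reported classification benchmarks could be reproduced once the data is released, the sketch below fine-tunes a torchvision ResNet on one spectrum of the single-object subset. This is only a minimal sketch under stated assumptions, not the authors' released code: the directory layout, file paths and hyperparameters are hypothetical.

```python
# Minimal benchmark sketch (not the authors' code): fine-tune a ResNet-18 on one spectrum
# of the single-object subset, assuming an ImageFolder-style layout such as
# JU-VNT/single/visible/train/<class_name>/*.png -- the paths below are hypothetical.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

NUM_CLASSES = 25   # single-object subset: 25 desk-object categories
BATCH_SIZE = 32

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical path; replace with the actual dataset location once released.
train_set = datasets.ImageFolder("JU-VNT/single/visible/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

# Standard ImageNet-pretrained ResNet-18 with the final layer resized to 25 classes.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One training pass over the data; real benchmarks would run multiple epochs
# and evaluate on a held-out split.
model.train()
for images, labels in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

The same loop applies to the pair-object subset or another spectrum by pointing the ImageFolder at the corresponding directory and adjusting the number of output classes.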
Acknowledgment
We would like to thank the entire team who made the collection of this dataset possible.
- Image Capturing: Subham Jana, Puspendu Khan
- Data Preprocessing: Tathagata Bandyopadhyay, Anwesha Sen, Soupik Chowdhury
- Data Annotation: Osman Goni, Priyam Sarkar, Rounak Dutta, Supriyo Das, Debarati Roy, Ujjwal Misra
- Benchmarks: Somenath Kuiry, Bodhisatwa Mondal
This work was supported by a project sponsored by SERB (Government of India, order no. SB/S3/EECE/054/2016, dated 25/11/2016) and carried out at the Centre for Microprocessor Application for Training Education and Research, CSE Department, Jadavpur University. Special thanks to Somenath Kuiry for his contributions during the revision.
Cite this article
Ghosh, S., Das, N., Sarkar, P. et al. JU-VNT: a multi-spectral dataset of indoor object recognition using visible, near-infrared and thermal spectrum. Multimed Tools Appl (2021). https://doi.org/10.1007/s11042-020-10302-z
Keywords
- Multispectral image processing
- Thermal image
- Near infrared image
- Deep learning