Detecting objects in natural scenes can be a very challenging task. In several real-life scenarios the visible spectrum is not ideal for typical computer vision tasks. Going beyond the visible range, into the near-infrared or thermal spectrum, allows us to capture many unique properties of objects that are not normally captured by a conventional camera. In this work we propose two multi-spectral datasets covering three spectra: visible, near-infrared and thermal. The first is a single-object dataset of common desk objects from 25 different categories, made of various materials. The second dataset comprises all possible combinations of these 25 objects taken two at a time. The objects are captured from 8 different angles using three different cameras. The images are registered and cropped, and are provided along with classification and localization ground truths. Additionally, classification benchmarks are provided on both datasets using the ResNet, InceptionNet and DenseNet architectures. The dataset will be publicly available at https://github.com/DVLP-CMATERJU/JU-VNT.
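The combinatorial structure described above can be made concrete with a short sketch. The category names below are placeholders (the abstract does not list them), and the view counts assume exactly one capture per object or pair, per angle, per spectrum, which the abstract does not state; the actual image counts in the released dataset may differ.

```python
from itertools import combinations, product

NUM_CATEGORIES = 25                                  # from the abstract
ANGLES = range(8)                                    # 8 viewing angles, from the abstract
SPECTRA = ("visible", "near_infrared", "thermal")    # 3 spectra, from the abstract

# Placeholder category names; the real labels are defined by the dataset itself.
categories = [f"object_{i:02d}" for i in range(NUM_CATEGORIES)]

# Second dataset: every unordered pair of distinct objects, C(25, 2).
pairs = list(combinations(categories, 2))

# Hypothetical capture grids, assuming one image per (item, angle, spectrum).
single_views = list(product(categories, ANGLES, SPECTRA))
pair_views = list(product(pairs, ANGLES, SPECTRA))

print(len(pairs))          # number of object pairs
print(len(single_views))   # single-object views under the stated assumption
print(len(pair_views))     # paired-object views under the stated assumption
```

Under these assumptions there are 300 object pairs; the per-view totals are an upper-level estimate only and should be checked against the dataset itself.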
We would like to thank the entire team of people that made collection of this dataset possible.
- Image Capturing : Subham Jana, Puspendu Khan
- Data Preprocessing : Tathagata Bandyopadhyay, Anwesha Sen, Soupik Chowdhury
- Data Annotation : Osman Goni, Priyam Sarkar, Rounak Dutta, Supriyo Das, Debarati Roy, Ujjwal Misra
- Benchmarks : Somenath Kuiry, Bodhisatwa Mondal
This work is supported by the project sponsored by SERB (Government of India, order no. SB/S3/EECE/054/2016) (dated 25/11/2016), and carried out at the Centre for Microprocessor Application for Training Education and Research, CSE Department, Jadavpur University. Special thanks to Somenath Kuiry for his contribution during revision.
About this article
Cite this article
Ghosh, S., Das, N., Sarkar, P. et al. JU-VNT: a multi-spectral dataset of indoor object recognition using visible, near-infrared and thermal spectrum. Multimed Tools Appl (2021). https://doi.org/10.1007/s11042-020-10302-z
- Multispectral image processing
- Thermal image
- Near infrared image
- Deep learning