Scene understanding has recently become an active research topic because of its practical use in perceiving, analyzing and recognizing dynamic scenes in applications such as GPS-based monitoring systems, drone targeting, autonomous driving and tourist guidance. The goal of scene understanding is to make machines see as humans do, that is, to accurately recognize the contents of a scene and the location being observed. This involves two tasks: (1) describing the whole environment and (2) describing what action is taking place in it. Because scenes are complex, recognizing multiple objects and the relations between them remains a challenging part of the research. In this paper, we propose a novel approach to scene understanding that integrates multiple-object detection/segmentation and scene labeling using geometric features, histogram of oriented gradients (HOG) and scale-invariant feature transform (SIFT) descriptors. The complete procedure of the proposed model comprises resizing and denoising of the dataset images, multiple-object segmentation and detection, feature extraction, and multiple-object recognition using a multi-layer kernel sliding perceptron. Scene recognition is then achieved using multi-class logistic regression. Two datasets, MSRC and UIUC Sports, are used for the experimental evaluation of the proposed method. The proposed method accurately handles physical exclusion of complex objects and object occlusion, and it outperforms other state-of-the-art approaches in terms of accuracy.
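As a rough illustration of the final stage named in the abstract, multi-class logistic regression over per-image feature vectors can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the feature vectors are synthetic stand-ins rather than real geometric/HOG/SIFT descriptors, and the function names, learning rate and epoch count are hypothetical choices made for the example.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax_regression(X, y, n_classes, lr=0.1, epochs=500):
    """Fit multi-class logistic regression by batch gradient descent.

    lr and epochs are illustrative values, not taken from the paper.
    """
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]            # one-hot encoded labels
    for _ in range(epochs):
        P = softmax(X @ W + b)          # predicted class probabilities
        G = (P - Y) / n                 # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b

def predict(X, W, b):
    """Return the most probable class index for each row of X."""
    return np.argmax(X @ W + b, axis=1)

# Synthetic stand-in for scene-level feature vectors (assumption:
# three scene classes, 8-dimensional features, 50 images per class).
rng = np.random.default_rng(0)
n_classes, per_class, dim = 3, 50, 8
centers = rng.normal(scale=4.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(size=(per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), per_class)

# Standardise each feature so one fixed learning rate behaves well.
X = (X - X.mean(axis=0)) / X.std(axis=0)

W, b = fit_softmax_regression(X, y, n_classes)
accuracy = float((predict(X, W, b) == y).mean())
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

In a real pipeline the rows of `X` would be the descriptors extracted from each image, and accuracy would be measured on held-out images rather than on the training set.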
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2018R1D1A1A02085645). Also, this work was supported by the Korea Medical Device Development Fund grant funded by the Korea government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare, the Ministry of Food and Drug Safety) (Project Number: 202012D05-02).
Cite this article
Ahmed, A., Jalal, A. & Kim, K. Multi-objects Detection and Segmentation for Scene Understanding Based on Texton Forest and Kernel Sliding Perceptron. J. Electr. Eng. Technol. (2021). https://doi.org/10.1007/s42835-020-00650-z
- Logistic regression
- Multi-layer perceptron
- Neural network
- Scene understanding
- Texton forest segmentation