Abstract
Poor data quality degrades the performance of classification approaches. Classification is an important part of research, and data often contain missing values, which undermines the accuracy of critical decisions. Selecting the best approach for handling missing data can be difficult, as several conditions must be considered. Remote sensing as a discipline is particularly susceptible to missing data. The objective of this study is to evaluate the robustness and accuracy of four classifiers when dealing with incomplete remote sensing data. Two remote sensing data sets are used for this task, and a four-way repeated-measures design is employed to analyse the results. Simulation results suggest that k-nearest neighbour is a superior approach to handling missing data, especially when regression imputation is used. Most classifiers achieve lower accuracy when listwise deletion is used. Random forest (RF), however, is much less robust to missing data than other classifiers such as artificial neural networks (ANN) and support vector machines (SVM).
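The two missing-data strategies named in the abstract, listwise deletion and regression imputation, can be illustrated with a minimal sketch feeding a k-nearest-neighbour classifier. Everything below is an assumption made for illustration and not the authors' experimental setup: the data are synthetic two-band "pixels", values are removed completely at random from the second band only, the imputation is a simple least-squares regression of band 2 on band 1, and all function names are hypothetical.

```python
import random
from collections import Counter

def make_data(n, rng):
    """Synthetic two-band pixels (assumption): band2 is roughly linear in band1."""
    rows = []
    for _ in range(n):
        label = rng.choice([0, 1])
        band1 = rng.gauss(2.0 if label else 0.0, 1.0)
        band2 = 2.0 * band1 + rng.gauss(0.0, 0.5)
        rows.append(([band1, band2], label))
    return rows

def inject_missing(rows, rate, rng):
    """Replace band2 with None completely at random, with probability `rate`."""
    out = []
    for x, y in rows:
        x = list(x)
        if rng.random() < rate:
            x[1] = None
        out.append((x, y))
    return out

def listwise_deletion(rows):
    """Drop every instance that has at least one missing value."""
    return [(x, y) for x, y in rows if None not in x]

def regression_imputation(rows):
    """Fill missing band2 values from a least-squares fit of band2 on band1."""
    complete = [x for x, _ in rows if x[1] is not None]
    n = len(complete)
    mean1 = sum(x[0] for x in complete) / n
    mean2 = sum(x[1] for x in complete) / n
    sxy = sum((x[0] - mean1) * (x[1] - mean2) for x in complete)
    sxx = sum((x[0] - mean1) ** 2 for x in complete)
    slope = sxy / sxx
    intercept = mean2 - slope * mean1
    return [
        ([x[0], x[1] if x[1] is not None else intercept + slope * x[0]], y)
        for x, y in rows
    ]

def knn_predict(train, x, k=5):
    """Majority vote among the k training instances closest to x (Euclidean)."""
    nearest = sorted(
        train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x))
    )[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def accuracy(train, test, k=5):
    """Fraction of test instances whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_full = make_data(200, rng)
test = make_data(100, rng)
train_missing = inject_missing(train_full, 0.3, rng)

train_deleted = listwise_deletion(train_missing)
train_imputed = regression_imputation(train_missing)

acc_deleted = accuracy(train_deleted, test)
acc_imputed = accuracy(train_imputed, test)
print(f"listwise deletion:     {len(train_deleted)} training rows, accuracy {acc_deleted:.3f}")
print(f"regression imputation: {len(train_imputed)} training rows, accuracy {acc_imputed:.3f}")
```

Listwise deletion shrinks the training set, while regression imputation keeps every instance at the cost of replacing missing values with model estimates. On toy data like this the accuracy gap may be small or even reversed; the sketch only shows the mechanics, not the study's findings.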
Acknowledgements
This work was funded by the Institute for Intelligent Systems at the University of Johannesburg, South Africa. The authors would like to thank Ahmed Ali and the anonymous reviewers for their useful comments, and the UCI Machine Learning Repository for making the data sets available.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nkonyana, T., Twala, B. (2018). Impact of Poor Data Quality in Remotely Sensed Data. In: Dash, S., Naidu, P., Bayindir, R., Das, S. (eds) Artificial Intelligence and Evolutionary Computations in Engineering Systems. Advances in Intelligent Systems and Computing, vol 668. Springer, Singapore. https://doi.org/10.1007/978-981-10-7868-2_8
DOI: https://doi.org/10.1007/978-981-10-7868-2_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7867-5
Online ISBN: 978-981-10-7868-2
eBook Packages: Engineering, Engineering (R0)