Refining Training Samples Using Median Absolute Deviation for Supervised Classification of Remote Sensing Images

  • Xunqiang Gong
  • Li Shen (corresponding author)
  • Tieding Lu
Research Article


Supervised image classification refers to the task of extracting information classes from a multi-band remote sensing image. The selection of training samples is critical and directly influences supervised classification accuracy. However, some impure training samples may be selected because of human error or limited labeling conditions, which reduces classification accuracy. To address this issue, the median absolute deviation (MAD) is adopted to refine training samples. The full and refined training samples are compared using the same classifier, i.e., maximum likelihood classification (MLC) or support vector machine (SVM), in two sets of experiments. The experimental results show that the overall accuracy and the kappa coefficient obtained with the refined training samples significantly exceed those obtained with the full training samples for the same classifier (MLC or SVM). This indicates that refining training samples with the MAD can effectively eliminate the influence of impure training samples, so that more reliable and accurate results can be obtained.
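As a rough illustration of the refinement idea, the following minimal sketch filters training samples with a MAD-based rule. It is not the authors' implementation. Several details are assumptions made only for this example: the rule is applied separately to each class and each band, the MAD is scaled by 1.4826 (the usual consistency constant for normally distributed data), a sample is kept only if it lies within 3 scaled MADs of the class median in every band, and the names `within_mad` and `refine_samples` are hypothetical.

```python
from statistics import median

# Assumption: 1.4826 scales the MAD so that it estimates the standard
# deviation for normally distributed data.
MAD_SCALE = 1.4826


def within_mad(values, threshold=3.0):
    """Return one boolean per value: True if it lies within
    `threshold` scaled MADs of the median of `values`."""
    med = median(values)
    mad = MAD_SCALE * median([abs(v - med) for v in values])
    if mad == 0:
        # At least half the values equal the median, so the MAD gives no
        # usable scale; this sketch keeps every value in that case.
        return [True] * len(values)
    return [abs(v - med) / mad <= threshold for v in values]


def refine_samples(samples, threshold=3.0):
    """Drop samples flagged as outliers in any band of their own class.

    `samples` is a list of (label, band_values) pairs, where band_values
    is a list with one number per spectral band.
    """
    # Group sample indices by class label.
    by_class = {}
    for idx, (label, _) in enumerate(samples):
        by_class.setdefault(label, []).append(idx)

    keep = set()
    for indices in by_class.values():
        n_bands = len(samples[indices[0]][1])
        ok = [True] * len(indices)
        for band in range(n_bands):
            band_values = [samples[i][1][band] for i in indices]
            mask = within_mad(band_values, threshold)
            # A sample survives only if it passes the test in every band.
            ok = [a and m for a, m in zip(ok, mask)]
        keep.update(i for i, flag in zip(indices, ok) if flag)

    # Preserve the original sample order.
    return [s for i, s in enumerate(samples) if i in keep]


# Hypothetical two-band training set; the last "water" sample is impure.
samples = [
    ("water", [10, 12]), ("water", [11, 13]), ("water", [12, 12]),
    ("water", [11, 11]), ("water", [60, 70]),
    ("veg", [50, 80]), ("veg", [52, 82]), ("veg", [51, 81]), ("veg", [49, 79]),
]
refined = refine_samples(samples)
```

In this sketch the refined set would then be passed to the chosen classifier (MLC or SVM) in place of the full set. The median and the MAD are used because, unlike the mean and standard deviation, they are not pulled toward the impure samples that the rule is trying to detect.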


Supervised image classification; Refining training samples; Outlier detection; Median absolute deviation



This work was jointly supported by the National Key Research and Development Plan of China [Grant 2016YFB0501403; Grant 2016YFB0501405], the Doctoral Scientific Research Foundation of East China University of Technology [DHBK2017158] and the Key Laboratory for Digital Land and Resources of Jiangxi Province [DLLJ201808]. The first author was financially supported by the China Scholarship Council (CSC) for his study at the Technical University of Berlin, Germany, with Prof. Frank Neitzel.



Copyright information

© Indian Society of Remote Sensing 2018

Authors and Affiliations

  1. Faculty of Geomatics, East China University of Technology, Nanchang, People’s Republic of China
  2. State-Province Joint Engineering Laboratory of Spatial Information Technology for High-Speed Railway Safety, Southwest Jiaotong University, Chengdu, People’s Republic of China
