Algorithmic Approach to the Identification of Classification Rules or Separation Surface for Spatial Data

  • Yee Leung
Part of the Advances in Spatial Science book series (ADVSPATIAL)


As discussed in Chap. 3, naïve Bayes, LDA, logistic regression, and the support vector machine are statistical or statistics-related models developed for the classification of data. Breaking away from the statistical tradition are a number of classifiers that are algorithmic in nature. Instead of assuming a data model, which is essential to the conventional statistical methods, these algorithmic classifiers attempt to work directly on the data without making any assumptions about them. This approach has been regarded by many, particularly in the pattern recognition and artificial intelligence communities, as a more flexible way to discover how data should be classified. Decision trees (or classification trees in the context of classification), neural networks, genetic algorithms, fuzzy sets, and rough sets are typical paradigms. Instead of searching for a separation surface, as the statistical classifiers do, some of these methods attempt to discover classification rules that can appropriately partition the feature space with reference to pre-specified classes.

A decision tree is a segmentation of a training data set (Quinlan 1986; Friedman 1977). It is built by first treating all objects as a single group, which forms the root (top node) of the tree. Training examples are then passed down the tree by splitting each intermediate node with respect to a variable. Construction stops when a certain stopping criterion is met. Each leaf (terminal) node of the tree contains a decision label, e.g., a class label. The decision tree thus partitions the feature space into sub-spaces corresponding to the leaves. Specifically, a decision tree that handles classification is known as a classification tree, and a decision tree that solves regression problems is called a regression tree (Breiman et al. 1984). A decision tree that deals with both classification and regression problems is referred to as a classification and regression tree (Breiman et al. 1984).

Decision tree algorithms differ mainly in their splitting and pruning strategies. They usually aim at the optimal partitioning of the feature space by minimizing the generalization error. The advantages of the decision tree approach are that it does not need any assumptions about the underlying distribution of the data, and that it can handle both discrete and continuous variables. Furthermore, decision trees are easy to construct and interpret if they are of reasonable size and complexity. Their disadvantages are that the splitting and pruning rules can be rather subjective, and that the underlying theory is less rigorous than that of the statistical tradition. They also suffer from combinatorial explosion if the number of variables and their value labels are not appropriately controlled. Typical decision tree methods are ID3 (Quinlan 1986), C4.5 (Quinlan 1993), CART (Breiman et al. 1984), CHAID (Kass 1980), QUEST and newer versions, and FACT (Loh and Vanichsetakul 1988).
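
The tree-growing procedure described above (all training objects start at the root, each intermediate node is split on one variable, growth stops on a criterion, and each leaf carries a class label) can be illustrated with a minimal Python sketch. The text does not fix a splitting rule, so the use of Gini impurity here is an assumption borrowed from CART-style trees. The function names, the depth-based stopping criterion, and the toy two-band "land cover" data are all hypothetical, and no pruning is performed.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature index, threshold) that most reduces the weighted
    impurity, or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must send objects to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree recursively; a leaf stores the majority class label.
    Assumed stopping criterion: max depth reached or no improving split."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }

def predict(tree, row):
    """Pass one object down the tree until a leaf (class label) is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical training set: two continuous variables per object.
rows = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.3), (0.9, 0.2), (0.7, 0.9), (0.8, 0.8)]
labels = ["water", "water", "forest", "forest", "urban", "urban"]
tree = build_tree(rows, labels)
```

Each internal node of the resulting dictionary corresponds to a rule of the form "variable j ≤ threshold", so reading a path from the root to a leaf yields one classification rule, and the leaves together partition the feature space into axis-parallel sub-spaces. The methods listed above differ from this sketch chiefly in the splitting measure they use and in how they prune the grown tree.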


  1. Ahlqvist O (2005) Using uncertain conceptual spaces to translate between land cover categories. Int J Geogr Inform Sci 19:831–857
  2. Ahlqvist O, Keukelaar J, Oukbir K (2000) Rough classification and accuracy assessment. Int J Geogr Inform Sci 14:475–496
  3. Aldridge CH (1998) A theory of empirical spatial knowledge supporting rough set based knowledge discovery in geographical databases. Ph.D. thesis, University of Otago, Dunedin, New Zealand
  4. Amari S (1995) Information geometry of the EM and em algorithms for neural networks. Neural Networks 8(9):1379–1409
  5. Arbib MA (ed) (1995) The handbook of brain theory and neural networks. MIT, Cambridge
  6. Atkinson PM, Curran PJ (1997) Choosing an appropriate spatial resolution for remote sensing investigations. Photogramm Eng Rem Sens 63(12):1345–1351
  7. Atkinson PM, Tatnall ARL (1997) Neural networks in remote sensing. Int J Remote Sens 18(4):699–709
  8. Benediktsson JA, Swain PH, Ersoy OK (1990) Neural network approaches versus statistical methods in classification of multi-source remote sensing data. IEEE Trans Geosci Rem Sens 28(4):540–552
  9. Bischof H, Schneider W, Pinz AJ (1992) Multi-spectral classification of landsat images using neural network. IEEE Trans Geosci Rem Sens 30:482–490
  10. Bishop CM (1995a) Neural networks for pattern recognition. Clarendon Press, Oxford
  11. Bittner T, Stell JG (2002) Vagueness and rough location. Geoinformatica 6:99–121
  12. Breiman L (2001) Random forests. Mach Learn 45:5–32
  13. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, California
  14. Bruzzone L, Prieto DF (1999) A technique for the selection of kernel-function parameters in RBF neural networks for classification of remote-sensing images. IEEE Trans Geosci Rem Sens 37(2):551–559
  15. Bruzzone L, Prieto DF (2000) Automatic analysis of the difference image for unsupervised change detection. IEEE Trans Geosci Rem Sens 38(3):1171–1182
  16. Cao Z, Kandel A, Li L (1990) A new model of fuzzy reasoning. Fuzzy Set Syst 36:311–325
  17. Carpenter GA, Grossberg S (1988) The ART of adaptive pattern recognition by a self-organizing neural network. Computer 21:77–88
  18. Carpenter GA, Grossberg S, Reynolds JH (1991) ARTMAP: supervised real time learning and classification of nonstationary data by a self-organising neural network. Neural Networks 4:565–588
  19. Chen T, Chen H (1995) Approximation capability to functions of several variables, nonlinear functions, and operators by radial basis function neural networks. IEEE Trans Neural Network 6:904–910
  20. Chmielewski MR, Grzymala-Busse JW (1996) Global discretization of continuous attributes as preprocessing for machine learning. Int J Approx Reason 15:319–331
  21. Civco DL (1993) Artificial neural networks for land cover classification and mapping. Int J Geogr Inform Syst 7:173–186
  22. Coren S, Ward L, Enns J (1994) Sensation and perception. Harcourt Brace College Publishers, Fort Worth, TX
  23. Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood estimation from incomplete data via EM algorithm. J Roy Stat Soc B 39:1–38
  24. Dubois D, Prade H (1980) Fuzzy sets and systems: theory and applications. Academic, Orlando
  25. Eiumnoh A, Shrestha RP (2000) Application of DEM data to landsat image classification: evaluation in a tropical wet-dry landscape of Thailand. Photogramm Eng Rem Sens 66(3):297–304
  26. Feldman DS (1993) Fuzzy network synthesis with genetic algorithms. In: Forrest S (ed) Proceedings of the 5th International conference on genetic algorithms. Morgan Kaufmann, San Mateo, CA, pp 312–317
  27. Ferro CJS, Warner TA (2002) Scale and texture in digital image classification. Photogramm Eng Rem Sens 68(1):51–63
  28. Fischer MM, Getis A (eds) (1997) Recent developments in spatial analysis: spatial statistics, behavioural modeling, and computational intelligence. Springer, Berlin
  29. Fischer MM, Leung Y (1998) A genetic-algorithms based evolutionary computational neural network for modelling spatial interaction data. Ann Reg Sci 32:295–298
  30. Foody GM (1995a) Land cover classification using an artificial neural network with ancillary information. Int J Geogr Inform Syst 9:527–542
  31. Friedman JH (1977) A recursive partitioning decision rule for nonparametric classification. IEEE Trans Comput C-26:404–408
  32. Fu L (1994) Neural networks in computer intelligence. McGraw-Hill, New York
  33. Fukunaga K, Hayes RR (1989) Estimation of classifier performance. IEEE Trans Pattern Anal Mach Intell 11:1087–1101
  34. Fung T (2003) Landscape dynamics in the Maipo Ramsar Wetland site. In: Roy PS (ed) Geoinformatics for tropical ecosystems. Asian association of remote sensing. Bishen Singh Mahendra Pal Singh, Dehradun, India, pp 539–553
  35. Fung T, Leung Y, Xu ZB (2007) A vision-based approach to remote sensing image classification (a research project funded by the Hong Kong Research Grants Council)
  36. Gao Y, Leung Y, Xu ZB (1996) A new genetic algorithm with no genetic operators (unpublished paper)
  37. Girosi F (1994) Regularization theory, radial basis functions, and networks. In: Cherkassky V, Friedman JH (eds) From statistics to neural networks – Theory and pattern recognition applications. Springer, Germany, pp 166–187
  38. Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, New York
  39. Gomm JB, Yu D (2000) Selecting radial basis function network centers with recursive orthogonal least squares training. IEEE Trans Neural Network 11(2):306–314
  40. Gong P (1996) Integrated analysis of spatial data from multiple sources: using evidential reasoning and artificial neural network techniques for geological mapping. Photogramm Eng Rem Sens 62(5):513–523
  41. Gong P, Pu R, Chen J (1996) Mapping ecological land systems and classification uncertainties from digital elevation and forest-cover data using neural networks. Photogramm Eng Rem Sens 62(11):1249–1260
  42. Gopal S, Fischer MM (1997) Fuzzy ARTMAP – A neural classifier for multi-spectral image classification. In: Fischer MM, Getis A (eds) Recent developments in spatial analysis. Springer, Berlin, pp 306–335
  43. Grossberg S (1976) Adaptive pattern classification and universal recoding. I: Parallel development and coding of neural feature detectors. Biol Cybern 23:121–134
  44. Hand DJ (1986) Recent advances in error rate estimation. Pattern Recogn Lett 4:335–346
  45. Heermann PD, Khazenie N (1992) Classification of multi-spectral remote sensing data using a back-propagation neural network. IEEE Trans Geosci Rem Sens 30(1):81–88
  46. Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
  47. Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA 81:3088–3092
  48. Hummel R, Moniot R (1989) Reconstructions from zero crossings in scale space. IEEE Trans Acoust Speech Signal Process 37(12):245–295
  49. Ishibuchi H, Nozaki K, Yamamoto N, Tanaka H (1995) Selecting fuzzy if-then rules for classification problems using genetic algorithms. IEEE Trans Fuzzy Syst 3(3):260–270
  50. Jenson JR (1996) Introductory digital image processing: a remote sensing perspective. Prentice Hall, Upper Saddle River, NJ
  51. Jenson JR, Langari R (1999) Fuzzy logic: intelligence, control and information. Prentice Hall, Upper Saddle River, NJ
  52. Ji M (2003) Using fuzzy sets to improve cluster labeling in unsupervised classification. Int J Rem Sens 24:657–671
  53. Karr L (1991) Design of an adaptive fuzzy logic controller using a genetic algorithm. In: Belew RK, Booker LB (eds) Proceedings of the 4th International Conference on Genetic Algorithms. Morgan Kaufmann, San Mateo, CA, pp 450–457
  54. Knoke JD (1986) The robust estimation of classification error rates. Comput Math Appl 12A:253–260
  55. Koenderink JJ (1984) The structure of images. Biol Cybern 50:363–370
  56. Kohonen T (1988) Self-organization and associative memory. Springer, Berlin
  57. Kosko B (1992) Neural networks and fuzzy systems. Prentice-Hall, Englewood Cliffs, NJ
  58. Kryszkiewicz M (2001) Comparative study of alternative types of knowledge reduction in inconsistent systems. Int J Intell Syst 16:105–120
  59. Kulkarni AD (1994) Artificial neural networks for image understanding. Van Nostrand Reinhold, New York
  60. Leung Y (1982) Approximate characterization of some fundamental concepts of spatial analysis. Geogr Anal Int J Theor Geogr 14:19–40
  61. Leung Y (1997) Intelligent spatial decision support systems. Springer, Berlin
  62. Leung Y (2001) Neural and evolutionary computation methods for spatial classification and knowledge acquisition. In: Fischer MM, Leung Y (eds) GeoComputational modelling: techniques and applications. Springer, Berlin, pp 71–108
  63. Leung Y, Leung KS (1993a) An intelligent expert system shell for knowledge-based geographical information systems: 1. the tools. Int J Geogr Inform Syst 7:189–199
  64. Leung Y, Li DY (2003) Maximal consistent block technique for rule acquisition in incomplete information systems. Inform Sci 153:85–106
  65. Leung Y, Gao Y, Zhang WX (2001b) A genetic-based method for training fuzzy systems. In: Proceedings of the 10th IEEE international conference on fuzzy systems – meeting the ground challenge: machines that serve people, organized by the institute of electrical and electronics engineers. Australia, Melbourne
  66. Leung Y, Leung KS, Yuan XJ (2003c) Discovery of promotion strategies for banking services by classification trees (unpublished paper)
  67. Leung Y, Luo JC, Zhou CH (2002a) A knowledge-integrated radial basis function model for the classification of multispectral remote sensing images (unpublished paper)
  68. Leung Y, Ma JH, Zhang WX (2001b) A new method for mining regression classes in large data sets. IEEE Trans Pattern Anal Mach Intell 23(1):5–21
  69. Leung Y, Mei CL, Zhang WX (2000a) Statistical tests for spatial non-stationarity based on geographically weighted regression model. Environ Plann A 32:9–32
  70. Leung Y, Wu WZ, Zhang WX (2006a) Knowledge acquisition in incomplete information systems: a rough set approach. Eur J Oper Res 168:164–180
  71. Leung Y, Fischer MM, Wu WZ, Mi JS (2008c) A rough set approach for the discovery of classification rules in interval-valued information systems. Int J Approx Reason 47:233–246
  72. Leung Y, Fung T, Mi JS, Wu WZ (2007) A rough set approach to the discovery of classification rules in spatial data. Int J Geogr Inform Sci 21:1033–1058
  73. Lippmann RP (1994) Neural networks, Bayesian a posteriori probabilities, and pattern classification. In: Cherkassky V, Friedman JH (eds) From statistics to neural networks – Theory and pattern recognition applications. Springer, Germany, pp 83–104
  74. Loh WY, Vanichsetakul N (1988) Tree-structured classification via generalized discriminant analysis (with discussion). J Am Stat Assoc 83:715–728
  75. Luo JC, Leung Y, Zheng J, Ma JH (2004) An elliptical basis function for the classification of remote sensing images. J Geogr Syst 6:219–236
  76. Mak M, Kung S (2000) Estimation of elliptical basis function parameters by the EM algorithm with application to speaker verification. IEEE Trans Neural Network 11(4):961–969
  77. Mannan B, Roy J, Ray AK (1998) Fuzzy ARTMAP supervised classification of multi-spectral remotely-sensed images. Int J Rem Sens 19(4):767–774
  78. Mather PM (1999) Land cover classification revisited. In: Atkinson PM, Tate NJ (eds) Advances in remote sensing and GIS analysis. Wiley, London, pp 7–16
  79. McLachlan GJ, Basford KE (1988) Mixture models: inference and applications to clustering. Marcel Dekker, New York
  80. McLachlan GJ, Krishnan T (1997) The EM algorithm and extensions. Wiley, London
  81. Medsker LR (1994) Hybrid neural network and expert systems. Kluwer, Dordrecht
  82. Meng DY, Xu ZB (2006) Visual learning theory (unpublished paper)
  83. Meng DY, Xu ZB, Leung Y, Fung T (2008) The strong convergence of visual method and its applications in disease diagnosis. Paper presented at the 3rd international conference on pattern recognition in bioinformatics, Melbourne, Australia
  84. Mi JS, Wu WZ, Zhang WX (2004) Approaches to knowledge reduction based on variable precision rough sets model. Inform Sci 159:255–272
  85. Mola F, Siciliano R (1997) A fast splitting procedure for classification trees. Stat Comput 7:208–216
  86. Moody J, Darken CJ (1989) Fast learning in networks of locally-tuned processing units. Neural Comput 1:281–294
  87. Murai H, Omatu S (1997) Remote sensing image analysis using a neural network and knowledge-based processing. Int J Rem Sens 18(4):811–828
  88. Pao YH (1989) Adaptive pattern recognition and neural networks. Addison-Wesley, Reading, MA
  89. Paola JD, Schowengerdt RA (1995) A review and analysis of back-propagation neural networks for classification of remotely-sensed multi-spectral imagery. Int J Rem Sens 16:3033–3058
  90. Park D, Kandel A, Langholz G (1994) Genetic-based new fuzzy reasoning models with application to fuzzy control. IEEE Trans Syst Man Cybern 24(1):39–47
  91. Pawlak Z (1982) Rough sets. Int J Inform Comput Sci 11:341–356
  92. Pawlak Z (1991) Rough sets: theoretical aspects of reasoning about data. Kluwer, Boston
  93. Peddle DR (1995) Knowledge formulation for supervised evidential classification. Photogramm Eng Rem Sens 61(4):409–417
  94. Pernell C, Themlin J, Renders J, Acheroy M (1995) Optimization of fuzzy expert systems using genetic algorithms and neural networks. IEEE Trans Fuzzy Syst 3(3):300–312
  95. Polkowski L, Skowron A (eds) (1998) Rough sets in knowledge discovery 1: methodology and applications, 2: Applications. Physica-Verlag, Heidelberg
  96. Polkowski L, Tsumoto S, Lin TY (2000) Rough set methods and applications. Physica-Verlag, Heidelberg
  97. Powell MJD (1987) Radial basis functions for multivariable interpolation: a review. In: Mason JC, Cox MG (eds) Algorithms for Approximation of Functions and Data. Oxford University Press, Oxford, pp 143–167
  98. Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
  99. Richards JA, Jia XP (1998) Remote sensing digital image analysis: an introduction. Springer, New York
  100. Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
  101. Rosenblatt F (1958) The Perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65:386–408
  102. Rudolph G (1994) Convergence properties of canonical genetic algorithms. IEEE Trans Neural Network 5(1):96–101
  103. Scholkopf B, Burges CJC, Smola AJ (1999) Advances in kernel methods: support vector learning. MIT, Cambridge
  104. Serpico SB, Bruzzone L, Roli F (1996) An experimental comparison of neural and statistical non-parametric algorithms for supervised classification of remote-sensing images. Pattern Recogn Lett 17:1331–1341
  105. Shafer G (1976) A mathematical theory of evidence. Princeton University Press, Princeton
  106. Skowron A, Rauszer C (1992) The discernibility matrices and functions in information systems. In: Slowinski R (ed) Intelligent decision support-Handbook of applications and advances of the rough sets theory. Kluwer, London, pp 331–362
  107. Stell JG, Worboys MF (1998) Stratified map spaces: a formal basis for multiresolution spatial databases. In: Poiker TK, Chisman N (eds) SDH’98 proceedings 8th international symposium on spatial data handling. International Geographical Union, pp 180–189
  108. Sundararajan N, Saratchandran P, Lu Y (1999) Radial basis function neural networks with sequential learning. World Scientific, Singapore
  109. Tadjudin S, Landgrebe DA (2000) Robust parameter estimation for mixture model. IEEE Trans Geosci Rem Sens 38(1):439–445
  110. Wang F (1991) Integrating GIS’s and remote sensing image analysis systems by unifying knowledge representation schemes. IEEE Trans Geosci Rem Sens 29(4):656–663
  111. Wang SL, Li D, Shi WZ, Wang XZ (2002) Geo-rough space. Geo-Spatial Inform Sci 6:54–61
  112. Wang SL, Wang XZ, Shi WZ (2001) Development of a data mining method for land control. Geo-Spatial Inform Sci 4:68–76
  113. Wilkinson GG, Folving S, Kanellopoulos I, McCormick N, Fullerton K, Megier J (1995) Forest mapping from multi-source satellite data using neural network classifiers - an experiment in Portugal. Rem Sens Rev 12:83–106
  114. Witkin AP (1983) Scale space filtering. In: Proceedings of International Joint Conference on Artificial Intelligence, Karlsruhe, pp 1019–1022
  115. Worboys MF (1998a) Computation with imprecise geographical data. Comput Environ Urban Syst 22:85–106
  116. Wu WZ, Zhang M, Li HZ, Mi JS (2005) Knowledge reduction in random information systems via Dempster-Shafer theory of evidence. Inform Sci 174:143–164
  117. Xu ZB, Leung Y, He XW (1994) Asymmetric bidirectional associative memories. IEEE Trans Syst Man Cybern 24:1558–1564
  118. Yao X (1999) Evolving artificial neural networks. In: Proceedings of the IEEE 89, IEEE, pp 1423–1447
  119. Yasdi R (1996) Combining rough sets learning and neural learning: method to deal with uncertain and imprecise information. Neurocomputing 7:61–84
  120. Zadeh LA (1994) Fuzzy logic and soft computing: issues, contentions and perspectives. In: Proceedings of 3rd international conference on fuzzy logic, neural networks and soft computing. Fuzzy Logic Systems Institute, Japan, pp 1–2
  121. Zhang WX, Mi JS, Wu WZ (2003b) Approaches to knowledge reductions in inconsistent systems. Int J Intell Syst 18:989–1000
  122. Zhou W (1999) Verification of the non-parametric characteristics of back-propagation neural networks for image classification. IEEE Trans Geosci Rem Sens 37(2):771–779

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. Dept. of Geography & Resource Management, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
