Choosing representative data items: Kohonen, Neural Gas or Mixture Model?
While analyzing the erosion risk of Kefallinia, Greece, we faced the problem of how to choose representatives (prototypes) for a large data set. We consider three methods serving this purpose: (1) Kohonen’s self-organizing map (SOM), (2) neural gas (NG), and (3) a mixture model (MM) of Gaussian distributions. The representativeness of the derived prototype vectors is measured by the quantization error, as defined by Kohonen (1995). It appears that neural gas and mixture models fairly consistently outperform the SOM method, yielding prototypes with a lower quantization error. To gain more thorough insight into the results, we map the prototype vectors onto planes obtained by neuroscale mapping, which seems to be a convenient alternative to Sammon’s mapping. The SOM codebook vectors are visualized in the same planes and linked by threads. The approach is illustrated on the Kefallinia erosion data.
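As a rough illustration of the quality measure mentioned above, the following sketch computes an average quantization error, taken here to be the mean Euclidean distance from each data vector to its nearest prototype. This is an assumption about the exact form of the measure, and the function name `quantization_error` and the toy arrays are hypothetical, not taken from the paper.

```python
import numpy as np


def quantization_error(data, prototypes):
    """Mean Euclidean distance from each data vector to its nearest prototype.

    Assumed form of the average quantization error; the paper's exact
    definition follows Kohonen (1995) and may differ in detail.
    """
    data = np.asarray(data, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    # Pairwise differences, shape (n_samples, n_prototypes, n_features).
    diffs = data[:, None, :] - prototypes[None, :, :]
    # Euclidean distances, shape (n_samples, n_prototypes).
    dists = np.linalg.norm(diffs, axis=2)
    # Distance to the best-matching prototype, averaged over all samples.
    return float(dists.min(axis=1).mean())


# Toy data (hypothetical): two points and a single prototype at the origin.
toy_data = np.array([[0.0, 0.0], [2.0, 0.0]])
toy_prototypes = np.array([[0.0, 0.0]])
qe = quantization_error(toy_data, toy_prototypes)
```

Under this definition, a lower value means the prototypes lie closer to the data on average, so prototype sets produced by different methods can be compared on the same data by their error values.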
Keywords: Self-organizing maps; Neural gas; Mixture models; Neuroscale mapping; Thread plotting; Kefallinia Island
- Bartkowiak A., Vassilopoulos A., Evelpidou N. 2003. ‘Choosing data vectors representing a huge data set: Kohonen’s SOM applied to the Kefallinia erosion data’. Proceedings of the First Int. Conf. on Environmental Research and Assessment, Bucharest, Romania, March 23–27, 2003, pp. 505–522, ISBN 973-558-077-2, print on CD-ROM, © Ars Docendi Publishing House, Bucharest, Romania.
- Bartkowiak A., Szustalewicz A., Evelpidou N., Vassilopoulos A. 2003. ‘Choosing data vectors representing a huge data set: a comparison of Kohonen’s maps and the neural gas method’. Proceedings of the First Int. Conf. on Environmental Research and Assessment, Bucharest, Romania, March 23–27, 2003, pp. 561–572, ISBN 973-558-077-2, print on CD-ROM, © Ars Docendi Publishing House, Bucharest, Romania.
- Gournellos T., Evelpidou N., Vassilopoulos A. 2003. ‘Developing an erosion risk map using soft computing’. Natural Hazards, Kluwer, to appear.
- Kohonen T. 1995. Self-Organizing Maps. Series in Information Science. Vol. 30. Heidelberg, Springer. Second Edition. 1997.
- Matlab: The Language of Technical Computing. 2002. Version 6p5. The Mathworks Inc., Natick MA USA.
- McLachlan G., Peel D. 2000. Finite Mixture Models. Wiley, New York, Chichester.
- Nabney I. T. 2001. Netlab: Algorithms for Pattern Recognition. Springer London, Berlin, Heidelberg. Springer Series: Advances in Pattern Recognition.
- Osowski S. 1996. Sieci neuronowe w ujęciu algorytmicznym. WNT Warszawa.
- Sammon J. W. (Jr.). 1969. ‘A nonlinear mapping for data structure analysis’. IEEE Trans. on Computers C-18(5), pp. 401–409.
- Tipping M. E., Lowe D. 1997. ‘Shadow targets: A novel algorithm for topographic projections by radial basis functions’. In: Proceedings of the Int. Conf. on Artificial Neural Networks 440, pp. 7–12, IEE.
- Vassilopoulos A. 2002. ‘Coastal geomorphological classifications in GIS environment of Kefallinia Island’. Proc. 6th Pan-Hellenic Geographical Congress of the Hellenic Geographical Society, Thessaloniki 3–5.10.2002, Vol. 1., pp. 388–394.
- Vesanto J., Himberg J., Alhoniemi E., Parhankangas J. 2000. SOM Toolbox for Matlab 5. Som Toolbox Team. Helsinki University of Technology, Finland, Libella Oy Espoo. See also: http://www.cis.hut.fi/projects/somtoolbox/.