Advances in self-organizing maps for their application to compositional data
A self-organizing map (SOM) is a non-linear projection of a D-dimensional data set onto a lower-dimensional space in which the distances among observations are approximately preserved. The SOM arranges multivariate data according to their similarity to each other, which supports pattern recognition and makes higher-dimensional data easier to interpret. The SOM algorithm allows for the selection of different map topologies, distances, and parameters, which determine how the data are organized on the map. In the particular case of compositional data (such as elemental, mineralogical, or maceral abundance), the sample space is governed by Aitchison geometry, and extra steps are required prior to SOM analysis. Following the principle of working on log-ratio coordinates, the simplicial operations and the Aitchison distance, which are the appropriate elements for the SOM, are presented. With this structure developed, a SOM using Aitchison geometry is applied to interpret elemental data from combustion products (bottom ash, fly ash, and economizer fly ash) from a Wyoming coal-fired power plant. The results provide knowledge about the differences in ash composition within the coal combustion process.
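As a rough illustration of the ingredients named in the abstract, the sketch below computes the Aitchison distance through centred log-ratio (clr) coordinates and uses it for the two basic SOM steps: finding the best-matching unit and moving a codebook vector toward a sample by perturbation and powering. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the learning rate `alpha`, and the three-part compositions are hypothetical, and all parts are assumed strictly positive (no zeros).

```python
import math

def closure(x):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def clr(x):
    """Centred log-ratio coordinates: log of each part minus the mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(x, y):
    """Aitchison distance, computed as the Euclidean distance between clr vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

def best_matching_unit(sample, codebook):
    """Index of the codebook composition closest to the sample (Aitchison distance)."""
    return min(range(len(codebook)),
               key=lambda i: aitchison_distance(sample, codebook[i]))

def update_unit(w, x, alpha):
    """Move unit w toward sample x on the simplex.

    Simplicial analogue of w + alpha * (x - w): perturbation and powering
    reduce to the closed, weighted geometric mean w_i^(1 - alpha) * x_i^alpha.
    """
    return closure([wi ** (1 - alpha) * xi ** alpha for wi, xi in zip(w, x)])

# Hypothetical three-part compositions, for illustration only.
codebook = [[0.1, 0.3, 0.6], [0.6, 0.3, 0.1]]
sample = [2.0, 6.0, 12.0]  # same ratios as codebook[0], different total

bmu = best_matching_unit(sample, codebook)
moved = update_unit(codebook[1], sample, alpha=0.5)
```

Because the distance depends only on ratios between parts, `sample` and `codebook[0]` are at distance zero even though their totals differ. The update with `alpha=0.5` halves the Aitchison distance between the unit and the sample, which matches the usual Euclidean SOM update carried out in log-ratio coordinates.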
Keywords: Aitchison distance; Coal combustion products; Isometric logratio; Proportions; Simplex
This work has been supported by the project “CODA-RETOS” (Spanish Ministry of Economy and Competitiveness; Ref: MTM2015-65016-C2-1-R) and the project “Compositional Data Analysis Related to Energy Resources Modeling” (“Salvador de Madariaga” program; “Fulbright” distinction; MECD; Ref.: PRX16/00258). Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. We are grateful to C.Ö. Karacan (USGS) and G. Mateu-Figueras (U. de Girona) for their insightful review of a previous version of the paper.