Analysis and Knowledge Discovery by Means of Self-Organizing Maps for Gaia Data Releases

  • Marco Antonio Álvarez
  • Carlos Dafonte
  • Daniel Garabato
  • Minia Manteiga
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9950)


A billion stars: this is the approximate number of objects that the Gaia satellite is expected to observe, roughly 1% of the objects in the Galaxy. It constitutes the largest amount of data gathered to date: by the end of the mission, the data archive will exceed 1 petabyte. To process these data, the Gaia mission established the Data Processing and Analysis Consortium (DPAC), which will apply data mining techniques such as Self-Organizing Maps (SOMs). This paper presents a SOM-based technique for source clustering, focusing on the development of an advanced visualization tool built on this technique.


Gaia mission · European Space Agency · Data mining · Artificial Intelligence · Self-Organizing Maps visualizations



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Marco Antonio Álvarez (1, corresponding author)
  • Carlos Dafonte (1)
  • Daniel Garabato (1)
  • Minia Manteiga (2)
  1. Departamento de Tecnologías de la Información y las Comunicaciones, Universidade da Coruña (UDC), A Coruña, Spain
  2. Departamento de Ciencias de la Navegación y de la Tierra, Universidade da Coruña (UDC), A Coruña, Spain
