Identifying Uncertain Galaxy Morphologies Using Unsupervised Learning

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7895)

Abstract

With the onset of massive cosmological data collection through surveys such as the Sloan Digital Sky Survey (SDSS), galaxy classification has been accomplished largely with the help of citizen science communities such as Galaxy Zoo. However, an analysis of one of the Galaxy Zoo morphological classification data sets has shown that a significant majority of all classified galaxies are, in fact, labelled “Uncertain”. This motivated experiments on data retrieved from the SDSS database using each galaxy’s right ascension and declination values, combined with the Galaxy Zoo morphology class label and clustered with the k-means algorithm. This paper identifies the best attributes for clustering using a heuristic approach and then applies an unsupervised learning technique to improve the classification of galaxies labelled “Uncertain” and to increase the overall accuracy of such clustering. Under this heuristic approach, selecting the best combination of attributes via information gain further improves the accuracy of classes-to-clusters evaluation by approximately 10–15%. An accuracy of 82.627% was also achieved after conducting various experiments on the galaxies labelled “Uncertain” and reinserting them into the original data set. It is concluded that the vast majority of these galaxies are, in fact, of spiral morphology, with a small subset potentially consisting of stars, elliptical galaxies or galaxies of other morphological variants.
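The pipeline outlined in the abstract (rank attributes by information gain, cluster with k-means, score with a classes-to-clusters evaluation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' setup: the data are synthetic stand-ins rather than SDSS attributes, information gain is computed from a single median split, clusters are mapped to their majority class (some tools use a one-to-one mapping instead), and all function and variable names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Gain from splitting one numeric attribute at its median (assumed discretisation)."""
    cut = sorted(values)[len(values) // 2]
    left = [lab for v, lab in zip(values, labels) if v < cut]
    right = [lab for v, lab in zip(values, labels) if v >= cut]
    remainder = sum(len(part) / len(labels) * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; leave empty clusters in place.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


def classes_to_clusters_accuracy(assignment, labels):
    """Label each cluster with its majority class and return the fraction that agrees."""
    correct = 0
    for c in set(assignment):
        members = [lab for a, lab in zip(assignment, labels) if a == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(labels)


# Synthetic stand-in data, NOT SDSS measurements: attribute 0 separates the two
# classes, attribute 1 is uninformative noise on a much larger scale.
rng = random.Random(42)
labels = ["spiral"] * 30 + ["elliptical"] * 30
rows = [
    (rng.gauss(0.0 if lab == "spiral" else 5.0, 0.5), rng.uniform(0.0, 100.0))
    for lab in labels
]

# Rank the attributes by information gain and keep the best one.
gains = [information_gain([r[i] for r in rows], labels) for i in range(2)]
best_attribute = max(range(2), key=lambda i: gains[i])

# Compare clustering on all attributes with clustering on the selected attribute.
accuracy_all = classes_to_clusters_accuracy(kmeans(rows, k=2), labels)
selected = [(r[best_attribute],) for r in rows]
accuracy_selected = classes_to_clusters_accuracy(kmeans(selected, k=2), labels)

print("information gain per attribute:", gains)
print("accuracy, all attributes:", accuracy_all)
print("accuracy, selected attribute:", accuracy_selected)
```

In this toy setting the noisy attribute dominates the Euclidean distance, so clustering on the selected attribute alone would be expected to agree with the class labels more closely. This mirrors, in miniature, the kind of gain from attribute selection that the abstract reports, without reproducing its figures.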

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Edwards, K.J., Gaber, M.M. (2013). Identifying Uncertain Galaxy Morphologies Using Unsupervised Learning. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2013. Lecture Notes in Computer Science, vol 7895. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38610-7_14

  • DOI: https://doi.org/10.1007/978-3-642-38610-7_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38609-1

  • Online ISBN: 978-3-642-38610-7

  • eBook Packages: Computer Science, Computer Science (R0)
