Astronomy and Big Data

A Data Clustering Approach to Identifying Uncertain Galaxy Morphology

  • Kieran Jay Edwards
  • Mohamed Medhat Gaber

Part of the Studies in Big Data book series (SBD, volume 6)

About this book


With the onset of massive cosmological data collection through media such as the Sloan Digital Sky Survey (SDSS), galaxy classification has been accomplished for the most part with the help of citizen science communities like Galaxy Zoo. Seeking the wisdom of the crowd for such Big Data processing has proved extremely beneficial. However, an analysis of one of the Galaxy Zoo morphological classification data sets has shown that a significant majority of all classified galaxies are labelled as “Uncertain”.

This book reports on how to use data mining, more specifically clustering, to identify galaxies for which the public has shown some degree of uncertainty as to whether they belong to one morphology type or another. The book shows the importance of transitions between different data mining techniques in an insightful workflow. It demonstrates that clustering makes it possible to identify discriminating features in the analysed data sets, adopting a novel feature selection algorithm called Incremental Feature Selection (IFS). The book shows the use of state-of-the-art classification techniques, Random Forests and Support Vector Machines, to validate the acquired results. It is concluded that a vast majority of these galaxies are, in fact, of spiral morphology, with a small subset potentially consisting of stars, elliptical galaxies or galaxies of other morphological variants.


Astronomy, Big Data, Citizen Science, Data Clustering, Galaxy Morphology, Galaxy Zoo Project

Authors and affiliations

  • Kieran Jay Edwards (1)
  • Mohamed Medhat Gaber (2)
  1. School of Computing, University of Portsmouth, Hampshire, United Kingdom
  2. School of Computing Science and Digital Media, Robert Gordon University, Aberdeen, United Kingdom

Bibliographic information

  • Copyright Information: Springer International Publishing Switzerland 2014
  • Publisher Name: Springer, Cham
  • eBook Packages: Engineering, Engineering (R0)
  • Print ISBN: 978-3-319-06598-4
  • Online ISBN: 978-3-319-06599-1
  • Series Print ISSN: 2197-6503
  • Series Online ISSN: 2197-6511