Andres Guzman-Ballen

Classification of Galaxy Morphologies Using Support Vector Machines

For our Computer Vision final project during Spring 2014, Computer Science students Ettienne Montagner, José Ruiz Cepeda and I designed a procedure to automatically perform galaxy morphology classification and reproduce manual classification results.

The first task was to preprocess the data provided by GalaxyZoo: we converted the RGB images to grayscale and then removed noise with a spatial Gaussian smoothing filter. Next, we used Otsu's method to find a threshold for turning each image into a binary one. The holes in the binary image were then filled, and the program determined which object covered the largest area; this is how the biggest galaxy within the image was identified. The resulting binary image acts as a mask that is multiplied with the original image, so the new image shows only the biggest galaxy. This approach was applied in a publication that can be found here: A spatial-color layout feature for representing galaxy images
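The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration using NumPy and SciPy, not our original project code: the function names, the grayscale weights, the histogram bin count, and the smoothing width `sigma` are all assumptions chosen for the example.

```python
import numpy as np
from scipy import ndimage


def otsu_threshold(gray, bins=256):
    """Return the bin center that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(gray, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist).astype(float)       # pixel count at or below each bin
    w1 = w0[-1] - w0                         # pixel count above each bin
    s0 = np.cumsum(hist * centers)           # running intensity sum
    m0 = s0 / np.maximum(w0, 1)              # mean of the darker class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)   # mean of the brighter class
    between = w0 * w1 * (m0 - m1) ** 2
    return centers[np.argmax(between)]


def isolate_largest_galaxy(rgb, sigma=2.0):
    """Mask an (H, W, 3) image so that only its largest bright object remains."""
    rgb = np.asarray(rgb, dtype=float)
    # Grayscale conversion; these luma weights are an assumption.
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    # Spatial Gaussian smoothing to suppress noise; sigma is an assumed value.
    smooth = ndimage.gaussian_filter(gray, sigma=sigma)
    # Binarize with Otsu's threshold, then fill the holes.
    binary = smooth > otsu_threshold(smooth)
    filled = ndimage.binary_fill_holes(binary)
    # Label connected objects and keep the one with the largest area.
    labels, count = ndimage.label(filled)
    if count == 0:
        return np.zeros_like(rgb), filled
    areas = np.bincount(labels.ravel())[1:]  # skip label 0 (background)
    mask = labels == (1 + np.argmax(areas))
    # Multiply the mask with the original image.
    return rgb * mask[..., None], mask
```

Calling `isolate_largest_galaxy(image)` returns the masked image together with the boolean mask, so the mask can be inspected on its own when a result looks wrong.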

After preprocessing the data, we extracted the SIFT descriptors for each image in the training set and quantized them using k-means clustering, creating a codebook from the cluster centers. We then trained the SVMs using the quantized descriptors from the training data. Once the SVMs had been trained (one for each response to every question in the decision tree), we used them to classify the test images, following the decision tree and applying a one-vs-all scheme to the answers of each question. Our poster below shows the advantages of this approach over certain descriptors, as well as its limitations:
Picture
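The codebook and SVM steps described above can be sketched along these lines. This is a hedged illustration with scikit-learn rather than our original code: the function names are hypothetical, the linear kernel and the codebook size are assumptions, and `synthetic_descriptors` produces random 128-dimensional vectors that merely stand in for real SIFT descriptors. The sketch covers a single question of the decision tree.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC


def build_codebook(descriptor_sets, n_words, seed=0):
    """Cluster all training descriptors; the cluster centers form the codebook."""
    stacked = np.vstack(descriptor_sets)
    return KMeans(n_clusters=n_words, n_init=10, random_state=seed).fit(stacked)


def bow_histogram(descriptors, codebook):
    """Quantize one image's descriptors into a normalized visual-word histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)


def train_question_classifier(descriptor_sets, answers, codebook):
    """Train one-vs-all SVMs for the answers to one decision-tree question."""
    X = np.array([bow_histogram(d, codebook) for d in descriptor_sets])
    # Linear kernel is an assumption; the write-up does not name a kernel.
    return OneVsRestClassifier(SVC(kernel="linear")).fit(X, answers)


def classify(descriptors, codebook, classifier):
    """Predict the answer for one image from its descriptors."""
    hist = bow_histogram(descriptors, codebook)
    return int(classifier.predict(hist[None, :])[0])


def synthetic_descriptors(label, rng, n=40, dim=128):
    """Hypothetical stand-in for SIFT output: noisy points around a class center."""
    center = np.zeros(dim)
    center[label] = 20.0
    return center + 0.5 * rng.standard_normal((n, dim))


rng = np.random.default_rng(0)
train_sets = [synthetic_descriptors(k, rng) for k in range(3) for _ in range(10)]
train_answers = [k for k in range(3) for _ in range(10)]
codebook = build_codebook(train_sets, n_words=3)
classifier = train_question_classifier(train_sets, train_answers, codebook)
```

With real data, each entry of `train_sets` would be the array of SIFT descriptors from one preprocessed image, and a separate classifier would be trained for every question reached while walking the decision tree.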