Glowworm Swarm Based Informative Attribute Selection Using Support Vector Machines for Simultaneous Feature Selection and Classification

  • Conference paper
Swarm, Evolutionary, and Memetic Computing (SEMCCO 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8947)

Abstract

In this paper, we propose a hybrid filter-wrapper algorithm, GSO-Infogain, for simultaneous feature selection and classification, aimed at improving classification accuracy. GSO-Infogain employs the Glowworm Swarm Optimization (GSO) algorithm with a Support Vector Machine (SVM) as its internal learning algorithm and uses feature ranking based on information gain as a heuristic. The GSO algorithm randomly generates a population of worms, each of which is a candidate subset of features. The fitness of each candidate solution, evaluated using the SVM, is encoded in its luciferin value. Each worm probabilistically moves towards the worm with the highest luciferin value in its neighbourhood. In the process, the worms explore the feature space and eventually converge to the global optimum. We have evaluated the performance of the hybrid algorithm for feature selection on a set of cancer datasets. We obtain classification accuracies in the range 94–98% for these datasets, which is comparable to the best results from other classification algorithms. We further tested the robustness of GSO-Infogain by evaluating its performance on the CoEPrA training and test datasets. GSO-Infogain performs well in this experiment too, giving similar prediction accuracies on the training and test datasets, which indicates its robustness.
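The search loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses the standard GSO luciferin update and the usual rule of choosing a brighter neighbour with probability proportional to the luciferin difference. The binary-mask encoding, the Hamming-distance neighbourhood, the one-bit move, the parameter values, and the names `gso_feature_selection` and `toy_fitness` are all assumptions made for the example. In particular, `toy_fitness` is a hypothetical stand-in for the SVM accuracy used in the paper, and the information-gain heuristic is omitted.

```python
import random


def hamming(a, b):
    """Number of positions at which two feature masks differ."""
    return sum(x != y for x, y in zip(a, b))


def gso_feature_selection(fitness, n_features, n_worms=20, n_iters=50,
                          rho=0.4, gamma=0.6, radius=None, seed=0):
    """Search over binary feature masks with a GSO-style loop (illustrative)."""
    rng = random.Random(seed)
    radius = n_features if radius is None else radius
    # Each worm is a candidate feature subset: 1 = feature kept, 0 = dropped.
    worms = [[rng.randint(0, 1) for _ in range(n_features)]
             for _ in range(n_worms)]
    luciferin = [5.0] * n_worms
    best_mask, best_fit = None, float("-inf")

    for _ in range(n_iters):
        # Luciferin update: decay the old value, then add the scaled fitness.
        for i, worm in enumerate(worms):
            fit = fitness(worm)
            luciferin[i] = (1 - rho) * luciferin[i] + gamma * fit
            if fit > best_fit:
                best_mask, best_fit = list(worm), fit

        # Movement: each worm steps towards one brighter neighbour.
        moved_worms = []
        for i, worm in enumerate(worms):
            brighter = [j for j in range(n_worms)
                        if j != i
                        and luciferin[j] > luciferin[i]
                        and hamming(worm, worms[j]) <= radius]
            moved = list(worm)
            if brighter:
                # Brighter neighbours are proportionally more likely to be chosen.
                weights = [luciferin[j] - luciferin[i] for j in brighter]
                target = worms[rng.choices(brighter, weights=weights, k=1)[0]]
                differing = [k for k in range(n_features)
                             if worm[k] != target[k]]
                if differing:
                    # Assumed discrete "step": copy one differing bit.
                    k = rng.choice(differing)
                    moved[k] = target[k]
            moved_worms.append(moved)
        worms = moved_worms

    return best_mask, best_fit


# Hypothetical stand-in for classifier accuracy: rewards three "informative"
# features and lightly penalises every other selected feature.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[k] for k in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.1 * extras


if __name__ == "__main__":
    mask, score = gso_feature_selection(toy_fitness, n_features=10)
    print(mask, score)
```

In a real wrapper setting, `fitness` would train and validate a classifier on the columns selected by the mask, which makes each iteration far more expensive than in this toy run.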

V.N. gratefully acknowledges the Council of Scientific and Industrial Research, New Delhi, for awarding a Junior Research Fellowship.

V.K.J. gratefully acknowledges financial support from the Department of Science and Technology, New Delhi.

Author information

Corresponding author

Correspondence to Jayaraman Valadi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gurav, A., Nair, V., Gupta, U., Valadi, J. (2015). Glowworm Swarm Based Informative Attribute Selection Using Support Vector Machines for Simultaneous Feature Selection and Classification. In: Panigrahi, B., Suganthan, P., Das, S. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2014. Lecture Notes in Computer Science, vol 8947. Springer, Cham. https://doi.org/10.1007/978-3-319-20294-5_3

  • DOI: https://doi.org/10.1007/978-3-319-20294-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20293-8

  • Online ISBN: 978-3-319-20294-5

  • eBook Packages: Computer Science, Computer Science (R0)
