An Analysis of Meta-learning Techniques for Ranking Clustering Algorithms Applied to Artificial Data

  • Rodrigo G. F. Soares
  • Teresa B. Ludermir
  • Francisco A. T. De Carvalho
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5768)

Abstract

Meta-learning techniques can be very useful for supporting non-expert users in the algorithm selection task. In this work, we investigate the use of different components in an unsupervised meta-learning framework. In such a scheme, the system aims to predict, for a new learning task, the ranking of the candidate clustering algorithms according to previously acquired knowledge.

In the context of unsupervised meta-learning techniques, we analyzed two different sets of meta-features, nine different candidate clustering algorithms and two learning methods as meta-learners.

This analysis showed that the system, using MLP and SVR meta-learners, was able to successfully associate the proposed sets of dataset characteristics with the performance of the new candidate algorithms. In fact, a hypothesis test showed that the correlation between the predicted and ideal rankings was significantly higher than that obtained with the default ranking method. In this sense, we could also validate the use of the proposed sets of meta-features for describing the artificial learning tasks.
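
As an illustration of the comparison described above, the following minimal sketch scores a predicted ranking and a default ranking against the ideal ranking for a single dataset. It assumes Spearman's rank correlation without ties as the agreement measure; the abstract does not name the coefficient, and the function name `spearman` and all ranking values are hypothetical, chosen only for illustration.

```python
def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have the same length, at least 2")
    # Sum of squared rank differences over all candidate algorithms.
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical ranks of five candidate algorithms on one dataset (1 = best).
ideal = [1, 2, 3, 4, 5]
predicted = [2, 1, 3, 4, 5]   # assumed output of a meta-learner
default = [5, 4, 3, 2, 1]     # assumed dataset-independent baseline ranking

print("predicted vs ideal:", spearman(ideal, predicted))
print("default vs ideal:  ", spearman(ideal, default))
```

In a full evaluation, such correlations would be computed for every test dataset and the two resulting samples compared with a hypothesis test, as the abstract reports.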

Keywords

Meta-learning, Clustering

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Rodrigo G. F. Soares (1)
  • Teresa B. Ludermir (1)
  • Francisco A. T. De Carvalho (1)
  1. Center of Informatics, Federal University of Pernambuco, Brazil
