Abstract
We extend kernel methods to interval data mining and use graphical methods to explain the results obtained. Interval data offer a useful way to aggregate large datasets into smaller ones or to represent uncertain data. Compared with the usual case of continuous data, the only algorithmic change required is in the evaluation of the Radial Basis Function (RBF) kernel, so kernel-based algorithms can handle interval data easily. Numerical tests on real and artificial datasets show promising performance for the proposed methods. We also use interactive graphical decision tree algorithms and visualization techniques to give insight into support vector machine results, which gives the user a better understanding of the models' behavior.
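The abstract states that only the RBF kernel evaluation changes for interval data but does not give the formula. As a minimal sketch, the code below assumes the distance inside the kernel is replaced by a Hausdorff-style distance between intervals, taken per dimension as the larger of the lower-bound gap and the upper-bound gap. The function name `interval_rbf_kernel`, the `(low, high)` pair representation, and the default `gamma` are illustrative assumptions, not taken from the chapter.

```python
import math


def interval_rbf_kernel(u, v, gamma=0.5):
    """RBF-style kernel between two interval-valued points.

    u, v  : sequences of (low, high) pairs, one pair per dimension.
    gamma : kernel width parameter (illustrative default).

    Assumption: the per-dimension dissimilarity is the Hausdorff distance
    between two closed intervals, max(|low_u - low_v|, |high_u - high_v|).
    """
    if len(u) != len(v):
        raise ValueError("both points need the same number of dimensions")

    squared_distance = 0.0
    for (u_low, u_high), (v_low, v_high) in zip(u, v):
        d = max(abs(u_low - v_low), abs(u_high - v_high))
        squared_distance += d * d

    return math.exp(-gamma * squared_distance)


# Two points described by intervals in two dimensions.
a = [(1.0, 2.0), (0.0, 0.5)]
b = [(1.5, 3.0), (0.0, 1.0)]
k_ab = interval_rbf_kernel(a, b)
```

When every interval collapses to a single value (`low == high`), this reduces to the ordinary RBF kernel on continuous data, which is consistent with the abstract's remark that the rest of a kernel-based algorithm can stay unchanged.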
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Do, TN., Poulet, F. (2009). Kernel-Based Algorithms and Visualization for Interval Data Mining. In: Zighed, D.A., Tsumoto, S., Ras, Z.W., Hacid, H. (eds) Mining Complex Data. Studies in Computational Intelligence, vol 165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88067-7_5
DOI: https://doi.org/10.1007/978-3-540-88067-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88066-0
Online ISBN: 978-3-540-88067-7
eBook Packages: Engineering, Engineering (R0)