Abstract
Predictive knowledge discovery is an important knowledge acquisition method. It is also used in the clustering process of data mining. Visualization is very helpful for high-dimensional data analysis, but it is imprecise, which limits its usability in quantitative cluster analysis. In this paper, we adopt a visual technique called HOV3 to explore and verify clustering results with quantified measurements. With the quantified contrast between grouped data distributions produced by HOV3, users can detect clusters and verify their validity efficiently.
The datasets used in this paper are available from http://www.ics.uci.edu/~mlearn/Machine-Learning.html.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, KB., Orgun, M.A., Zhang, K. (2007). A Prediction-Based Visual Approach for Cluster Exploration and Cluster Validation by HOV3. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds) Knowledge Discovery in Databases: PKDD 2007. Lecture Notes in Computer Science, vol 4702. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74976-9_32
DOI: https://doi.org/10.1007/978-3-540-74976-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74975-2
Online ISBN: 978-3-540-74976-9
eBook Packages: Computer Science, Computer Science (R0)