Clustering Mixed Data Using Spherical Representation
We discuss a clustering method, based on the k-means algorithm, for mixed data, that is, data whose attributes take a mixture of binary and continuous values. The binary part is transformed into directional data (a spherical representation) by a weight transformation derived from the similarity between binary objects and from the natural definition of descriptive measures. The continuous part is given a spherical representation by multidimensional scaling on the sphere. Combining the binary part and the continuous part, in the manner of latitude and longitude, yields a spherical representation of the mixed data. Using descriptive measures on the sphere, we obtain a clustering algorithm for mixed data based on the k-means method. Finally, the performance of this clustering is evaluated on real data.
Keywords: Support Vector Machine, Directional Data, Geodesic Curve, Linear Discriminant Function, Descriptive Measure
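To make the idea of k-means on a sphere concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the data have already been mapped to the unit sphere; the weight transformation for the binary part and the multidimensional scaling for the continuous part are not specified in the abstract and are not reproduced here. As stand-ins for the descriptive measures, it assumes the normalized mean direction as the cluster centre and the geodesic (great-circle) distance as the dissimilarity. All function names (`normalize`, `geodesic_distance`, `mean_direction`, `spherical_kmeans`) are hypothetical.

```python
import math
import random

def normalize(v):
    """Project a nonzero vector onto the unit sphere."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / n for x in v]

def geodesic_distance(u, v):
    """Great-circle distance (angle in radians) between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error

def mean_direction(points):
    """Normalized vector sum of unit vectors (assumed centre measure).

    Returns None when the resultant is numerically zero, e.g. for
    antipodal points, because the mean direction is then undefined.
    """
    dim = len(points[0])
    total = [sum(p[i] for p in points) for i in range(dim)]
    if math.sqrt(sum(x * x for x in total)) < 1e-12:
        return None
    return normalize(total)

def spherical_kmeans(points, k, iters=100, seed=0):
    """k-means on the unit sphere with geodesic distance (illustrative)."""
    rng = random.Random(seed)
    pts = [normalize(p) for p in points]
    centers = rng.sample(pts, k)  # initial centres drawn from the data
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centre along the sphere.
        new_labels = [
            min(range(k), key=lambda j: geodesic_distance(p, centers[j]))
            for p in pts
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: recompute each centre as the mean direction.
        for j in range(k):
            members = [p for p, lab in zip(pts, labels) if lab == j]
            if members:
                center = mean_direction(members)
                if center is not None:
                    centers[j] = center  # otherwise keep the old centre
    return labels, centers

# Two well-separated groups: one near the north pole, one near the x-axis.
data = [
    [0.10, 0.05, 1.0], [0.00, 0.12, 1.0], [-0.10, -0.03, 1.0],
    [1.0, 0.10, 0.02], [1.0, -0.02, 0.10], [1.0, -0.10, -0.05],
]
labels, centers = spherical_kmeans(data, k=2)
```

With this toy input, the first three points and the last three points should end up in different clusters. Which cluster receives index 0 depends on the random initialization.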
- 1. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Le Cam, L.M., Neyman, J. (eds.) Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297 (1967)
- 3. UCI Machine Learning Information / The Machine Learning Database Repository, http://www.ics.uci.edu/~mlearn/