Abstract
This paper presents a comparative study of methods for clustering long-term temporal data. We split the clustering procedure into two processes: similarity computation and grouping. As similarity computation methods, we employed dynamic time warping (DTW) and multiscale matching. As grouping methods, we employed conventional agglomerative hierarchical clustering (AHC) and rough sets-based clustering (RC). Using various combinations of these methods, we performed clustering experiments on the hepatitis data set and evaluated the validity of the results. The results suggested that (1) the complete-linkage (CL) criterion outperformed the average-linkage (AL) criterion in terms of the interpretability of the dendrogram and the clustering results; (2) the combination of DTW and CL-AHC consistently produced interpretable results; (3) the combination of DTW and RC could be used to find the core sequences of the clusters; and (4) multiscale matching may suffer from the treatment of 'no-match' pairs, although this problem may be avoided by using RC as the subsequent grouping method.
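The two-stage split described above (similarity computation, then grouping) can be illustrated with a minimal sketch of the DTW plus CL-AHC combination. This is not the authors' implementation: the absolute-difference local cost, the unconstrained warping path, the function names `dtw_distance` and `complete_linkage_ahc`, and the toy sequences are all assumptions chosen for illustration.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW; the absolute-difference cost is an assumption."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def complete_linkage_ahc(sequences, num_clusters):
    """Stage 1: pairwise DTW dissimilarities. Stage 2: bottom-up merging."""
    dist = {}
    for i, j in combinations(range(len(sequences)), 2):
        dist[(i, j)] = dtw_distance(sequences[i], sequences[j])

    def linkage(c1, c2):
        # Complete linkage: distance between clusters is the largest pairwise distance.
        return max(dist[(min(p, q), max(p, q))] for p in c1 for q in c2)

    clusters = [[i] for i in range(len(sequences))]
    while len(clusters) > num_clusters:
        # Pick the pair of clusters with the smallest complete-linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return [sorted(c) for c in clusters]


# Toy sequences of unequal length (made-up values, not the hepatitis data).
toy = [
    [1, 2, 3, 4, 5],
    [1, 1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [5, 5, 4, 3, 2, 1],
]
clusters = complete_linkage_ahc(toy, 2)
```

Because the grouping stage only consumes the pairwise dissimilarity table, swapping in a different similarity measure or a different linkage criterion (for example, replacing `max` with a mean for average linkage) leaves the other stage untouched, which is what makes the method combinations in the study comparable.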
This work was supported in part by the Grant-in-Aid for Scientific Research on Priority Area (B)(No.759) “Implementation of Active Mining in the Era of Information Flood” by the Ministry of Education, Culture, Science and Technology of Japan.
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsumoto, S., Hirano, S. (2004). A Comparative Study of Clustering Methods for Long Time-Series Medical Databases. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2004. Lecture Notes in Computer Science, vol 3131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27774-3_25
DOI: https://doi.org/10.1007/978-3-540-27774-3_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22555-3
Online ISBN: 978-3-540-27774-3