Abstract
In many classification and data-mining applications, the user cannot know a priori which distance measure is most appropriate for the task at hand; this becomes clear only after examining the results. Moreover, in several cases different distance functions provide diverse but equally intuitive results, according to the specific focus of each measure. To address these issues, we elaborate on the construction of a hybrid index structure that supports query-by-example on both shape and structural distance measures, thereby lending enhanced exploratory power to the system user. The shape distance measure that the index supports is the ubiquitous Euclidean distance, while the structural distance measure is based on important periodic features extracted from a sequence. This new measure is phase-invariant and can provide flexible sequence characterizations, loosely resembling Dynamic Time Warping while requiring only a fraction of its computational cost. By exploiting the relationship between the Euclidean and periodic measures, the new hybrid index allows for powerful query processing, enabling the efficient answering of kNN queries on both measures in a single index scan. We envision that our system can provide a basis for fast tracking of correlated time-delayed events, with applications in data visualization, financial market analysis, machine monitoring/diagnostics and gene expression data analysis.
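To make the notion of phase invariance concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes that the "important periodic features" can be approximated by the magnitudes of the strongest Fourier coefficients of a sequence; the function names and the `top_k` parameter are hypothetical and chosen only for illustration. Because the magnitude of a Fourier coefficient does not change under a circular shift, a sequence and its shifted copy are far apart in Euclidean distance yet close under this illustrative periodic distance.

```python
import cmath
import math


def dft_magnitudes(seq):
    """Magnitudes of the DFT coefficients of seq (naive O(n^2) transform)."""
    n = len(seq)
    mags = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(seq))
        mags.append(abs(coeff) / math.sqrt(n))
    return mags


def keep_strongest(mags, top_k):
    """Zero out all but the top_k largest magnitudes.

    Only the non-redundant half of the spectrum is used and the DC term
    is skipped (an assumption made for this sketch).
    """
    half = mags[1:len(mags) // 2 + 1]
    order = sorted(range(len(half)), key=lambda i: half[i], reverse=True)
    keep = set(order[:top_k])
    return [m if i in keep else 0.0 for i, m in enumerate(half)]


def euclidean(a, b):
    """Plain Euclidean (shape) distance between equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def periodic_distance(a, b, top_k=1):
    """Illustrative structural distance on the strongest spectral magnitudes."""
    fa = keep_strongest(dft_magnitudes(a), top_k)
    fb = keep_strongest(dft_magnitudes(b), top_k)
    return euclidean(fa, fb)


# A sine wave with period 8 and a copy shifted by half a period.
n = 32
a = [math.sin(2 * math.pi * t / 8) for t in range(n)]
b = a[4:] + a[:4]

print("Euclidean distance:", euclidean(a, b))
print("Periodic distance: ", periodic_distance(a, b))
```

In this toy setting the shifted copy is the negation of the original, so the Euclidean distance is large while the magnitude-based distance is zero up to floating-point error. A practical implementation would use a fast Fourier transform rather than the quadratic loop shown here.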
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vlachos, M., Vagena, Z., Castelli, V., Yu, P.S. (2005). A Multi-metric Index for Euclidean and Periodic Matching. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds) Knowledge Discovery in Databases: PKDD 2005. Lecture Notes in Computer Science, vol 3721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564126_36
DOI: https://doi.org/10.1007/11564126_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29244-9
Online ISBN: 978-3-540-31665-7
eBook Packages: Computer Science (R0)