Abstract
Time series classification in the dissimilarity space combines the advantages of elastic dissimilarity functions such as the dynamic time warping distance and the rich mathematical structure of Euclidean spaces. We applied dimension reduction using PCA followed by support vector learning on dissimilarity representations to 42 UCR datasets. The results suggest that time series classification in dissimilarity space has the potential to complement the state-of-the-art, because on these datasets the SVM classifiers perform better, and with higher confidence, than the nearest-neighbor classifier based on the dynamic time warping distance.
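As a rough sketch of the dissimilarity-space idea described above (not the authors' implementation), the first step maps each time series to the vector of its DTW distances to a set of prototype series; dimension reduction with PCA and SVM training would then operate on these vectors. All function names below are illustrative.

```python
# Sketch of the dissimilarity-representation step (illustrative only).
# Each time series is mapped to its vector of DTW distances to a set of
# prototype series; PCA and SVM learning (not shown) act on these vectors.

def dtw(a, b):
    """Dynamic time warping distance between two univariate series,
    using squared pointwise costs and taking the square root at the end."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] ** 0.5

def dissimilarity_representation(series, prototypes):
    """Map each series to its vector of DTW distances to the prototypes."""
    return [[dtw(s, p) for p in prototypes] for s in series]
```

With the full training set used as the prototype set, each series of length n becomes a point in a Euclidean space of dimension equal to the training-set size, which is what makes a subsequent PCA step attractive.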
References
Batista, G.E., Wang, X., Keogh, E.J.: A complexity-invariant distance measure for time series. In: SIAM International Conference on Data Mining, vol. 11, pp. 699–710 (2011)
Boser, B.E., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pp. 144–152 (1992)
Bunke, H., Riesen, K.: Graph classification based on dissimilarity space embedding. In: da Vitoria Lobo, N., Kasparis, T., Roli, F., Kwok, J.T., Georgiopoulos, M., Anagnostopoulos, G.C., Loog, M. (eds.) Structural, Syntactic, and Statistical Pattern Recognition. LNCS, vol. 5342, pp. 996–1007. Springer, Heidelberg (2008)
Cao, L.J., Chua, K.S., Chong, W.K., Lee, H.P., Gu, Q.M.: A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine. Neurocomputing 55(1–2), 321–336 (2003)
Chen, Y., Garcia, E., Gupta, M., Rahimi, A., Cazzanti, L.: Similarity-based classification: concepts and algorithms. J. Mach. Learn. Res. 10, 747–776 (2009)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)
Duin, R., de Ridder, D., Tax, D.: Experiments with object based discriminant functions; a featureless approach to pattern recognition. Pattern Recogn. Lett. 18(11–13), 1159–1166 (1997)
Duin, R.P.W., Pekalska, E.: The dissimilarity space: bridging structural and statistical pattern recognition. Pattern Recogn. Lett. 33(7), 826–832 (2012)
Fu, T.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)
Geibel, P., Jain, B., Wysotzki, F.: SVM learning with the SH inner product. In: European Symposium on Artificial Neural Networks (2004)
Geurts, P.: Pattern extraction for time series classification. In: Siebes, A., De Raedt, L. (eds.) PKDD 2001. LNCS (LNAI), vol. 2168, pp. 115–127. Springer, Heidelberg (2001)
Graepel, T., Herbrich, R., Bollmann-Sdorra, P., Obermayer, K.: Classification on pairwise proximity data. In: Advances in Neural Information Processing Systems (1999)
Graepel, T., Herbrich, R., Schölkopf, B., Smola, A., Bartlett, P., Müller, K.-R., Obermayer, K., Williamson, R.: Classification on proximity data with LP-machines. In: International Conference on Artificial Neural Networks (1999)
Gudmundsson, S., Runarsson, T.P., Sigurdsson, S.: Support vector machines and dynamic time warping for time series. In: Joint Conference on Neural Networks (2008)
Haasdonk, B., Burkhardt, H.: Invariant kernels for pattern analysis and machine learning. Mach. Learn. 68, 35–61 (2007)
Hochreiter, S., Obermayer, K.: Support vector machines for dyadic data. Neural Comput. 18(6), 1472–1510 (2006)
Jain, B.J., Geibel, P., Wysotzki, F.: SVM learning with the Schur–Hadamard inner product for graphs. Neurocomputing 64, 93–105 (2005)
Jain, B.J., Spiegel, S.: Time series classification in dissimilarity spaces. In: Proceedings of the 1st International Workshop on Advanced Analytics and Learning on Temporal Data (2015)
Jain, B.J.: Generalized gradient learning on time series. Mach. Learn. 100(2), 587–608 (2015)
Kate, R.J.: Using dynamic time warping distances as features for improved time series classification. Data Min. Knowl. Discov. 30(2), 283–312 (2016)
Keogh, E., Zhu, Q., Hu, B., Hao, Y., Xi, X., Wei, L., Ratanamahatana, C.A.: The UCR Time Series Classification/Clustering Homepage (2011). www.cs.ucr.edu/~eamonn/time_series_data/
Laub, J., Müller, K.R.: Feature discovery in non-metric pairwise data. J. Mach. Learn. Res. 5, 801–818 (2004)
Lin, J., Keogh, E., Wei, L., Lonardi, S.: Experiencing SAX: a novel symbolic representation of time series. Data Min. Knowl. Discov. 15(2), 107–144 (2007)
Lines, J., Bagnall, A.: Time series classification with ensembles of elastic distance measures. Data Min. Knowl. Discov. 29(3), 565–592 (2015)
Livi, L., Rizzi, A., Sadeghian, A.: Optimized dissimilarity space embedding for labeled graphs. Inf. Sci. 266, 47–64 (2014)
Ong, C., Mary, X., Canu, S., Smola, A.J.: Learning with non-positive kernels. In: International Conference on Machine Learning (2004)
Pekalska, E., Duin, R.P.W.: The Dissimilarity Representation for Pattern Recognition. World Scientific, River Edge (2005)
Pekalska, E., Duin, R.P.W., Paclik, P.: Prototype selection for dissimilarity-based classifiers. Pattern Recogn. 39(2), 189–208 (2006)
Petitjean, F., Ketterlin, A., Gançarski, P.: A global averaging method for dynamic time warping, with applications to clustering. Pattern Recogn. 44(3), 678–693 (2011)
Riesen, K., Neuhaus, M., Bunke, H.: Graph embedding in vector spaces by means of prototype selection. In: Escolano, F., Vento, M. (eds.) GbRPR. LNCS, vol. 4538, pp. 383–393. Springer, Heidelberg (2007)
Riesen, K., Bunke, H.: Graph classification based on vector space embedding. Int. J. Pattern Recogn. Artif. Intell. 23(6), 1053–1081 (2009)
Spillmann, B., Neuhaus, M., Bunke, H., Pękalska, E., Duin, R.P.W.: Transforming strings to vector spaces using prototype selection. In: Yeung, D.-Y., Kwok, J.T., Fred, A., Roli, F., de Ridder, D. (eds.) Structural, Syntactic, and Statistical Pattern Recognition. LNCS, vol. 4109, pp. 287–296. Springer, Heidelberg (2006)
Subasi, A., Gursoy, M.I.: EEG signal classification using PCA, ICA, LDA and support vector machines. Expert Syst. Appl. 37(12), 8659–8666 (2010)
Xi, X., Keogh, E., Shelton, C., Wei, L., Ratanamahatana, C.A.: Fast time series classification using numerosity reduction. In: International Conference on Machine Learning (2006)
Xing, Z., Pei, J., Keogh, E.: A brief survey on sequence classification. ACM SIGKDD Explor. Newslett. 12(1), 40–48 (2010)
Acknowledgements
B. Jain was funded by the DFG Sachbeihilfe JA 2109/4-1.
A Performance Profiles
Performance profiles were introduced by Dolan and Moré to compare the efficiency of algorithms [7]. Here, we use performance profiles to compare differences in the classification accuracy of a collection of classifiers on a set of classification problems. The comparison is summarized by one curve per classifier, which is easier to read than a table of classification accuracies.
To define a performance profile, we assume that \(\mathbb {C}\) is a set of classifiers to be compared and \(\mathbb {P}\) is the set of all classification problems. For each classification problem \(p \in \mathbb {P}\) and each classifier \(c \in \mathbb {C}\), we write \(\rho _{c,p}\) for the performance of classifier c for problem p, measured by its classification accuracy. In performance profiles, we do not consider the absolute performance of a classifier in terms of its classification accuracy, but its relative performance with respect to the best performing classifier. The classifier with the best performance on problem p has classification accuracy
\[\rho _p^* = \max _{c \in \mathbb {C}} \rho _{c,p}.\]
Then the relative performance of classifier c on problem p is given by
\[r_{c,p} = \frac{\rho _p^* - \rho _{c,p}}{\rho _p^*}.\]
The relative performance \(r_{c, p}\) takes values in the interval [0, 1]: the better a classifier performs on a given problem, the lower its relative performance. Moreover, from
\[\rho _{c,p} = \left(1 - r_{c,p}\right) \rho _p^*\]
it follows that \(r_{c,p}\) is the factor by which the classification accuracy \(\rho _{c,p}\) deviates from the best classification accuracy \(\rho _p^*\).
Finally, the performance profile of classifier \(c \in \mathbb {C}\) over all problems \(p \in \mathbb {P}\) is the empirical cumulative distribution function
\[P_c(\tau ) = \frac{1}{|\mathbb {P}|}\,\Big |\big \{p \in \mathbb {P} \,:\, r_{c,p} \le \tau \big \}\Big |.\]
It is sufficient to keep three facts in mind to interpret performance profiles:
1. The value \(P_c(0)\) is the fraction of problems on which classifier c is best.
2. \(P_c(\tau )\) is the fraction of problems on which the performance of classifier c deviates at most by factor \(\tau \) from the best performance.
3. \(\tau _{\max }\) with \(P_c(\tau _{\max }) = 1\) is the maximum factor by which classifier c deviates from the best performance.
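The definitions above can be sketched as a short computation. The accuracy values below are hypothetical and for illustration only; the classifier and problem names are not from the paper.

```python
# Sketch: computing performance profiles from a table of classification
# accuracies (hypothetical numbers, for illustration only).

accuracies = {                      # accuracies[classifier][problem]
    "svm-pca": {"p1": 0.95, "p2": 0.80, "p3": 0.90},
    "1nn-dtw": {"p1": 0.90, "p2": 0.85, "p3": 0.90},
}
problems = ["p1", "p2", "p3"]

# Best accuracy per problem: rho*_p = max_c rho_{c,p}.
best = {p: max(acc[p] for acc in accuracies.values()) for p in problems}

# Relative performance r_{c,p} = (rho*_p - rho_{c,p}) / rho*_p,
# in [0, 1], lower is better.
rel = {c: {p: (best[p] - acc[p]) / best[p] for p in problems}
       for c, acc in accuracies.items()}

def profile(c, tau):
    """P_c(tau): fraction of problems with r_{c,p} <= tau."""
    return sum(rel[c][p] <= tau for p in problems) / len(problems)
```

Evaluating `profile(c, 0.0)` gives the fraction of problems on which classifier c is best (ties included), and `profile(c, tau)` traces out the curve plotted in a performance profile as tau grows.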
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Jain, B., Spiegel, S. (2016). Dimension Reduction in Dissimilarity Spaces for Time Series Classification. In: Douzal-Chouakria, A., Vilar, J., Marteau, P.F. (eds) Advanced Analysis and Learning on Temporal Data. AALTD 2015. Lecture Notes in Computer Science, vol 9785. Springer, Cham. https://doi.org/10.1007/978-3-319-44412-3_3
DOI: https://doi.org/10.1007/978-3-319-44412-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44411-6
Online ISBN: 978-3-319-44412-3