Abstract
In this paper, we propose discretization-based schemes to preserve privacy in time series data mining. Traditional research on preserving privacy in data mining focuses on time-invariant privacy issues. With the emergence of time series data mining, traditional snapshot-based privacy issues need to be extended to be multi-dimensional with the addition of time dimension. In this paper, we defined three threat models based on trust relationship between the data miner and data providers. We propose three different schemes for these three threat models. The proposed schemes are extensively evaluated against public-available time series data sets [1]. Our experiments show that proposed schemes can preserve privacy with cost of reduction in mining accuracy. For most data sets, proposed schemes can achieve low privacy leakage with slight reduction in classification accuracy. We also studied effect of parameters of proposed schemes in this paper.
This work was partly supported by the National Science Foundation under Grant No. CNS-0716527. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Keogh, E., Xi, X., Wei, L., Ratanamahatana, C.A.: The ucr time series classification/clustering homepage (2006), http://www.cs.ucr.edu/~eamonn/time_series_data/
Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: SIGMOD Conference, pp. 439–450 (2000)
Evfimievski, A.V., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. In: SIGKDD, pp. 217–228 (2002)
Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: KDD 2002: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 279–288. ACM, New York (2002)
Zhang, N., Zhao, W.: Privacy-preserving data mining systems. Computer 40(4), 52–58 (2007)
Bayardo, R.J., Agrawal, R.: Data privacy through optimal k-anonymization. In: ICDE 2005: Proceedings of the 21st International Conference on Data Engineering, Washington, DC, USA, pp. 217–228. IEEE Computer Society, Los Alamitos (2005)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Workload-aware anonymization. In: KDD 2006: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 277–286. ACM, New York (2006)
Fung, B.C.M., Wang, K.: Anonymizing classification data for privacy preservation. IEEE Trans. on Knowl. and Data Eng. 19(5), 711–725 (2007); Fellow-Philip S. Yu
Du, W., Zhan, Z.: Using randomized response techniques for privacy-preserving data mining. In: SIGKDD, pp. 505–510 (2003)
Huang, Z., Du, W., Chen, B.: Deriving private information from randomized data. In: SIGMOD Conference, pp. 37–48 (2005)
Zhu, Y., Liu, L.: Optimal randomization for privacy preserving data mining. In: SIGKDD, pp. 761–766 (2004)
Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)
Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: SIGKDD, pp. 639–644 (2002)
Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans. Knowl. Data Eng. 16(9) (2004)
Vaidya, J., Clifton, C.: Privacy-preserving k-means clustering over vertically partitioned data. In: SIGKDD, pp. 206–215 (2003)
Jagannathan, G., Wright, R.N.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: SIGKDD, pp. 593–599 (2005)
Wright, R.N., Yang, Z.: Privacy-preserving bayesian network structure computation on distributed heterogeneous data. In: SIGKDD, pp. 713–718 (2004)
da Silva, J.C., Klusch, M.: Privacy-preserving discovery of frequent patterns in time series. In: Industrial Conference on Data Mining, pp. 318–328 (2007)
Papadimitriou, S., Li, F., Kollios, G., Yu, P.S.: Time series compressibility and privacy. In: VLDB 2007: Proceedings of the 33rd international conference on Very large data bases, pp. 459–470. VLDB Endowment (2007)
Zhu, Y., Fu, Y., Fu, H.: On privacy in time series data mining. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS, vol. 5012, pp. 479–493. Springer, Heidelberg (2008)
Cover, T.M., Thomas, J.A.: Elements of information theory. Wiley-Interscience, New York (1991)
Ingber, L.: Adaptive simulated annealing (asa). Technical report, Pasadena, CA (1993)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, Y., Fu, Y., Fu, H. (2009). Preserving Privacy in Time Series Data Classification by Discretization. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science(), vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_5
Download citation
DOI: https://doi.org/10.1007/978-3-642-03070-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3
eBook Packages: Computer ScienceComputer Science (R0)