Preserving Privacy in Time Series Data Classification by Discretization

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5632)

Abstract

In this paper, we propose discretization-based schemes to preserve privacy in time series data mining. Traditional research on privacy-preserving data mining focuses on time-invariant privacy issues. With the emergence of time series data mining, these snapshot-based privacy issues need to be extended to multiple dimensions by adding the time dimension. We define three threat models based on the trust relationship between the data miner and the data providers, and we propose a different scheme for each of the three threat models. The proposed schemes are extensively evaluated on publicly available time series data sets [1]. Our experiments show that the proposed schemes preserve privacy at the cost of reduced mining accuracy. For most data sets, the proposed schemes achieve low privacy leakage with only a slight reduction in classification accuracy. We also study the effect of the schemes' parameters.
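
As a rough illustration of what discretizing a time series means in general, the sketch below maps each sample to one of a small number of levels. It is not one of the schemes proposed in this paper, whose details the abstract does not give: the function name `discretize`, the choice of equal-width bins, and the bin count are all assumptions made only for this example.

```python
from typing import List, Sequence


def discretize(series: Sequence[float], num_bins: int) -> List[int]:
    """Map each sample to an equal-width bin index in [0, num_bins - 1].

    Hypothetical helper for illustration; equal-width binning is an
    assumption, not the paper's method.
    """
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")
    if not series:
        return []
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries a single level.
        return [0] * len(series)
    width = (hi - lo) / num_bins
    # Clamp the index so the maximum value falls into the last bin.
    return [min(int((x - lo) / width), num_bins - 1) for x in series]


series = [0.0, 0.5, 1.0, 1.5, 2.0, 4.0]
levels = discretize(series, num_bins=4)
print(levels)  # [0, 0, 1, 1, 2, 3]
```

In a sketch like this, a smaller `num_bins` hides more of the original values but also leaves a classifier less to work with, which is the kind of privacy versus accuracy trade-off the abstract reports.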

This work was partly supported by the National Science Foundation under Grant No. CNS-0716527. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

References

  1. Keogh, E., Xi, X., Wei, L., Ratanamahatana, C.A.: The UCR time series classification/clustering homepage (2006), http://www.cs.ucr.edu/~eamonn/time_series_data/

  2. Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: SIGMOD Conference, pp. 439–450 (2000)

  3. Evfimievski, A.V., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. In: SIGKDD, pp. 217–228 (2002)

  4. Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: KDD 2002: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 279–288. ACM, New York (2002)

  5. Zhang, N., Zhao, W.: Privacy-preserving data mining systems. Computer 40(4), 52–58 (2007)

  6. Bayardo, R.J., Agrawal, R.: Data privacy through optimal k-anonymization. In: ICDE 2005: Proceedings of the 21st International Conference on Data Engineering, Washington, DC, USA, pp. 217–228. IEEE Computer Society, Los Alamitos (2005)

  7. LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Workload-aware anonymization. In: KDD 2006: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 277–286. ACM, New York (2006)

  8. Fung, B.C.M., Wang, K., Yu, P.S.: Anonymizing classification data for privacy preservation. IEEE Trans. on Knowl. and Data Eng. 19(5), 711–725 (2007)

  9. Du, W., Zhan, Z.: Using randomized response techniques for privacy-preserving data mining. In: SIGKDD, pp. 505–510 (2003)

  10. Huang, Z., Du, W., Chen, B.: Deriving private information from randomized data. In: SIGMOD Conference, pp. 37–48 (2005)

  11. Zhu, Y., Liu, L.: Optimal randomization for privacy preserving data mining. In: SIGKDD, pp. 761–766 (2004)

  12. Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)

  13. Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: SIGKDD, pp. 639–644 (2002)

  14. Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans. Knowl. Data Eng. 16(9) (2004)

  15. Vaidya, J., Clifton, C.: Privacy-preserving k-means clustering over vertically partitioned data. In: SIGKDD, pp. 206–215 (2003)

  16. Jagannathan, G., Wright, R.N.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: SIGKDD, pp. 593–599 (2005)

  17. Wright, R.N., Yang, Z.: Privacy-preserving Bayesian network structure computation on distributed heterogeneous data. In: SIGKDD, pp. 713–718 (2004)

  18. da Silva, J.C., Klusch, M.: Privacy-preserving discovery of frequent patterns in time series. In: Industrial Conference on Data Mining, pp. 318–328 (2007)

  19. Papadimitriou, S., Li, F., Kollios, G., Yu, P.S.: Time series compressibility and privacy. In: VLDB 2007: Proceedings of the 33rd International Conference on Very Large Data Bases, pp. 459–470. VLDB Endowment (2007)

  20. Zhu, Y., Fu, Y., Fu, H.: On privacy in time series data mining. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS, vol. 5012, pp. 479–493. Springer, Heidelberg (2008)

  21. Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley-Interscience, New York (1991)

  22. Ingber, L.: Adaptive simulated annealing (ASA). Technical report, Pasadena, CA (1993)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhu, Y., Fu, Y., Fu, H. (2009). Preserving Privacy in Time Series Data Classification by Discretization. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_5

  • DOI: https://doi.org/10.1007/978-3-642-03070-3_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03069-7

  • Online ISBN: 978-3-642-03070-3

  • eBook Packages: Computer Science, Computer Science (R0)
