From Collaborative to Privacy-Preserving Sequential Pattern Mining

Privacy and Anonymity in Information Management Systems

Abstract

Research on privacy-preserving techniques in databases, and subsequently on privacy-enhancing technologies, has grown explosively in recent years. This escalation has been fueled primarily by the growing mistrust of individuals toward organizations that collect and disburse their personally identifiable information (PII). Digital repositories have become increasingly susceptible to intentional or unintentional abuse, leaving organizations liable under the privacy legislation that governments the world over are increasingly adopting. These privacy concerns have necessitated new advances in the field of distributed data mining, wherein collaborating parties may be legally bound not to reveal the private information of their customers. In this chapter, we first present the sequential pattern discovery problem in a collaborative framework and subsequently enhance the architecture by introducing the context of privacy. Thus we propose to extract sequential patterns from distributed databases while preserving privacy. A salient feature of the proposal is its flexibility, which makes it more pertinent to mining operations for real-world applications in terms of efficiency and functionality. Furthermore, under some reasonable assumptions, we prove that the architecture and protocol employed by our algorithm for multi-party computation are secure. Finally, we conclude with some current research trends in the field.

Author information

Corresponding author

Correspondence to Vishal Kapoor.

Copyright information

© 2010 Springer London

About this chapter

Cite this chapter

Kapoor, V., Poncelet, P., Trousset, F., Teisseire, M. (2010). From Collaborative to Privacy-Preserving Sequential Pattern Mining. In: Nin, J., Herranz, J. (eds) Privacy and Anonymity in Information Management Systems. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84996-238-4_7

  • DOI: https://doi.org/10.1007/978-1-84996-238-4_7

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84996-237-7

  • Online ISBN: 978-1-84996-238-4

  • eBook Packages: Computer Science (R0)
