Abstract
Research on privacy-preserving techniques in databases, and subsequently on privacy-enhancing technologies, has grown explosively in recent years. This growth has been fueled primarily by individuals' increasing mistrust of organizations that collect and disseminate their personally identifiable information (PII). Digital repositories have become increasingly susceptible to intentional or unintentional abuse, leaving organizations liable under the privacy legislation that governments the world over are increasingly adopting. These privacy concerns have necessitated new advances in distributed data mining, where collaborating parties may be legally bound not to reveal their customers' private information. In this chapter, we first present the sequential pattern discovery problem in a collaborative framework and then extend the architecture by introducing the context of privacy. We thus propose to extract sequential patterns from distributed databases while preserving privacy. A salient feature of the proposal is its flexibility, which makes it more pertinent to mining operations in real-world applications in terms of efficiency and functionality. Furthermore, under some reasonable assumptions, we prove that the architecture and protocol employed by our algorithm for multi-party computation are secure. Finally, we conclude with some trends in current research in the field.
Copyright information
© 2010 Springer London
About this chapter
Cite this chapter
Kapoor, V., Poncelet, P., Trousset, F., Teisseire, M. (2010). From Collaborative to Privacy-Preserving Sequential Pattern Mining. In: Nin, J., Herranz, J. (eds) Privacy and Anonymity in Information Management Systems. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84996-238-4_7
DOI: https://doi.org/10.1007/978-1-84996-238-4_7
Publisher Name: Springer, London
Print ISBN: 978-1-84996-237-7
Online ISBN: 978-1-84996-238-4
eBook Packages: Computer Science, Computer Science (R0)