A Survey of Privacy-Preserving Methods Across Vertically Partitioned Data

Vaidya, Jaideep

doi:10.1007/978-0-387-70992-5_14

Part of the book series: Advances in Database Systems (ADBS, volume 34)

The goal of data mining is to extract or “mine” knowledge from large amounts of data. However, the data is often collected by several different sites. Privacy, legal, and commercial concerns restrict centralized access to this data and can derail data mining projects. This problem has recently received growing attention, and several algorithms have been proposed that perform distributed knowledge discovery while guaranteeing that the underlying data is not disclosed. Vertical partitioning is an important data distribution model that is often found in real life. Vertical partitioning, or heterogeneous distribution, means that different features of the same set of entities are collected by different sites. In this chapter we survey some of the methods developed in the literature to mine vertically partitioned data without violating privacy, and we discuss the challenges and complexities specific to vertical partitioning.
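
As an illustration of the distribution model described above, the following is a minimal sketch, not taken from the chapter, of how one set of records might be vertically partitioned between two sites. The site names (hospital, retailer), the feature names, the helper functions, and the toy values are all hypothetical. The final step is a deliberately non-private baseline: the sites compare record identifiers in the clear, which is exactly the kind of disclosure that the methods surveyed in the chapter aim to avoid.

```python
# Toy records about the same three entities (all values are made up).
records = [
    {"id": 1, "age": 34, "diagnosis": "flu", "purchase": "tea"},
    {"id": 2, "age": 51, "diagnosis": "asthma", "purchase": "inhaler"},
    {"id": 3, "age": 27, "diagnosis": "flu", "purchase": "soup"},
]


def vertical_partition(rows, key, site_features):
    """Split rows by feature: each site keeps the key plus its own columns."""
    return {
        site: [{key: row[key], **{f: row[f] for f in features}} for row in rows]
        for site, features in site_features.items()
    }


def local_matches(site_rows, key, feature, value):
    """Return the keys of the rows at one site whose feature equals value."""
    return {row[key] for row in site_rows if row[feature] == value}


# Hypothetical sites: each one sees every entity but only some features.
sites = vertical_partition(
    records,
    key="id",
    site_features={"hospital": ["age", "diagnosis"], "retailer": ["purchase"]},
)

# Non-private baseline: each site finds its matching ids locally, then the
# id sets are intersected in the clear to count co-occurrences.
flu_ids = local_matches(sites["hospital"], "id", "diagnosis", "flu")
tea_ids = local_matches(sites["retailer"], "id", "purchase", "tea")
support = len(flu_ids & tea_ids)

print(sites["retailer"])
print("records with diagnosis=flu and purchase=tea:", support)
```

Note that neither site can evaluate a pattern spanning both feature sets on its own. Any cross-site statistic requires some joint computation over the shared identifiers, which is where the privacy question arises.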

Author information

Authors and Affiliations

MSIS Department and CIMIC, Rutgers University, Clarion, PA, USA
Jaideep Vaidya

Editor information

Editors and Affiliations

IBM Thomas J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA
Charu C. Aggarwal
Department of Computer Science, University of Illinois at Chicago, 854 South Morgan Street, 60607-7053, Chicago, IL, USA
Philip S. Yu

About this chapter

Cite this chapter

Vaidya, J. (2008). A Survey of Privacy-Preserving Methods Across Vertically Partitioned Data. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_14

DOI: https://doi.org/10.1007/978-0-387-70992-5_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-70991-8
Online ISBN: 978-0-387-70992-5
eBook Packages: Computer Science, Computer Science (R0)
