Research Article

A Fast and Efficient Privacy Preserving Data Mining Over Vertically Partitioned Data

by P. S. Annakkodi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 76 - Number 2
Year of Publication: 2013
Authors: P. S. Annakkodi
10.5120/13218-0614

P. S. Annakkodi. A Fast and Efficient Privacy Preserving Data Mining Over Vertically Partitioned Data. International Journal of Computer Applications. 76, 2 (August 2013), 18-22. DOI=10.5120/13218-0614

@article{ 10.5120/13218-0614,
author = { P. S. Annakkodi },
title = { A Fast and Efficient Privacy Preserving Data Mining Over Vertically Partitioned Data },
journal = { International Journal of Computer Applications },
issue_date = { August 2013 },
volume = { 76 },
number = { 2 },
month = { August },
year = { 2013 },
issn = { 0975-8887 },
pages = { 18-22 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume76/number2/13218-0614/ },
doi = { 10.5120/13218-0614 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:45:20.633746+05:30
%A P. S. Annakkodi
%T A Fast and Efficient Privacy Preserving Data Mining Over Vertically Partitioned Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 76
%N 2
%P 18-22
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The goal of data mining is to extract or "mine" knowledge from large amounts of data. However, data is often collected by several different sites, and privacy, legal, and commercial concerns restrict centralized access to it. Theoretical results from secure multiparty computation in cryptography prove that, assuming the existence of trapdoor permutations, one may provide secure protocols for any two-party computation as well as for any multiparty computation with an honest majority. However, these general methods are far too inefficient and impractical for computing complex functions on inputs consisting of large data sets. What remains open is to come up with a set of techniques that achieve this efficiently within a quantifiable security framework. The distributed data model considered is the heterogeneous database scenario, in which different features of the same set of data are collected by different sites (vertically partitioned data). This paper argues that efficient and practical techniques for useful privacy-preserving mining of knowledge from large amounts of data are indeed possible. It presents several privacy-preserving data mining algorithms that operate over vertically partitioned data, along with the underlying techniques that solve independent sub-problems. Together, these enable the secure "mining" of knowledge.
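To make the vertically partitioned setting concrete, the following is a minimal sketch and not the paper's own protocol, which the abstract does not spell out. It assumes two sites that each hold one binary attribute for the same six records, so the support count of the cross-site itemset equals the scalar product of the two columns. The sketch computes that product with a commodity-server style masking scheme in the semi-honest model. The attribute names, the modulus `MOD`, and the helper names (`commodity_server`, `secure_dot_shares`) are all illustrative assumptions, and both parties are simulated in one process.

```python
import secrets

# Illustrative modulus (assumption); it must exceed any true dot product.
MOD = 2**61 - 1


def rand_vec(n):
    """Vector of n uniformly random residues modulo MOD."""
    return [secrets.randbelow(MOD) for _ in range(n)]


def dot(x, y):
    """Scalar product modulo MOD."""
    return sum(a * b for a, b in zip(x, y)) % MOD


def commodity_server(n):
    """Hypothetical helper party: hands out correlated randomness and
    never sees either site's data. Assumed not to collude with a site."""
    Ra, Rb = rand_vec(n), rand_vec(n)
    ra = secrets.randbelow(MOD)
    rb = (dot(Ra, Rb) - ra) % MOD  # ensures ra + rb == Ra . Rb (mod MOD)
    return (Ra, ra), (Rb, rb)


def secure_dot_shares(a, b):
    """Simulate both sites; return shares (v1, v2) with
    v1 + v2 == a . b (mod MOD). Site A ends with v1, site B with v2."""
    (Ra, ra), (Rb, rb) = commodity_server(len(a))
    a_masked = [(x + r) % MOD for x, r in zip(a, Ra)]  # sent A -> B
    b_masked = [(y + r) % MOD for y, r in zip(b, Rb)]  # sent B -> A
    v2 = secrets.randbelow(MOD)                        # B's random share
    u = (dot(a_masked, b) + rb - v2) % MOD             # sent B -> A
    v1 = (u - dot(Ra, b_masked) + ra) % MOD            # A's share
    return v1, v2


# Same six records (row IDs 0..5), different attributes at each site.
site_a_col = [1, 0, 1, 1, 0, 1]  # e.g. "bought bread", held only by site A
site_b_col = [1, 1, 0, 1, 0, 1]  # e.g. "bought milk", held only by site B

v1, v2 = secure_dot_shares(site_a_col, site_b_col)
support = (v1 + v2) % MOD  # records where both attributes are 1
print(support)  # 3
```

Each site only ever receives the other's column after it has been masked with a uniformly random vector, and the result stays split into two random-looking shares until the sites agree to combine them. A real deployment would also need secure channels, an agreed record ordering, and a security argument for the chosen threat model, none of which this sketch addresses.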

Index Terms

Computer Science
Information Sciences

Keywords

Vertical partitioning, Distributed Data Mining (DDM)