Research Article

K-Anonymization using Multidimensional Suppression for Data De-identification

by Snehal M. Nargundi, Rashmi Phalnikar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 60 - Number 11
Year of Publication: 2012
Authors: Snehal M. Nargundi, Rashmi Phalnikar
DOI: 10.5120/9740-4291

Snehal M. Nargundi, Rashmi Phalnikar. K-Anonymization using Multidimensional Suppression for Data De-identification. International Journal of Computer Applications. 60, 11 (December 2012), 38-42. DOI=10.5120/9740-4291

@article{ 10.5120/9740-4291,
author = { Snehal M. Nargundi and Rashmi Phalnikar },
title = { K-Anonymization using Multidimensional Suppression for Data De-identification },
journal = { International Journal of Computer Applications },
issue_date = { December 2012 },
volume = { 60 },
number = { 11 },
month = { December },
year = { 2012 },
issn = { 0975-8887 },
pages = { 38-42 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume60/number11/9740-4291/ },
doi = { 10.5120/9740-4291 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:06:19.729897+05:30
%A Snehal M. Nargundi
%A Rashmi Phalnikar
%T K-Anonymization using Multidimensional Suppression for Data De-identification
%J International Journal of Computer Applications
%@ 0975-8887
%V 60
%N 11
%P 38-42
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

As search methods have advanced, the risk of privacy disclosure has increased, making it important to protect user privacy during data publishing. Many algorithms used for data de-identification are not effective, because the resulting dataset can easily be linked with public databases, revealing users' identities. One method for protecting user privacy is to apply anonymization algorithms. TDS (Top-Down Specialization) and TDR (Top-Down Refinement) use generalization to anonymize the dataset. A major drawback of these algorithms is that they require a manually generated domain hierarchy taxonomy for every quasi-identifier in the dataset on which k-anonymity is to be performed. Therefore, in this paper we propose a new approach that uses a suppression-based k-anonymization method to allow data publishers to de-identify datasets. In this method, only certain attributes of a record are suppressed, based on the values of other attributes. Because suppression is used, the algorithm does not require a manually created taxonomy tree of quasi-identifiers. We applied this algorithm to three different datasets to evaluate its accuracy compared with other k-anonymity generalization algorithms. We found that the predictive performance of this algorithm is better than that of existing generalization methods. This method is expected to provide privacy and accuracy measures to data publishers.
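
To make the idea of suppression-based k-anonymity concrete, the following minimal Python sketch checks whether a table is k-anonymous and then suppresses quasi-identifier values only in the records that violate the constraint. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the attribute names, the sample records, the `*` placeholder, and the rule of suppressing the last-listed quasi-identifier first are all hypothetical choices made for this example.

```python
from collections import Counter

SUPPRESSED = "*"  # assumed placeholder for a suppressed value


def qi_key(record, quasi_identifiers):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[q] for q in quasi_identifiers)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(qi_key(r, quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


def suppress_until_k_anonymous(records, quasi_identifiers, k):
    """Naive greedy suppression (hypothetical rule, not the paper's method).

    Attributes are suppressed one at a time, last-listed first, and only
    in records whose quasi-identifier combination is shared by fewer than
    k records. Records already in large enough groups are left untouched.
    The result may still fail the check if too few records remain in a
    group after every quasi-identifier has been suppressed.
    """
    out = [dict(r) for r in records]  # work on copies, keep the input intact
    for attr in reversed(quasi_identifiers):
        counts = Counter(qi_key(r, quasi_identifiers) for r in out)
        small = [r for r in out if counts[qi_key(r, quasi_identifiers)] < k]
        if not small:
            break
        for r in small:
            r[attr] = SUPPRESSED
    return out


# Hypothetical sample data: "age" and "zip" are quasi-identifiers,
# "disease" is the sensitive attribute and is never modified.
records = [
    {"age": 34, "zip": "411001", "disease": "flu"},
    {"age": 34, "zip": "411002", "disease": "cold"},
    {"age": 51, "zip": "411038", "disease": "flu"},
    {"age": 51, "zip": "411038", "disease": "asthma"},
]
quasi_identifiers = ["age", "zip"]

anonymized = suppress_until_k_anonymous(records, quasi_identifiers, k=2)
for row in anonymized:
    print(row)
```

In this sketch only the two records with a unique `zip` value lose that value, while the records that already share a quasi-identifier combination are published unchanged. No taxonomy tree is needed, because values are either kept or replaced by the placeholder rather than generalized to a parent category.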

Index Terms

Computer Science
Information Sciences

Keywords

Privacy Preservation, Data Mining, Data De-identification, PPDM, k-Anonymization, Suppression