Research Article

Post Anonymization Techniques in Privacy Preserved Data Mining

by A. K. Ilavarasi, D. Jeniffa, B. Sathiyabhama
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 46 - Number 24
Year of Publication: 2012
Authors: A. K. Ilavarasi, D. Jeniffa, B. Sathiyabhama
10.5120/7124-9735

A. K. Ilavarasi, D. Jeniffa, B. Sathiyabhama. Post Anonymization Techniques in Privacy Preserved Data Mining. International Journal of Computer Applications. 46, 24 (May 2012), 23-28. DOI=10.5120/7124-9735

@article{ 10.5120/7124-9735,
author = { A. K. Ilavarasi and D. Jeniffa and B. Sathiyabhama },
title = { Post Anonymization Techniques in Privacy Preserved Data Mining },
journal = { International Journal of Computer Applications },
issue_date = { May 2012 },
volume = { 46 },
number = { 24 },
month = { May },
year = { 2012 },
issn = { 0975-8887 },
pages = { 23-28 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume46/number24/7124-9735/ },
doi = { 10.5120/7124-9735 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:40:30.732365+05:30
%A A. K. Ilavarasi
%A D. Jeniffa
%A B. Sathiyabhama
%T Post Anonymization Techniques in Privacy Preserved Data Mining
%J International Journal of Computer Applications
%@ 0975-8887
%V 46
%N 24
%P 23-28
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Privacy-preserving data mining is concerned with protecting privacy while retaining the utility of the data. Privacy becomes a key concern when medical data is published for research purposes. To guard against security breaches, anonymization techniques can be used to transform the dataset into less specific values before publishing. Privacy preservation, however, may reduce the utility of the data, and classification helps to recover that utility on the anonymized data. We propose a model in which a multi-decision tree classifier is built on the anonymized dataset to improve utility. The multi-decision tree classifier consists of an Improved ID3-based AdaBoost classifier. The proposed approach differs from earlier work in two respects: the classifier is a multi-decision tree rather than a single tree, and it is constructed on the anonymized dataset. The approach is shown to outperform a pure decision tree classifier: the multi-decision tree classifier achieves higher accuracy and a shorter training time than the standard ID3-based AdaBoost classifier.
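
As an illustration of what "transforming the dataset into less specific values" can mean in practice, the following minimal Python sketch generalizes two quasi-identifiers (age and ZIP code) and reports the size of the smallest group of records that share the same generalized values, in the sense of k-anonymity. The records, the function names `generalize` and `smallest_group`, and the parameters `age_width` and `zip_digits` are all hypothetical and chosen for illustration; this is not the authors' algorithm or data.

```python
from collections import Counter

# Hypothetical toy records: (age, zip_code, diagnosis). Not data from the paper.
RECORDS = [
    (23, "47677", "flu"),
    (27, "47602", "asthma"),
    (29, "47678", "flu"),
    (41, "47905", "diabetes"),
    (45, "47909", "flu"),
    (48, "47906", "asthma"),
]


def generalize(record, age_width=10, zip_digits=3):
    """Replace the quasi-identifiers of one record with less specific values."""
    age, zip_code, diagnosis = record
    # Map the exact age to a fixed-width range, e.g. 23 -> "20-29".
    low = (age // age_width) * age_width
    age_range = f"{low}-{low + age_width - 1}"
    # Keep only the leading digits of the ZIP code and mask the rest.
    zip_prefix = zip_code[:zip_digits] + "*" * (len(zip_code) - zip_digits)
    # The sensitive attribute (diagnosis) is left unchanged.
    return (age_range, zip_prefix, diagnosis)


def smallest_group(rows):
    """Return the size of the smallest group sharing the same (age, zip) values."""
    counts = Counter((age, zip_code) for age, zip_code, _ in rows)
    return min(counts.values())


anonymized = [generalize(r) for r in RECORDS]

print("k before generalization:", smallest_group(RECORDS))
print("k after generalization: ", smallest_group(anonymized))
for row in anonymized:
    print(row)
```

On this toy input every raw record is unique on its quasi-identifiers, while after generalization each record shares its values with two others. A classifier such as the one described in the abstract would then be trained on the generalized rows rather than on the raw ones.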

References
  1. Duncan, G. T., and Lambert, D. 1986. Disclosure-limited data dissemination. In: Journal of the American Statistical Association, pp. 10-28.
  2. Sweeney, L. 1997. Guaranteeing anonymity when sharing medical data, the Datafly system. In: Proceedings of the American Medical Informatics Association, Annual Symposium.
  3. Rubner, Y., Tomasi, C., and Guibas, L. J. 2000. The earth mover's distance as a metric for image retrieval. In: International Journal of Computer Vision.
  4. Bickel, P., and Doksum, K. 2000. Mathematical Statistics-Basic Ideas and Selected Topics, second edition, Prentice Hall.
  5. Sweeney, L. 2002. Achieving k-anonymity privacy protection using generalization and suppression. In: International Journal of Uncertainty, Fuzziness, and knowledge-based Systems, vol. 10, pp. 571-588, IEEE.
  6. Zhang, J., Kang, D. K., Silvescu, A., and Honavar, V. 2006. Learning accurate and concise naive bayes classifiers from attribute value taxonomies and data. In: Knowl. Inf. Syst., vol. 9, pp. 157-179.
  7. Machanavajjhala, A., Gehrke, J., Kifer, D., and Venkitasubramaniam, M. 2006. l-diversity: Privacy beyond k-anonymity. In: ICDE '06, Atlanta, GA, USA.
  8. Ninghui, Li., Tiancheng, Li., and Venkatasubramanian, S. 2007. t-closeness: Privacy beyond k-anonymity and l-diversity. In: ICDE '07, pp. 106-115, Istanbul, Turkey.
  9. Hatami, N., and Ebrahimpour, R. 2007. Combining multiple classifiers: diversity with boosting and combining by stacking [J]. International Journal of Computer Science and Network Security, pp. 127-131.
  10. Yan ZHU., and Lin PENG. 2007. Study on K-anonymity Models of Sharing Medical Information. Beijing, China, IEEE.
  11. Ali Inan., Murat Kantarcioglu., and Elisa Bertino. 2009. Using Anonymized Data for Classification. In: International Conference on Data Engineering, IEEE.
  12. Duan Xiaochen., and Hong Xue. 2011. Multi-Decision-Tree Classifier in Master Data Management System. IEEE.
  13. Jiuyong Li, Jixue Liu, Muzammil Baig and Raymond Chi-Wing Wong. 2011. Information based data anonymization for classification utility. In: Data & Knowledge Engineering 70 (2011) 1030–1045, Elsevier publication.
Index Terms

Computer Science
Information Sciences

Keywords

Data Privacy, Anonymization, Classification