International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 62 - Number 4
Year of Publication: 2013
Authors: K. Deepa, R. Rangarajan, M. Senthamil Selvi |
10.5120/10068-4674 |
K. Deepa, R. Rangarajan, M. Senthamil Selvi. Automatic Threshold Selection using PSO for GA based Duplicate Record Detection. International Journal of Computer Applications. 62, 4 (January 2013), 22-27. DOI=10.5120/10068-4674
Setting the threshold is an important issue in applications that use similarity functions, and it usually relies heavily on human intervention. The proposed work addresses two issues: first, it finds the optimal equation using a Genetic Algorithm (GA); next, it adopts an intelligent algorithm, Particle Swarm Optimization (PSO), to obtain the optimal threshold, so that duplicate records are detected more accurately and with less human intervention. The Restaurant and CORA data repositories are used to analyze the proposed algorithm, and its performance is compared against the MARLIN method and genetic programming using evaluation metrics.
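As an illustration of the threshold-selection step only, the following is a minimal sketch of how PSO could search for a similarity threshold over labeled record pairs. It is not the authors' implementation: the abstract does not state the fitness function, the PSO parameters, or the form of the GA-derived similarity equation, so the F1-based fitness, the inertia and acceleration values (`w`, `c1`, `c2`), and the names `f1_at_threshold` and `pso_threshold` are all assumptions made for this example. The similarity scores are assumed to be precomputed and to lie in [0, 1].

```python
import random


def f1_at_threshold(scores, labels, threshold):
    """F1 of the rule 'score >= threshold means duplicate' (assumed fitness)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pso_threshold(scores, labels, n_particles=10, n_iters=30,
                  w=0.7, c1=1.5, c2=1.5, seed=0):
    """One-dimensional PSO over thresholds in [0, 1].

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns (best_threshold, best_fitness).
    """
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n_particles)]   # candidate thresholds
    vel = [0.0] * n_particles
    pbest = pos[:]                                      # personal best positions
    pbest_fit = [f1_at_threshold(scores, labels, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]           # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Standard velocity update: inertia + cognitive + social terms.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            # Move the particle and clamp it to the valid threshold range.
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))
            fit = f1_at_threshold(scores, labels, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i], fit
    return gbest, gbest_fit
```

Calling `pso_threshold(scores, labels)` with a list of pair similarity scores and matching boolean duplicate labels returns the best threshold found and its F1 value. Because F1 is a step function of the threshold, any value between two adjacent scores is equally good, so the returned threshold is one representative of an optimal interval rather than a unique optimum.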