International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 66 - Number 20
Year of Publication: 2013
Authors: Jeby K Luthiya, C. Umamaheswari |
DOI: 10.5120/11198-6213
Jeby K Luthiya, C. Umamaheswari. Development of Replica Free Repositories using Particle Swarm Optimization Algorithm. International Journal of Computer Applications. 66, 20 (March 2013), 8-13. DOI=10.5120/11198-6213
The increasing volume of information available in digital media poses a challenging problem for data administrators. Data repositories such as those used by digital libraries and e-commerce brokers are usually built from data gathered from different sources, so they present records with disparate schemata and structures. The increased volume also introduces redundant data into the database, so a method for controlling redundancy and duplication has become essential. The proposed approach uses the Particle Swarm Optimization (PSO) algorithm to generate an optimal similarity measure for deciding whether two records are duplicates. PSO generates the optimal similarity measure from the training datasets. Once this measure is obtained, it is used to deduplicate the remaining datasets.
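The abstract does not specify the form of the similarity measure or the fitness function, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the measure is a weighted average of per-field string similarities compared against a threshold, that fitness is classification accuracy on labeled training pairs, and that Python's `difflib.SequenceMatcher` stands in for whatever field-level comparison the authors used. The training records, parameter values, and function names are hypothetical.

```python
import random
from difflib import SequenceMatcher

# Hypothetical labeled training pairs: (record_a, record_b, is_duplicate).
TRAIN = [
    (("John Smith", "12 Oak St"), ("Jon Smith", "12 Oak Street"), True),
    (("Mary Jones", "4 Elm Rd"), ("Mary Jones", "4 Elm Road"), True),
    (("John Smith", "12 Oak St"), ("Alice Brown", "98 Pine Ave"), False),
    (("Mary Jones", "4 Elm Rd"), ("Peter Wong", "7 Lake Dr"), False),
]


def field_sims(a, b):
    """Per-field string similarity in [0, 1] (assumed stand-in measure)."""
    return [SequenceMatcher(None, x, y).ratio() for x, y in zip(a, b)]


def is_duplicate(a, b, params):
    """params = field weights followed by one decision threshold."""
    *weights, threshold = params
    total = sum(weights) or 1.0  # guard against all-zero weights
    score = sum(w * s for w, s in zip(weights, field_sims(a, b))) / total
    return score >= threshold


def fitness(params, pairs):
    """Fraction of training pairs classified correctly (assumed fitness)."""
    correct = sum(is_duplicate(a, b, params) == label for a, b, label in pairs)
    return correct / len(pairs)


def pso(pairs, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO maximizing fitness over [0, 1]^dim."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, pairs) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp each coordinate to the search box [0, 1].
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            f = fitness(pos[i], pairs)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Two field weights plus one threshold gives a 3-dimensional search space.
best_params, best_fit = pso(TRAIN, dim=3)
```

Once `best_params` has been learned from the training pairs, the same `is_duplicate(a, b, best_params)` call could be applied to record pairs from the remaining datasets, which mirrors the two-phase procedure the abstract describes.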