International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 108 - Number 6
Year of Publication: 2014
Authors: Jayshree Ghorpade-aher, Roshan Bagdiya
DOI: 10.5120/18917-0245
Jayshree Ghorpade-aher, Roshan Bagdiya. A Review on Clustering Web data using PSO. International Journal of Computer Applications. 108, 6 (December 2014), 31-36. DOI=10.5120/18917-0245
The amount of information available on the World Wide Web, the largest shared information source, has grown tremendously. Because the web is widely distributed, open, and highly dynamic, its resources are scattered and lack unified management and structure. Roughly 90% of web data is unstructured and needs to be structured, since the lack of structure greatly reduces the efficiency of using web information. Web text feature extraction and clustering are among the most challenging tasks in web data mining and call for an efficient clustering technique. Data mining tasks require fast and accurate partitioning of huge volumes of unstructured data, which may have many dimensions and attributes. In this paper we focus on the different clustering techniques that are useful for web data clustering. To that end, we survey the literature and describe an evolutionary, bio-inspired Swarm Intelligence algorithm, Particle Swarm Optimization (PSO), for obtaining optimized clustering results. Stop word removal and stemming are used to preprocess the input data, which improves accuracy and optimizes keyword searching. The PSO algorithm can greatly improve the efficiency of web text processing, which makes such evolutionary clustering techniques well suited to web text data clustering.
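To make the pipeline named in the abstract concrete, the following is a minimal sketch of stop word removal, stemming, and PSO-based clustering. It is not the authors' implementation. The stop word list, the suffix-stripping `crude_stem` function (a stand-in for a real stemmer such as Porter's), the term-frequency vectors, the fitness function (sum of distances to the nearest centroid), and the parameter values `w`, `c1`, `c2` are all assumptions chosen for illustration.

```python
import math
import random

# Tiny illustrative stop word list (an assumption, not taken from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "in", "on", "to", "for"}

def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words, stem the rest."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def vectorize(docs):
    """Build term-frequency vectors over the shared vocabulary."""
    token_lists = [preprocess(d) for d in docs]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = [[float(tokens.count(term)) for term in vocab] for tokens in token_lists]
    return vocab, vectors

def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroids_of(position, k, dim):
    """Split a flat particle position into k centroid vectors."""
    return [position[i * dim:(i + 1) * dim] for i in range(k)]

def fitness(vectors, position, k):
    """Sum of distances from each vector to its nearest centroid (lower is better)."""
    cents = centroids_of(position, k, len(vectors[0]))
    return sum(min(distance(v, c) for c in cents) for v in vectors)

def assign(vectors, position, k):
    """Label each vector with the index of its nearest centroid."""
    cents = centroids_of(position, k, len(vectors[0]))
    return [min(range(k), key=lambda c: distance(v, cents[c])) for v in vectors]

def pso_cluster(vectors, k, n_particles=10, iterations=50,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO in which each particle encodes k candidate centroids."""
    rng = random.Random(seed)
    size = k * len(vectors[0])
    # Initialise each particle from k randomly chosen document vectors.
    positions = [[x for v in rng.sample(vectors, k) for x in v]
                 for _ in range(n_particles)]
    velocities = [[0.0] * size for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(vectors, p, k) for p in positions]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus pulls toward the personal and global bests.
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                                    + c2 * r2 * (gbest[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            fit = fitness(vectors, positions[i], k)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return assign(vectors, gbest, k), gbest_fit
```

A call such as `pso_cluster(vectorize(docs)[1], k=2)` returns one cluster label per document together with the best fitness found. Encoding all `k` centroids in a single particle is one common formulation of PSO clustering; other encodings and fitness measures are possible.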