International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 97 - Number 9
Year of Publication: 2014
Authors: Vikas Thada, Vivek Jaglan
DOI: 10.5120/17038-7346
Vikas Thada, Vivek Jaglan. Design of Web Ranking Module using Genetic Algorithm. International Journal of Computer Applications. 97, 9 (July 2014), 43-48. DOI=10.5120/17038-7346
Crawling is the process by which web search engines collect data from the web. Focused crawling is a special type of crawling in which the crawler looks for information related to a predefined topic [1]. This paper presents a method for finding the most relevant document among a set of documents for a given set of keywords. Relevance is measured with the Rogers-Tanimoto, Mountford, and Baroni-Urbani/Buser similarity coefficients. The method uses a genetic algorithm to show that the average similarity of documents to the query increases when the probability of mutation is low and the probability of crossover is high. It also compares the performance of the three similarity coefficients on the same set of documents and ranks the documents according to the highest relevancy obtained among the three coefficients.
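To make the relevance measures concrete, the following is a minimal sketch of the three coefficients. It uses their commonly cited binary forms and assumes that documents and the query are encoded as equal-length 0/1 keyword vectors. The abstract does not give the paper's exact formulas or encoding, so the function names, the vector representation, and the handling of a zero denominator are all illustrative assumptions rather than the authors' implementation.

```python
from math import sqrt


def contingency(doc, query):
    """Count matches between two equal-length binary keyword vectors.

    a: keyword present in both, b: only in doc,
    c: only in query, d: absent from both.
    """
    a = sum(1 for x, y in zip(doc, query) if x and y)
    b = sum(1 for x, y in zip(doc, query) if x and not y)
    c = sum(1 for x, y in zip(doc, query) if not x and y)
    d = sum(1 for x, y in zip(doc, query) if not x and not y)
    return a, b, c, d


def rogers_tanimoto(a, b, c, d):
    # (a + d) / (a + d + 2(b + c))
    denom = a + d + 2 * (b + c)
    return (a + d) / denom if denom else 0.0


def mountford(a, b, c, d):
    # 2a / (a(b + c) + 2bc); d is unused. The index is undefined when the
    # denominator is zero, so returning 0.0 there is an assumption.
    denom = a * (b + c) + 2 * b * c
    return 2 * a / denom if denom else 0.0


def baroni_urbani_buser(a, b, c, d):
    # (sqrt(ad) + a) / (sqrt(ad) + a + b + c)
    root = sqrt(a * d)
    denom = root + a + b + c
    return (root + a) / denom if denom else 0.0


def rank(docs, query, coeff):
    """Return (name, score) pairs sorted by descending similarity to the query."""
    scored = [(name, coeff(*contingency(vec, query))) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: four keywords, three documents.
docs = {"d1": [1, 1, 0, 0], "d2": [1, 0, 1, 0], "d3": [0, 0, 0, 1]}
query = [1, 1, 0, 1]
ranking = rank(docs, query, rogers_tanimoto)
```

Passing a different coefficient function to `rank` gives the ordering under that measure, which is one simple way to compare how the three coefficients treat the same document set. Note that Mountford's index is not bounded by 1, so its scores are not directly comparable in magnitude with the other two.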