International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 116 - Number 7
Year of Publication: 2015
Authors: Anand Ratna, Divya, Akshay Sawhney
DOI: 10.5120/20351-2540
Anand Ratna, Divya, Akshay Sawhney. Focused Crawler based on Efficient Page Rank Algorithm. International Journal of Computer Applications. 116, 7 (April 2015), 37-40. DOI=10.5120/20351-2540
The WWW is growing rapidly and its nature is dynamic, so building an efficient search mechanism is necessary. A vast number of pages are added every day, which poses exceptional scaling challenges for general-purpose crawlers and search engines and makes fetching information about a specific topic increasingly important. This paper describes a web crawling approach based on best-first search. Instead of collecting and indexing all available web documents in order to answer all possible queries, a focused crawler chooses the links that are likely to be most relevant to the crawl and avoids the irrelevant links in a document. This leads to significant savings in hardware and network resources and also helps keep the crawl more up to date. To accomplish such goal-directed crawling, the approach selects the top K most relevant documents for a given query and then expands the most promising link, chosen according to its link score, so as to circumvent irrelevant regions of the web.
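The best-first strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term-overlap `relevance` measure, the equal weighting of anchor text and parent page in the link score, the `threshold` parameter, and the in-memory `toy_web` are all assumptions made for illustration, and the paper's top-K selection and page rank computation are not reproduced here.

```python
import heapq

def relevance(text, query_terms):
    """Toy relevance: fraction of query terms that occur in the text (assumed measure)."""
    words = set(text.lower().split())
    return sum(1 for term in query_terms if term in words) / len(query_terms)

def focused_crawl(pages, seeds, query_terms, max_pages=10, threshold=0.5):
    """Best-first crawl over an in-memory web.

    pages maps url -> (page_text, [(anchor_text, target_url), ...]).
    Returns the relevant URLs in the order they were fetched.
    """
    # heapq is a min-heap, so scores are negated to pop the best link first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    fetched = 0
    while frontier and fetched < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue  # unknown page: nothing to fetch
        text, links = pages[url]
        fetched += 1
        page_score = relevance(text, query_terms)
        if page_score >= threshold:
            relevant.append(url)
        for anchor, target in links:
            if target in seen:
                continue
            seen.add(target)
            # Assumed link score: average of anchor-text and parent-page relevance.
            link_score = 0.5 * relevance(anchor, query_terms) + 0.5 * page_score
            heapq.heappush(frontier, (-link_score, target))
    return relevant

# Hypothetical five-page web used only to exercise the sketch.
toy_web = {
    "seed": ("web crawler overview", [("focused crawler", "a"), ("cooking recipes", "b")]),
    "a": ("focused crawler page rank", [("page rank crawler", "c")]),
    "b": ("pasta recipes", [("more recipes", "d")]),
    "c": ("crawler page rank algorithm", []),
    "d": ("dessert recipes", []),
}
result = focused_crawl(toy_web, ["seed"], ["crawler", "rank"], max_pages=3)
print(result)  # ['seed', 'a', 'c']
```

With a budget of three fetches, the crawl follows the on-topic links and never fetches the recipe pages, which is the saving in network resources that a focused crawler aims for.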