A Novel Web Crawler Algorithm on Query based Approach with Increases Efficiency

S S Vishwakarma; A Jain; A K Sachan

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 46 - Number 1

Year of Publication: 2012

Authors: S S Vishwakarma, A Jain, A K Sachan

DOI: 10.5120/6874-8983

S S Vishwakarma, A Jain, A K Sachan. A Novel Web Crawler Algorithm on Query based Approach with Increases Efficiency. International Journal of Computer Applications. 46, 1 (May 2012), 34-37. DOI=10.5120/6874-8983

@article{ 10.5120/6874-8983,

author = { S S Vishwakarma and A Jain and A K Sachan },

title = { A Novel Web Crawler Algorithm on Query based Approach with Increases Efficiency },

journal = { International Journal of Computer Applications },

issue_date = { May 2012 },

volume = { 46 },

number = { 1 },

month = { May },

year = { 2012 },

issn = { 0975-8887 },

pages = { 34-37 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume46/number1/6874-8983/ },

doi = { 10.5120/6874-8983 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T20:38:39.600475+05:30

%A S S Vishwakarma

%A A Jain

%A A K Sachan

%T A Novel Web Crawler Algorithm on Query based Approach with Increases Efficiency

%J International Journal of Computer Applications

%@ 0975-8887

%V 46

%N 1

%P 34-37

%D 2012

%I Foundation of Computer Science (FCS), NY, USA

Abstract

A Web crawler is a computer program that downloads information from the World Wide Web for a search engine. Web content changes rapidly and without notice, and Web crawlers search the web for new or updated information. Approximately 40% of web traffic is generated by Web crawlers. This paper proposes a way to reduce that traffic: a query-based approach to web crawling that uses a filter. The proposed approach addresses the problem of crawlers revisiting web pages.
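The abstract does not give the details of the filter, so the following is only a minimal sketch of one way a query-based filter could avoid revisits, not the authors' algorithm. It assumes the crawler can ask the server which pages have changed and when; the function name `select_pages_to_fetch`, the two dictionaries, and the example URLs are all hypothetical and chosen purely for illustration.

```python
from datetime import datetime, timezone


def select_pages_to_fetch(last_crawled, server_modified):
    """Return the URLs that are new or have changed since the last crawl.

    last_crawled:    URL -> time the crawler last downloaded the page (assumed input)
    server_modified: URL -> modification time reported by the server (assumed input)
    """
    to_fetch = []
    for url, modified_at in server_modified.items():
        crawled_at = last_crawled.get(url)
        # A page passes the filter if it was never crawled
        # or was modified after the crawler's last visit.
        if crawled_at is None or modified_at > crawled_at:
            to_fetch.append(url)
    return sorted(to_fetch)


def utc(year, month, day):
    """Small helper to build timezone-aware dates for the example."""
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical crawl state and hypothetical server response.
last_crawled = {
    "http://example.com/a": utc(2012, 4, 1),
    "http://example.com/b": utc(2012, 4, 1),
}
server_modified = {
    "http://example.com/a": utc(2012, 3, 20),  # unchanged since the last crawl
    "http://example.com/b": utc(2012, 4, 15),  # updated after the last crawl
    "http://example.com/c": utc(2012, 4, 10),  # never crawled before
}

pages = select_pages_to_fetch(last_crawled, server_modified)
print(pages)  # ['http://example.com/b', 'http://example.com/c']
```

In this sketch the unchanged page is skipped entirely, which is the kind of saving the abstract attributes to avoiding revisits; how the server would actually answer such a query is left open here.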

Index Terms

Computer Science

Information Sciences

Keywords

Web Search Engine, Web Crawler, Web Crawling Traffic, HTTP GET Request