Crawler Indexing using Tree Structure and its Implementation

Deepika Sharma; Parul Gupta; Dr. A.K. Sharma

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 31 - Number 6

Year of Publication: 2011

10.5120/3830-5323

Deepika Sharma, Parul Gupta, Dr. A.K. Sharma. Crawler Indexing using Tree Structure and its Implementation. International Journal of Computer Applications. 31, 6 (October 2011), 34-39. DOI=10.5120/3830-5323

@article{ 10.5120/3830-5323,
  author = { Deepika Sharma and Parul Gupta and Dr. A.K. Sharma },
  title = { Crawler Indexing using Tree Structure and its Implementation },
  journal = { International Journal of Computer Applications },
  issue_date = { October 2011 },
  volume = { 31 },
  number = { 6 },
  month = { October },
  year = { 2011 },
  issn = { 0975-8887 },
  pages = { 34-39 },
  numpages = { 6 },
  url = { https://ijcaonline.org/archives/volume31/number6/3830-5323/ },
  doi = { 10.5120/3830-5323 },
  publisher = { Foundation of Computer Science (FCS), NY, USA },
  address = { New York, USA }
}

%0 Journal Article
%1 2024-02-06T20:17:27.270068+05:30
%A Deepika Sharma
%A Parul Gupta
%A Dr. A.K. Sharma
%T Crawler Indexing using Tree Structure and its Implementation
%J International Journal of Computer Applications
%@ 0975-8887
%V 31
%N 6
%P 34-39
%D 2011
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The World-Wide Web holds a vast amount of content that is useful to millions of people. Information seekers typically begin their Web activity with a search engine such as Google or Yahoo. Our aim is to build a search tool that is cost-effective, efficient, fast and user-friendly. In response to a query, it should retrieve the most relevant information stored in its database. It should also be portable, so that it can be deployed on any platform without extra cost or inconvenience. Our goal is a Web search engine that retrieves the best-matching web pages in the shortest possible time. This paper proposes a crawler algorithm in which the crawler visits web pages recursively and stores the relevant data in the database. The algorithm applies the basic principles of a tree structure to organize the crawled data for use by the search engine. The tree/node structure in the database narrows down the searched word more efficiently and returns results to the user faster, which makes searching the Web more efficient. The paper also implements crawler indexing with the tree structure using an 'HTML based Update File at Web Server', which makes both crawling and searching more efficient.
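The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea it describes: a crawler that follows links recursively and stores words in a tree of nodes, so that a lookup walks one branch instead of scanning every stored page. The character-level tree, the names `Node`, `TreeIndex`, `crawl` and `PAGES`, and the in-memory dictionary standing in for the Web are all assumptions made for illustration; they are not taken from the paper.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node: child nodes keyed by character, plus URLs of pages
    containing the word that ends at this node (hypothetical layout)."""
    children: dict = field(default_factory=dict)
    urls: set = field(default_factory=set)


class TreeIndex:
    """Character-level tree index; an assumed structure, not the paper's schema."""

    def __init__(self):
        self.root = Node()

    def add(self, word, url):
        # Walk down from the root, creating nodes for unseen characters.
        node = self.root
        for ch in word.lower():
            node = node.children.setdefault(ch, Node())
        node.urls.add(url)

    def search(self, word):
        # Follow a single branch; stop early if the branch does not exist.
        node = self.root
        for ch in word.lower():
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.urls)


def crawl(url, pages, index, visited=None):
    """Recursively visit `url` and every page reachable from it.

    `pages` maps a URL to (text, outgoing_links) and is a hypothetical
    in-memory stand-in for fetching and parsing real web pages.
    """
    if visited is None:
        visited = set()
    if url in visited or url not in pages:
        return visited
    visited.add(url)
    text, links = pages[url]
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        index.add(word, url)
    for link in links:
        crawl(link, pages, index, visited)
    return visited


# Made-up sample data, used only to exercise the sketch.
PAGES = {
    "a.example/home": ("Tree index for a web crawler",
                       ["a.example/docs", "a.example/blog"]),
    "a.example/docs": ("Crawler stores words in a tree", ["a.example/home"]),
    "a.example/blog": ("Search results arrive faster", []),
}

index = TreeIndex()
visited = crawl("a.example/home", PAGES, index)
matches = index.search("crawler")
```

In this sketch a query costs one step per character of the searched word, independent of how many pages were crawled, which is one plausible reading of the claim that the tree/node structure returns results faster. The `visited` set keeps the recursion from looping when pages link back to each other.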

Index Terms

Computer Science

Information Sciences

Keywords

Crawler, Indexing, Tree Structure, World-Wide Web