International Conference on Advanced Computer Technology
Foundation of Computer Science USA
ICACT - Number 1
August 2011
Authors: Kowsalya N, Dr. C. Chandrasekar
Kowsalya N, Dr. C. Chandrasekar. Implementation of MapReduce Algorithm and Nutch Distributed File System in Nutch. International Conference on Advanced Computer Technology. ICACT, 1 (August 2011), 6-11.
This paper provides an in-depth description of the MapReduce algorithm and the Nutch Distributed File System as used in the Nutch web search engine. Nutch is an open-source web search engine that can be used at global, local, and even personal scale. Engineering a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms, and they answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advances in technology and the proliferation of the web, creating a web search engine today is very different from what it was ten years ago.