International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 50 - Number 8
Year of Publication: 2012
Authors: Manpreet Singh Sehgal, Anuradha
DOI: 10.5120/7791-0897
Manpreet Singh Sehgal and Anuradha. HWPDE: Novel Approach for Data Extraction from Structured Web Pages. International Journal of Computer Applications 50, 8 (July 2012), 22-27. DOI=10.5120/7791-0897
Diving into the World Wide Web to fetch precious stones (relevant information) is a tedious task given the limitations of current diving equipment (current browsers). While a lot of work is being carried out to improve the quality of that equipment, a related area of research is to devise new approaches to mining. This paper describes a novel approach to extracting web data from hidden websites so that it can be offered to users as a free service for a better experience of searching for relevant data. In the proposed method, a crawler extracts the relevant data (information) contained in the web pages of hidden websites and stores it in a local database, building a large repository of structured, indexed, and ultimately relevant data. Such extracted data has the potential to satisfy end users who are starved of relevant information.