National Conference on Advances in Computing
Foundation of Computer Science USA
NCAC2015 - Number 1
December 2015
Authors: Sayali Khodade, Roshani Ade
Sayali Khodade, Roshani Ade. Trinity for Web Data Extraction using Efficient Algorithm. National Conference on Advances in Computing. NCAC2015, 1 (December 2015), 18-22.
The number of internet users keeps growing, and the web holds a huge collection of data that is useful to them. Web data extractors are used to retrieve this data from web documents. The proposed approach operates on two or more web records at once, all generated by the same server-side template, and derives a regular expression that models that template and can later be used to extract information from similar records. The template introduces some shared patterns that do not provide any relevant data and can thus be disregarded. The technique gives better results for multiword queries than other existing techniques, and errors in the input do not reduce its effectiveness.
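
The idea of deriving a regular expression from records that share a template can be illustrated with a minimal Python sketch. This is not the authors' algorithm; it is a simplified illustration that assumes the template shows up as substrings common to every record, while the parts that differ are the data. The function names (`longest_shared`, `infer_pattern`, `extract`), the `min_len` threshold, and the sample records are hypothetical.

```python
import re


def longest_shared(texts, min_len):
    """Return the longest substring found in every text ('' if shorter than min_len)."""
    base = min(texts, key=len)
    for size in range(len(base), min_len - 1, -1):
        for start in range(len(base) - size + 1):
            candidate = base[start:start + size]
            if all(candidate in t for t in texts):
                return candidate
    return ""


def infer_pattern(texts, min_len=3):
    """Build a regex: shared (template) text becomes a literal, varying text a capture group."""
    if all(t == "" for t in texts):
        return ""
    shared = longest_shared(texts, min_len)
    if not shared:
        # Nothing in common: assume this span holds record-specific data.
        return "(.*?)"
    # Split every record around the first occurrence of the shared pattern
    # and recurse on the text before and after it.
    parts = [t.partition(shared) for t in texts]
    prefixes = [p[0] for p in parts]
    suffixes = [p[2] for p in parts]
    return (infer_pattern(prefixes, min_len)
            + re.escape(shared)
            + infer_pattern(suffixes, min_len))


def extract(pattern, text):
    """Apply an inferred pattern to a record; return captured fields or None."""
    match = re.fullmatch(pattern, text, re.DOTALL)
    return match.groups() if match else None


# Hypothetical records produced by the same template.
records = [
    "<html><h1>Dune</h1><p>Price: 9.99</p></html>",
    "<html><h1>Emma</h1><p>Price: 12.50</p></html>",
]
pattern = infer_pattern(records)
fields = extract(pattern, "<html><h1>Ulysses</h1><p>Price: 7.25</p></html>")
```

In this sketch the shared markup is treated as template text and discarded, and only the captured groups are kept as extracted data. The brute-force substring search is chosen for readability rather than efficiency.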