International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 110 - Number 3
Year of Publication: 2015
Authors: Hassan F. Eldirdiery, A. H. Ahmed
DOI: 10.5120/19297-0734
Hassan F. Eldirdiery, A. H. Ahmed. Web Document Segmentation for Better Extraction of Information: A Review. International Journal of Computer Applications. 110, 3 (January 2015), 24-28. DOI=10.5120/19297-0734
This paper reviews the problem of web page segmentation. Recent studies describe several approaches for segmenting a web page into multiple blocks. Segmenting a web document is an essential step for many applications, such as text classification, clustering, information extraction, and search. The paper describes each approach in full and shows its contribution to the research area. It also discusses the differences between these approaches, explaining the benefits and limitations of each. In addition, it explores most of the effective algorithms based on these approaches and explains the application area of each algorithm.