International IT Summit Confluence 2013-The Next Generation Information Technology Summit |
Foundation of Computer Science USA |
CONFLUENCE2013 - Number 2 |
January 2014 |
Authors: Shefali Singhal, Neha Garg |
Shefali Singhal, Neha Garg. Hybrid Web-page Segmentation and Block Extraction for Small Screen Terminals. International IT Summit Confluence 2013-The Next Generation Information Technology Summit. CONFLUENCE2013, 2 (January 2014), 12-15.
Web page presentation is a concern for small-screen devices such as mobile phones and palmtop devices. A web page typically carries a large amount of irrelevant data, including advertisements and other noisy information, which makes access inconvenient. Web page segmentation addresses this problem by logically dividing a web page into segments. These segments can be created using DOM (Document Object Model) and VIPS (Vision-based Page Segmentation) techniques. This paper presents a hybrid method of web page segmentation that combines the DOM method with the VIPS algorithm, so that both the structural and the visual aspects of a web page are considered when creating a segment. A segment is a basic unit of a web page that cannot be divided further. Segments are produced by processing the web page through a block creation algorithm, which is discussed in the paper.
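To make the idea of a segment as a unit that cannot be divided further more concrete, the following is a minimal sketch in Python. It is not the paper's block creation algorithm, which the abstract does not spell out. It assumes a simplified rule: a DOM node is divided when it has a block-level child (a structural cue) or an `<hr>` separator among its children (a crude stand-in for the visual separators that VIPS detects on a rendered page). The names `BLOCK_TAGS`, `SEPARATOR_TAGS`, `Node`, `TreeBuilder`, `parse`, `is_divisible`, and `segment` are all hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser

# Assumed tag sets for this sketch; a real system would derive visual
# cues from rendered layout rather than from tag names alone.
BLOCK_TAGS = {"html", "body", "div", "section", "article", "table", "tr",
              "td", "ul", "ol", "li", "p", "h1", "h2", "h3"}
SEPARATOR_TAGS = {"hr"}  # stand-in for a visual separator
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """A DOM-like node; children are Node objects or text strings."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simple tree from HTML using the standard-library parser."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:  # void elements never contain children
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag, then close it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def text_of(node):
    """Concatenate all text under a node, in document order."""
    return "".join(c if isinstance(c, str) else text_of(c)
                   for c in node.children)


def is_divisible(node):
    """Structural cue (block-level child) or visual cue (separator child)."""
    return any(isinstance(c, Node) and
               (c.tag in BLOCK_TAGS or c.tag in SEPARATOR_TAGS)
               for c in node.children)


def segment(node, blocks=None):
    """Recursively split a node until no further division is possible."""
    if blocks is None:
        blocks = []
    if is_divisible(node):
        for child in node.children:
            if isinstance(child, Node):
                segment(child, blocks)
            else:
                text = " ".join(child.split())
                if text:  # loose text between divided children
                    blocks.append((node.tag, text))
    else:
        text = " ".join(text_of(node).split())
        if text:  # an indivisible node with content is one segment
            blocks.append((node.tag, text))
    return blocks


page = ("<body><div><ul><li>Home</li><li>News</li></ul></div>"
        "<div><p>First <b>para</b>.</p><p>Second</p></div>"
        "<div>Ad one<hr>Ad two</div></body>")
for tag, text in segment(parse(page)):
    print(tag, "|", text)
```

In this sketch each list item, each paragraph, and each run of text on either side of the `<hr>` becomes its own segment. Deciding which segments are noise, such as advertisements, would be a separate step and is not attempted here.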