Retrieve Main Content using Vision-base Web Page Segmentation with Gomory-Hu Tree

Khaing Wah Wah Linn

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 108 - Number 17

Year of Publication: 2014

Authors: Khaing Wah Wah Linn

10.5120/19006-0547

Khaing Wah Wah Linn. Retrieve Main Content using Vision-base Web Page Segmentation with Gomory-Hu Tree. International Journal of Computer Applications. 108, 17 (December 2014), 34-37. DOI=10.5120/19006-0547

@article{ 10.5120/19006-0547,
  author = { Khaing Wah Wah Linn },
  title = { Retrieve Main Content using Vision-base Web Page Segmentation with Gomory-Hu Tree },
  journal = { International Journal of Computer Applications },
  issue_date = { December 2014 },
  volume = { 108 },
  number = { 17 },
  month = { December },
  year = { 2014 },
  issn = { 0975-8887 },
  pages = { 34-37 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume108/number17/19006-0547/ },
  doi = { 10.5120/19006-0547 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:43:37.223874+05:30
%A Khaing Wah Wah Linn
%T Retrieve Main Content using Vision-base Web Page Segmentation with Gomory-Hu Tree
%J International Journal of Computer Applications
%@ 0975-8887
%V 108
%N 17
%P 34-37
%D 2014
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The World Wide Web (WWW) serves as a huge, widely distributed global information service. A huge amount of data has been accumulated and stored on the web. Information on the Web is usually presented via Hypertext Markup Language (HTML) to make it easier for humans to perceive. Web pages usually contain various contents, which may be relevant or irrelevant to the main topic. Irrelevant contents are called noise. A web page usually contains a number of noise items that are not related to the main information of the page, such as navigation bars, advertisements, and related articles. Noise on web pages makes it difficult to mine the main content of these pages. This paper proposes web page segmentation using a Gomory-Hu tree based Vision-based Page Segmentation (VIPS) algorithm.
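The abstract names the technique but does not describe its steps, so the following is only a minimal sketch of the general idea of Gomory-Hu tree based clustering, not the paper's implementation. It assumes that visual blocks have already been extracted from a page (for example by VIPS) and that pairwise similarity weights between blocks are available. The block indices, the weights, and the function names `build_capacity`, `max_flow_min_cut`, `gomory_hu_tree`, and `cluster_blocks` are all hypothetical. The tree is built with Gusfield's contraction-free method, and clusters are obtained by dropping the lightest tree edges, which is one common choice rather than the only one.

```python
from collections import deque


def build_capacity(n, edges):
    """Symmetric capacity matrix for an undirected weighted graph."""
    cap = [[0] * n for _ in range(n)]
    for u, v, w in edges:
        cap[u][v] += w
        cap[v][u] += w
    return cap


def max_flow_min_cut(n, cap, s, t):
    """Edmonds-Karp max flow; returns (flow value, source side of a minimum cut)."""
    residual = [row[:] for row in cap]
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            # No augmenting path left: nodes reached by the BFS form the source side.
            return flow, {v for v in range(n) if parent[v] != -1}
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def gomory_hu_tree(n, cap):
    """Gusfield's method: n - 1 min-cut computations, no node contraction."""
    parent = [0] * n
    weight = [0] * n
    for s in range(1, n):
        t = parent[s]
        flow, side = max_flow_min_cut(n, cap, s, t)
        weight[s] = flow
        for v in range(n):
            if v != s and v in side and parent[v] == t:
                parent[v] = s
        if parent[t] in side:
            parent[s], parent[t] = parent[t], s
            weight[s], weight[t] = weight[t], flow
    return [(v, parent[v], weight[v]) for v in range(1, n)]


def cluster_blocks(n, tree_edges, k):
    """Keep the n - k heaviest tree edges; the remaining components are the clusters."""
    kept = sorted(tree_edges, key=lambda e: e[2], reverse=True)[: n - k]
    label = list(range(n))

    def find(x):
        while label[x] != x:
            label[x] = label[label[x]]
            x = label[x]
        return x

    for u, v, _ in kept:
        label[find(u)] = find(v)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


# Hypothetical page: blocks 0-2 resemble each other (e.g. navigation items),
# blocks 3-5 resemble each other (e.g. article text), with one weak link between them.
edges = [(0, 1, 5), (0, 2, 5), (1, 2, 5), (3, 4, 5), (3, 5, 5), (4, 5, 5), (2, 3, 1)]
cap = build_capacity(6, edges)
tree = gomory_hu_tree(6, cap)
clusters = cluster_blocks(6, tree, k=2)
print(clusters)  # prints [[0, 1, 2], [3, 4, 5]]
```

In this toy input the only cheap cut is the single weak link between the two groups, so the lightest tree edge separates them. Deciding which resulting cluster is the main content would need an additional criterion that the abstract does not specify.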

Index Terms

Computer Science

Information Sciences

Keywords

Web Page Segmentation, Vision-based Page Segmentation, Gomory-Hu tree