Top K List Extraction from Web Pages

Priyanka Deshmane; Pramod Patil; Abha Pathak

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 149 - Number 5

Year of Publication: 2016

DOI: 10.5120/ijca2016911394

Priyanka Deshmane, Pramod Patil, Abha Pathak. Top K List Extraction from Web Pages. International Journal of Computer Applications. 149, 5 (Sep 2016), 1-5. DOI=10.5120/ijca2016911394

@article{10.5120/ijca2016911394,
  author = {Priyanka Deshmane and Pramod Patil and Abha Pathak},
  title = {Top K List Extraction from Web Pages},
  journal = {International Journal of Computer Applications},
  issue_date = {Sep 2016},
  volume = {149},
  number = {5},
  month = {Sep},
  year = {2016},
  issn = {0975-8887},
  pages = {1-5},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume149/number5/25990-2016911394/},
  doi = {10.5120/ijca2016911394},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:53:52.650826+05:30
%A Priyanka Deshmane
%A Pramod Patil
%A Abha Pathak
%T Top K List Extraction from Web Pages
%J International Journal of Computer Applications
%@ 0975-8887
%V 149
%N 5
%P 1-5
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Finding relevant information quickly is crucial today, yet only a small proportion of the data on the internet is interpretable and meaningful, and extracting it takes considerable time. This paper addresses the problem by extracting information from top-k web pages, which describe the top k instances of a subject, for example "top 5 football teams in the world". Compared with other structured information such as web tables, top-k lists contain high-quality information. They can be used to enhance an open-domain knowledge base, which in turn can support search or fact-answering applications. The proposed system extracts the top-k list using a title classifier, a parser, a candidate picker, a ranker, and a content processor.
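The abstract names five stages without detailing them, so the following Python sketch is only an illustration of how such a pipeline might fit together. The stage names come from the abstract; every function name, the regular expressions, the tag-path grouping, and the ranking heuristic are assumptions made for this example and are not taken from the paper.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical helper data: a tiny subset of number words, for illustration only.
NUMBER_WORDS = {"three": 3, "five": 5, "ten": 10}
RANK_PREFIX = re.compile(r"^\s*\d+\s*[.)\-:]\s*")  # matches "1. ", "2) ", "3 - "

def classify_title(title):
    """Title classifier (assumed rule): return k for a 'top k' title, else None."""
    m = re.search(r"\btop\s+(\d+|[a-z]+)\b", title.lower())
    if not m:
        return None
    token = m.group(1)
    return int(token) if token.isdigit() else NUMBER_WORDS.get(token)

class TagPathParser(HTMLParser):
    """Parser (assumed design): group text nodes by the tag path leading to them."""
    def __init__(self):
        super().__init__()
        self.path = []
        self.groups = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        self.path.append(tag)

    def handle_endtag(self, tag):
        if tag in self.path:
            # Pop up to and including the most recent matching open tag.
            while self.path and self.path.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.groups["/".join(self.path)].append(text)

def pick_candidates(groups, k):
    """Candidate picker: keep only groups that contain exactly k items."""
    return [items for items in groups.values() if len(items) == k]

def rank_candidates(candidates):
    """Ranker (assumed heuristic): prefer numbered items, then shorter items."""
    def score(items):
        numbered = sum(1 for item in items if RANK_PREFIX.match(item))
        avg_len = sum(len(item) for item in items) / len(items)
        return (numbered, -avg_len)
    return max(candidates, key=score) if candidates else None

def process_content(items):
    """Content processor: strip leading rank markers such as '1. '."""
    return [RANK_PREFIX.sub("", item) for item in items]

def extract_top_k(title, html):
    """Run the five stages in order; return the list or None."""
    k = classify_title(title)
    if k is None:
        return None
    parser = TagPathParser()
    parser.feed(html)
    best = rank_candidates(pick_candidates(parser.groups, k))
    return process_content(best) if best else None

if __name__ == "__main__":
    # Made-up page with placeholder team names.
    page = (
        "<html><body><h1>Top 3 football teams in the world</h1>"
        "<ul><li>1. Alpha FC</li><li>2. Beta United</li><li>3. Gamma City</li></ul>"
        "<div><p>Home</p><p>About</p><p>Contact</p></div></body></html>"
    )
    print(extract_top_k("Top 3 football teams in the world", page))
```

In this toy page both the `<li>` group and the navigation `<p>` group contain three items, so the candidate picker returns both, and the ranker is what separates the real list from page furniture. A real system would likely need a trained title classifier and a more robust ranking model than the heuristic assumed here.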

Index Terms

Computer Science

Information Sciences

Keywords

Data extraction, Structured information, top k list, top k web pages, web parser