Research Article

Top K List Extraction from Web Pages

by Priyanka Deshmane, Pramod Patil, Abha Pathak
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 149 - Number 5
Year of Publication: 2016
Authors: Priyanka Deshmane, Pramod Patil, Abha Pathak
10.5120/ijca2016911394

Priyanka Deshmane, Pramod Patil, Abha Pathak. Top K List Extraction from Web Pages. International Journal of Computer Applications. 149, 5 (Sep 2016), 1-5. DOI=10.5120/ijca2016911394

@article{ 10.5120/ijca2016911394,
author = { Priyanka Deshmane and Pramod Patil and Abha Pathak },
title = { Top K List Extraction from Web Pages },
journal = { International Journal of Computer Applications },
issue_date = { Sep 2016 },
volume = { 149 },
number = { 5 },
month = { Sep },
year = { 2016 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume149/number5/25990-2016911394/ },
doi = { 10.5120/ijca2016911394 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:53:52.650826+05:30
%A Priyanka Deshmane
%A Pramod Patil
%A Abha Pathak
%T Top K List Extraction from Web Pages
%J International Journal of Computer Applications
%@ 0975-8887
%V 149
%N 5
%P 1-5
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Finding relevant information quickly is crucial today, but only a very small proportion of the data on the internet is interpretable and meaningful, and extracting it takes a lot of time. This paper addresses the problem by extracting information from top-k web pages, which list the top k instances of a subject, for example “top 5 football teams in the world”. Compared with other structured information such as web tables, top-k lists contain high-quality information. They can be used to enhance an open-domain knowledge base, which can in turn support search or fact-answering applications. The proposed system extracts the top-k list using a title classifier, parser, candidate picker, ranker, and content processor.
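
As an illustration of the first stage named in the abstract, the title classifier, the following is a minimal sketch that decides whether a page title looks like a top-k title and, if so, recovers k and the subject. It is a simple regular-expression heuristic written for this example, not the classifier used in the paper; the function name `parse_top_k_title`, the `NUMBER_WORDS` table, and the rule that k must be at least 2 are all assumptions.

```python
import re
from typing import Optional, Tuple

# Hypothetical lookup for spelled-out numbers; the paper does not specify one.
NUMBER_WORDS = {
    "two": 2, "three": 3, "four": 4, "five": 5, "six": 6, "seven": 7,
    "eight": 8, "nine": 9, "ten": 10, "twenty": 20, "fifty": 50,
    "hundred": 100,
}

# "top", then a number (digits or a word), then the subject of the list.
TITLE_PATTERN = re.compile(
    r"\btop\s+(?P<k>\d+|[a-z]+)\s+(?P<subject>.+)", re.IGNORECASE
)


def parse_top_k_title(title: str) -> Optional[Tuple[int, str]]:
    """Return (k, subject) if the title looks like a top-k title, else None."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None

    raw_k = match.group("k").lower()
    # Accept digits directly; otherwise fall back to the number-word table.
    k = int(raw_k) if raw_k.isdigit() else NUMBER_WORDS.get(raw_k)

    # Assumption: a list needs at least two items to count as "top k".
    if k is None or k < 2:
        return None
    return k, match.group("subject").strip()


if __name__ == "__main__":
    for title in ("top 5 football teams in the world",
                  "Top ten movies of 2015",
                  "top stories today"):
        print(title, "->", parse_top_k_title(title))
```

A title that passes this check would then be handed to the later stages (parser, candidate picker, ranker, content processor), which are not sketched here.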

References
  1. Z. Zhang, K. Q. Zhu, H. Wang, and H. Li, “Automatic extraction of top-k lists from the web,” in IEEE ICDE, 2013, 978-1-4673-4910-9.
  2. Z. Zhang, K. Q. Zhu, and H. Wang, “A system for extracting top-k lists from the web,” in KDD, 2012.
  3. W. Wu, H. Li, H. Wang, and K. Q. Zhu, “Probase: A probabilistic taxonomy for text understanding,” in SIGMOD, 2012.
  4. X. Cao, G. Cong, B. Cui, C. Jensen, and Q. Yuan, “Approaches to exploring category information for question retrieval in community question-answer archives,” TOIS, vol. 30, no. 2, p. 7, 2012.
  5. J. Wang, H. Wang, Z. Wang, and K. Q. Zhu, “Understanding tables on the web,” in ER, 2012, pp. 141-155.
  6. F. Fumarola, T. Weninger, R. Barber, D. Malerba, and J. Han, “Extracting general lists from web documents: A hybrid approach,” in IEA/AIE (1), 2011, pp. 285-294.
  7. Y. Song, H. Wang, Z. Wang, H. Li, and W. Chen, “Short text conceptualization using a probabilistic knowledge base,” in IJCAI, 2011.
  8. A. Angel, S. Chaudhuri, G. Das, and N. Koudas, “Ranking objects based on relationships and fixed associations,” in EDBT, 2009, pp. 910-921.
  9. G. Miao, J. Tatemura, W.-P. Hsiung, A. Sawires, and L. E. Moser, “Extracting data records from the web using tag path clustering,” in WWW, 2009, pp. 981-990.
  10. K. Fisher, D. Walker, K. Q. Zhu, and P. White, “From dirt to shovels: Fully automatic tool generation from ad hoc data,” in ACM POPL, 2008.
  11. N. Bansal, S. Guha, and N. Koudas, “Ad-hoc aggregations of ranked lists in the presence of hierarchies,” in SIGMOD, 2008, pp. 67-78.
  12. M. J. Cafarella, E. Wu, A. Halevy, Y. Zhang, and D. Z. Wang, “Web tables: Exploring the power of tables on the web,” in VLDB, 2008.
  13. W. Gatterbauer, P. Bohunsky, M. Herzog, B. Krupl, and B. Pollak, “Towards domain-independent information extraction from web tables,” in WWW. ACM Press, 2007, pp. 71-80.
  14. K. Chakrabarti, V. Ganti, J. Han, and D. Xin, “Ranking objects based on relationships,” in SIGMOD, 2006, pp. 371-382.
  15. B. Liu, R. L. Grossman, and Y. Zhai, “Mining data records in web pages,” in KDD, 2003, pp. 601-606.
  16. P. Deshmane, P. Patil, and A. Pathak, “Survey on web mining techniques for extraction of top k list,” IJMTER, 2015.
Index Terms

Computer Science
Information Sciences

Keywords

Data extraction, Structured information, top k list, top k web pages, web parser