Research Article

A Model for Personalized Keyword Extraction from Web Pages using Segmentation

by K. S. Kuppusamy, G. Aghila
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 42 - Number 4
Year of Publication: 2012
Authors: K. S. Kuppusamy, G. Aghila
DOI: 10.5120/5682-7720

K. S. Kuppusamy, G. Aghila. A Model for Personalized Keyword Extraction from Web Pages using Segmentation. International Journal of Computer Applications. 42, 4 (March 2012), 21-26. DOI=10.5120/5682-7720

@article{ 10.5120/5682-7720,
author = { K. S. Kuppusamy and G. Aghila },
title = { A Model for Personalized Keyword Extraction from Web Pages using Segmentation },
journal = { International Journal of Computer Applications },
issue_date = { March 2012 },
volume = { 42 },
number = { 4 },
month = { March },
year = { 2012 },
issn = { 0975-8887 },
pages = { 21-26 },
numpages = {6},
url = { https://ijcaonline.org/archives/volume42/number4/5682-7720/ },
doi = { 10.5120/5682-7720 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:30:54.339840+05:30
%A K. S. Kuppusamy
%A G. Aghila
%T A Model for Personalized Keyword Extraction from Web Pages using Segmentation
%J International Journal of Computer Applications
%@ 0975-8887
%V 42
%N 4
%P 21-26
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The World Wide Web serves billions of users in heterogeneous groups. Each user may have specific interests and expects the web to respond to those particular requirements. Personalization is the process of making the web react in a customized manner. This paper proposes a novel model for extracting keywords from a web page that incorporates personalization. The keyword extraction problem is approached through web page segmentation, which simplifies the problem and helps solve it effectively. The proposed model is implemented as a prototype, and experiments conducted on it empirically validate the model's efficiency.
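
The general idea of segment-based, personalized keyword extraction can be illustrated with a minimal sketch. This is not the authors' model, whose details are not given in the abstract. It assumes that the page has already been split into text segments, that the user's interests are available as a set of terms, and that a segment's weight grows with its overlap with that set. The names `tokenize`, `segment_weight`, `extract_keywords`, the stopword list, and the weighting formula are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def segment_weight(tokens, profile):
    """Hypothetical personalization weight for one segment:
    1 plus the fraction of its tokens found in the user's interest profile."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in profile)
    return 1.0 + hits / len(tokens)


def extract_keywords(segments, profile, top_k=3):
    """Score each term by its frequency in every segment, scaled by that
    segment's weight, and return the top_k terms (ties broken alphabetically)."""
    scores = Counter()
    for seg in segments:
        tokens = tokenize(seg)
        weight = segment_weight(tokens, profile)
        for term, freq in Counter(tokens).items():
            scores[term] += weight * freq
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


# Two pre-split segments of a hypothetical page and one user profile.
segments = ["Football scores and football news", "Stock market news"]
print(extract_keywords(segments, {"football"}))
```

Changing the profile changes the ranking for the same page, which is the effect personalization is meant to have. A real system would replace the pre-split strings with segments produced by a page segmentation algorithm.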

Index Terms

Computer Science
Information Sciences

Keywords

Keyword Extraction, Web Page Segmentation, Web Personalization