Research Article

Link Analysis to discover relevant documents using Information Retrieval

by Hemangini S. Patel, Apurva A. Desai
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 178 - Number 10
Year of Publication: 2019
Authors: Hemangini S. Patel, Apurva A. Desai
10.5120/ijca2019918827

Hemangini S. Patel, Apurva A. Desai. Link Analysis to discover relevant documents using Information Retrieval. International Journal of Computer Applications. 178, 10 (May 2019), 23-27. DOI=10.5120/ijca2019918827

@article{ 10.5120/ijca2019918827,
author = { Hemangini S. Patel and Apurva A. Desai },
title = { Link Analysis to discover relevant documents using Information Retrieval },
journal = { International Journal of Computer Applications },
issue_date = { May 2019 },
volume = { 178 },
number = { 10 },
month = { May },
year = { 2019 },
issn = { 0975-8887 },
pages = { 23-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume178/number10/30567-2019918827/ },
doi = { 10.5120/ijca2019918827 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:50:01.259756+05:30
%A Hemangini S. Patel
%A Apurva A. Desai
%T Link Analysis to discover relevant documents using Information Retrieval
%J International Journal of Computer Applications
%@ 0975-8887
%V 178
%N 10
%P 23-27
%D 2019
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In recent years the World Wide Web has grown faster than anyone expected, and it is becoming one of the most valuable resources for information retrieval and knowledge discovery. It is a fertile area for web mining research. An emerging challenge for web mining is the problem of mining richly qualitative documents in which the objects are linked via multiple types of relations. These links provide additional context that can be helpful for web mining tasks. Traditional link analysis treats all hyperlinks equally and assumes that every link is an endorsement. Unfortunately, this assumption does not hold on the present World Wide Web: hyperlinks are not identical, and they may be created in different contexts and for different purposes, so there is a need to extract only the links that are valuable. Novel characteristics of web pages and hyperlinks can help a search engine focus on relevant, high-quality content. One important hyperlink feature, topicality, is proposed.
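The contrast in the abstract between treating all hyperlinks equally and weighting them by topicality can be illustrated with a minimal Python sketch. This is not the authors' method, which the abstract does not spell out. It assumes a PageRank-style power iteration and a hypothetical `topicality` score defined here as the Jaccard overlap between a link's anchor-text words and the target page's terms. The three-page graph, anchor texts, and term sets are made up for illustration.

```python
def pagerank(links, weights=None, damping=0.85, iterations=50):
    """Power-iteration PageRank over links = {source: [targets]}.

    weights maps (source, target) -> non-negative link weight. When it is
    omitted, every hyperlink counts equally (the traditional assumption).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for source in pages:
            targets = links.get(source, [])
            w = [1.0 if weights is None else weights.get((source, t), 0.0)
                 for t in targets]
            total = sum(w)
            if total == 0:
                # No usable outgoing links: spread this page's rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[source] / n
                continue
            for t, wt in zip(targets, w):
                new_rank[t] += damping * rank[source] * wt / total
        rank = new_rank
    return rank


def topicality(anchor_text, target_terms):
    """Hypothetical score: Jaccard overlap of anchor words and target terms."""
    anchor_words = set(anchor_text.lower().split())
    union = anchor_words | set(target_terms)
    return len(anchor_words & set(target_terms)) / len(union) if union else 0.0


# Illustrative (made-up) data: a tiny web graph with anchor texts and page terms.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
anchors = {
    ("A", "B"): "click here",            # navigational link, off-topic
    ("A", "C"): "link analysis survey",
    ("B", "C"): "link analysis",
    ("C", "A"): "web mining",
}
terms = {"A": {"web", "mining"}, "B": {"sports", "news"}, "C": {"link", "analysis"}}

weights = {edge: topicality(text, terms[edge[1]]) for edge, text in anchors.items()}
uniform = pagerank(links)             # all links treated as equal endorsements
weighted = pagerank(links, weights)   # links weighted by topicality
```

In this toy graph the only link into page B has an anchor text unrelated to B's content, so the weighted run passes B no rank through links, while the uniform run counts that link as a full endorsement.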

Index Terms

Computer Science
Information Sciences

Keywords

Web Mining, Web Structure Mining, Topicality, Information Retrieval, Link Analysis, Anchor Text, WWW