Research Article

Framework for Document Retrieval using Latent Semantic Indexing

by Neelam Phadnis, Jayant Gadge
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 94 - Number 14
Year of Publication: 2014
Authors: Neelam Phadnis, Jayant Gadge
DOI: 10.5120/16414-6065

Neelam Phadnis, Jayant Gadge. Framework for Document Retrieval using Latent Semantic Indexing. International Journal of Computer Applications. 94, 14 (May 2014), 37-41. DOI=10.5120/16414-6065

@article{ 10.5120/16414-6065,
author = { Neelam Phadnis and Jayant Gadge },
title = { Framework for Document Retrieval using Latent Semantic Indexing },
journal = { International Journal of Computer Applications },
issue_date = { May 2014 },
volume = { 94 },
number = { 14 },
month = { May },
year = { 2014 },
issn = { 0975-8887 },
pages = { 37-41 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume94/number14/16414-6065/ },
doi = { 10.5120/16414-6065 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:17:41.123333+05:30
%A Neelam Phadnis
%A Jayant Gadge
%T Framework for Document Retrieval using Latent Semantic Indexing
%J International Journal of Computer Applications
%@ 0975-8887
%V 94
%N 14
%P 37-41
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

With the rapid growth of the Internet, the volume of textual information is increasing quickly, so document retrieval, which aims to find and organize relevant information in text collections, is needed. As large-scale, inexpensive storage becomes available, the amount of information stored by organizations will increase, and searching it and deriving useful facts will become more cumbersome. How to extract large amounts of information quickly and effectively has therefore become a focus of current research. Traditional information retrieval (IR) techniques find relevant documents by matching the words in a user's query against individual words in the text collection. The problem with such content-based retrieval systems is that some documents relevant to the user's query are not retrieved, while many unrelated or irrelevant documents are. This paper proposes an information retrieval method based on Latent Semantic Indexing (LSI), a concept-based retrieval method that builds on the vector space model and singular value decomposition. The goal of this research is to evaluate the applicability of LSI to textual document search and retrieval.
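
The combination of the vector space model and singular value decomposition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (build_term_document_matrix, lsi_fit, lsi_scores) are hypothetical, raw term counts are used with no stop-word removal, stemming, or term weighting, and NumPy's dense SVD stands in for whatever solver a real system would use. Each document is taken as a row of the truncated right singular vectors, and the query is folded into the same k-dimensional space before cosine similarity is computed.

```python
import numpy as np


def build_term_document_matrix(docs):
    """Return (vocab, A): A[i, j] counts occurrences of term i in document j."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    row = {word: i for i, word in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.lower().split():
            A[row[word], j] += 1.0
    return vocab, A


def lsi_fit(A, k):
    """Truncated SVD: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def lsi_scores(query, vocab, U_k, s_k, Vt_k):
    """Cosine similarity between the folded-in query and every document."""
    row = {word: i for i, word in enumerate(vocab)}
    q = np.zeros(len(vocab))
    for word in query.lower().split():
        if word in row:  # terms outside the vocabulary are ignored
            q[row[word]] += 1.0
    # Fold the query into concept space: q_hat = S_k^{-1} U_k^T q
    q_hat = (U_k.T @ q) / s_k
    doc_vecs = Vt_k.T  # one k-dimensional row per document
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat)
    denom = np.where(denom == 0.0, 1.0, denom)  # avoid division by zero
    return (doc_vecs @ q_hat) / denom


if __name__ == "__main__":
    # Toy collection (illustrative data, not from the paper)
    docs = [
        "car engine wheel",
        "automobile engine wheel",
        "apple banana fruit",
        "banana fruit",
    ]
    vocab, A = build_term_document_matrix(docs)
    U_k, s_k, Vt_k = lsi_fit(A, k=2)
    for doc, score in zip(docs, lsi_scores("car", vocab, U_k, s_k, Vt_k)):
        print(f"{score:+.3f}  {doc}")
```

In this toy collection the query "car" scores the second document highly even though that document contains only "automobile", because both documents share "engine" and "wheel" and therefore load on the same latent concept. Plain keyword matching would miss it, which is the retrieval gap the abstract attributes to word-matching systems.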

Index Terms

Computer Science
Information Sciences

Keywords

Document Retrieval, Latent Semantic Indexing, Singular Value Decomposition