Research Article

A Technique of Data Fusion for Effective Text Retrieval

by Manjusha Sanke
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 111 - Number 8
Year of Publication: 2015
Authors: Manjusha Sanke
DOI: 10.5120/19556-1303

Manjusha Sanke. A Technique of Data Fusion for Effective Text Retrieval. International Journal of Computer Applications. 111, 8 (February 2015), 5-9. DOI=10.5120/19556-1303

@article{ 10.5120/19556-1303,
author = { Manjusha Sanke },
title = { A Technique of Data Fusion for Effective Text Retrieval },
journal = { International Journal of Computer Applications },
issue_date = { February 2015 },
volume = { 111 },
number = { 8 },
month = { February },
year = { 2015 },
issn = { 0975-8887 },
pages = { 5-9 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume111/number8/19556-1303/ },
doi = { 10.5120/19556-1303 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:47:18.707902+05:30
%A Manjusha Sanke
%T A Technique of Data Fusion for Effective Text Retrieval
%J International Journal of Computer Applications
%@ 0975-8887
%V 111
%N 8
%P 5-9
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The goal of an information retrieval (IR) system is to provide information that satisfies the user's information need. Given a collection of documents and a query, an IR system returns a ranked list of documents. IR systems built on different models, such as Vector Space, Smart Vector Space, Extended Boolean, and Latent Semantic Indexing, rarely return the same documents in response to the same query. This has led to the field of "data fusion", which seeks to improve the quality of the results presented to the user by combining the outputs of multiple IR algorithms or systems into a single result set. CombMNZ is a score-based fusion algorithm that sums all the scores reported for a document and multiplies the sum by the number of retrieval models that returned that document. This paper focuses on the Norm_CombMNZ algorithm, which normalizes the result obtained from CombMNZ so that scores lie in a common range of 0 to 1 and better ranking judgments can be made. The performance of the individual IR systems is compared with that of the data fusion system using performance measures such as recall and precision. The graphical results show that Norm_CombMNZ provides the user with a fused set of text documents, resulting in effective text retrieval.
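The fusion rule described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not state which normalization Norm_CombMNZ uses, so min-max scaling of the fused scores is an assumption here. The function names, document identifiers, and scores are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def comb_mnz(runs):
    """CombMNZ: sum of a document's scores times the number of runs returning it."""
    totals = defaultdict(float)
    hits = defaultdict(int)
    for run in runs:
        for doc_id, score in run.items():
            totals[doc_id] += score
            hits[doc_id] += 1
    return {doc_id: totals[doc_id] * hits[doc_id] for doc_id in totals}


def min_max_normalize(scores):
    """Rescale scores to [0, 1]. Min-max scaling is an assumption, not the paper's stated method."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All documents tie, so give each the top of the range.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def norm_comb_mnz(runs):
    """Normalize the CombMNZ output so fused scores share a 0 to 1 range."""
    return min_max_normalize(comb_mnz(runs))


def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a retrieved list against relevance judgments."""
    retrieved, relevant = set(retrieved), set(relevant)
    found = len(retrieved & relevant)
    precision = found / len(retrieved) if retrieved else 0.0
    recall = found / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical scores from two IR models for the same query.
vector_space = {"d1": 0.5, "d2": 0.25}
lsi = {"d1": 0.5, "d3": 0.75}

fused = norm_comb_mnz([vector_space, lsi])
ranking = sorted(fused, key=fused.get, reverse=True)  # ['d1', 'd3', 'd2']
```

In this toy input, `d1` is returned by both models, so its summed score of 1.0 is doubled to 2.0 and it ranks first after normalization. With hypothetical relevance judgments `{"d1", "d2"}`, calling `precision_recall(ranking[:2], {"d1", "d2"})` gives a precision and recall of 0.5 each.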

Index Terms

Computer Science
Information Sciences

Keywords

Data fusion, information retrieval, performance measures, IR models