A Technique of Data Fusion for Effective Text Retrieval

Manjusha Sanke

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 111 - Number 8

Year of Publication: 2015

Authors: Manjusha Sanke

10.5120/19556-1303

Manjusha Sanke. A Technique of Data Fusion for Effective Text Retrieval. International Journal of Computer Applications. 111, 8 (February 2015), 5-9. DOI=10.5120/19556-1303

@article{ 10.5120/19556-1303,
author = { Manjusha Sanke },
title = { A Technique of Data Fusion for Effective Text Retrieval },
journal = { International Journal of Computer Applications },
issue_date = { February 2015 },
volume = { 111 },
number = { 8 },
month = { February },
year = { 2015 },
issn = { 0975-8887 },
pages = { 5-9 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume111/number8/19556-1303/ },
doi = { 10.5120/19556-1303 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:47:18.707902+05:30
%A Manjusha Sanke
%T A Technique of Data Fusion for Effective Text Retrieval
%J International Journal of Computer Applications
%@ 0975-8887
%V 111
%N 8
%P 5-9
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The goal of an information retrieval (IR) system is to supply information that satisfies a user's information need. Given a collection of documents and a query, an IR system returns a ranked list of documents. Systems built on different IR models, such as Vector Space, Smart Vector Space, Extended Boolean, and Latent Semantic Indexing, rarely return the same documents for the same query. This has led to the field of "data fusion", which seeks to improve the quality of the results presented to the user by combining the outputs of multiple IR algorithms or systems into a single result set. CombMNZ is a score-based fusion algorithm that sums all the scores reported for a document and multiplies the sum by the number of retrieval models that returned that document. This paper focuses on the Norm_CombMNZ algorithm, which normalizes the result obtained from CombMNZ so that scores lie in a common range of 0 to 1 and better ranking judgments can be made. The performance of each individual IR system is compared with that of the data fusion system using measures such as recall and precision. The graphical results show that Norm_CombMNZ presents the user with a fused set of text documents, yielding effective text retrieval.
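To make the scoring rule concrete, the following is a minimal sketch of CombMNZ followed by a normalization step, together with set-based precision and recall. It is not the paper's implementation: the abstract does not state how Norm_CombMNZ rescales the fused scores, so min-max scaling is assumed here, and the function names, document identifiers, and scores are hypothetical.

```python
from collections import defaultdict


def comb_mnz(runs):
    """CombMNZ: (sum of a document's scores) * (number of systems that returned it)."""
    totals = defaultdict(float)
    hits = defaultdict(int)
    for run in runs:
        for doc_id, score in run.items():
            totals[doc_id] += score
            hits[doc_id] += 1
    return {doc_id: totals[doc_id] * hits[doc_id] for doc_id in totals}


def normalize(scores):
    """Rescale fused scores to [0, 1].

    Assumption: min-max scaling; the abstract only says scores end up in a 0 to 1 range.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All documents tie; give every one the top score.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    found = len(retrieved & relevant)
    precision = found / len(retrieved) if retrieved else 0.0
    recall = found / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical per-system scores for a single query (illustrative values only).
vector_space = {"d1": 0.9, "d2": 0.4, "d3": 0.2}
lsi = {"d1": 0.7, "d3": 0.6, "d4": 0.5}
extended_boolean = {"d1": 0.8, "d4": 0.5}

fused = normalize(comb_mnz([vector_space, lsi, extended_boolean]))
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
print(precision_recall(ranking[:2], {"d1", "d3"}))
```

In this toy data, d1 is returned by all three systems, so its summed score is multiplied by 3 and it ranks first; d2 is returned by only one system and falls to the bottom. This illustrates how CombMNZ rewards documents on which several retrieval models agree.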

Index Terms

Computer Science

Information Sciences

Keywords

Data fusion, information retrieval, performance measures, IR models.