Research Article

A Comparison Between Selective Collection Enrichment and Results Merging in Patient Centered Health Information Retrieval

by Edwin Thuma, Onneile G. Tibi, Gontlafetse Mosweunyane
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 180 - Number 29
Year of Publication: 2018

Edwin Thuma, Onneile G. Tibi, Gontlafetse Mosweunyane. A Comparison Between Selective Collection Enrichment and Results Merging in Patient Centered Health Information Retrieval. International Journal of Computer Applications. 180, 29 (Mar 2018), 1-8. DOI=10.5120/ijca2018916691

@article{ 10.5120/ijca2018916691,
author = { Edwin Thuma and Onneile G. Tibi and Gontlafetse Mosweunyane },
title = { A Comparison Between Selective Collection Enrichment and Results Merging in Patient Centered Health Information Retrieval },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2018 },
volume = { 180 },
number = { 29 },
month = { Mar },
year = { 2018 },
issn = { 0975-8887 },
pages = { 1-8 },
numpages = {9},
url = { },
doi = { 10.5120/ijca2018916691 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:02:07.232225+05:30
%A Edwin Thuma
%A Onneile G. Tibi
%A Gontlafetse Mosweunyane
%T A Comparison Between Selective Collection Enrichment and Results Merging in Patient Centered Health Information Retrieval
%J International Journal of Computer Applications
%@ 0975-8887
%V 180
%N 29
%P 1-8
%D 2018
%I Foundation of Computer Science (FCS), NY, USA

In this article, an empirical investigation was conducted to determine whether merging search results generated by multiple query variants for the same information need can improve retrieval performance in patient centered health information retrieval. This approach was also compared with the selective collection enrichment approach, in which only the results generated by a single query, the one predicted to perform better on the local collection, are used. Three results merging strategies predominantly used in distributed search environments with large overlapping databases were evaluated in this study. The results of this investigation suggest that merging results from multiple query variants for the same information need can improve retrieval performance. It was also observed that the choice of the external collection used to generate these query variants affects retrieval performance and can sometimes degrade it. When the results merging strategies were compared with the selective collection enrichment approach, it was observed that selective collection enrichment ranks fewer but highly relevant documents in the top 10 retrieved documents, while the results merging strategies rank more but only slightly relevant documents in the top 10 retrieved documents.
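
The abstract does not name the three results merging strategies that were evaluated. As an illustration only, the sketch below shows one widely known way to merge ranked lists produced by several query variants: CombMNZ-style score fusion with min-max normalization. The function names, the dictionary-based run format, and the choice of CombMNZ are assumptions made for this example and should not be read as the authors' method.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale the scores of one ranked list ({doc_id: score}) into [0, 1]."""
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # All scores are equal, so every document gets the same weight.
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_mnz(runs):
    """Fuse several runs, one per query variant (hypothetical format).

    Each document's fused score is the sum of its normalized scores
    multiplied by the number of runs that retrieved it, so documents
    returned by many query variants are promoted.
    """
    total = defaultdict(float)
    hits = defaultdict(int)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            total[doc] += score
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Two made-up runs for two variants of the same information need.
    variant_runs = [{"d1": 3.0, "d2": 1.0}, {"d2": 5.0, "d3": 2.0}]
    for doc_id, fused_score in comb_mnz(variant_runs):
        print(doc_id, fused_score)
```

Selective collection enrichment, by contrast, would keep only the single run predicted to perform best rather than fusing all of them; the prediction step is not sketched here because the abstract gives no detail on it.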

Index Terms

Computer Science
Information Sciences


Query Variants, Distributed Information Retrieval, Results Merging