Research Article

A Novel Feature Selection Method for Classification of Medical Documents from Pubmed

by S.Sagar Imambi, T.Sudha
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 26 - Number 9
Year of Publication: 2011
Authors: S.Sagar Imambi, T.Sudha
10.5120/3131-4315

S.Sagar Imambi, T.Sudha. A Novel Feature Selection Method for Classification of Medical Documents from Pubmed. International Journal of Computer Applications. 26, 9 (July 2011), 29-33. DOI=10.5120/3131-4315

@article{ 10.5120/3131-4315,
author = { S.Sagar Imambi, T.Sudha },
title = { A Novel Feature Selection Method for Classification of Medical Documents from Pubmed },
journal = { International Journal of Computer Applications },
issue_date = { July 2011 },
volume = { 26 },
number = { 9 },
month = { July },
year = { 2011 },
issn = { 0975-8887 },
pages = { 29-33 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume26/number9/3131-4315/ },
doi = { 10.5120/3131-4315 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:12:37.811155+05:30
%A S.Sagar Imambi
%A T.Sudha
%T A Novel Feature Selection Method for Classification of Medical Documents from Pubmed
%J International Journal of Computer Applications
%@ 0975-8887
%V 26
%N 9
%P 29-33
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The exponential growth of online repositories in medical science has led to the development of various text mining tools. These tools assist users in analyzing text data stored in online repositories such as PubMed and MEDLINE. The PubMed repository is growing at a rate of 500,000 articles per year. Classification of MEDLINE documents becomes very complex due to the high dimensionality of the feature space. In this study we discuss how dimensionality can be reduced. We study and compare various dimensionality reduction techniques at the preprocessing stage. We introduce a novel feature weighting scheme, 'GRW', and show that this scheme improves classification accuracy. Our experimental results on a medical data set indicate that existing feature weighting methods achieve lower accuracy than the GRW scheme.
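
As a rough illustration of dimensionality reduction by feature weighting at the preprocessing stage, the sketch below ranks terms by a standard TF-IDF weight and keeps only the top k. This is not the GRW scheme, which the abstract does not define; TF-IDF is used here only as a familiar stand-in, and the function names, the toy documents, and the value of k are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would also
    # remove stop words and apply stemming.
    return text.lower().split()


def idf_weights(docs):
    """Return {term: idf}, with idf = log(N / document frequency)."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def select_top_k(docs, k):
    """Keep the k terms with the highest summed TF-IDF weight."""
    idf = idf_weights(docs)
    scores = Counter()
    for doc in docs:
        for term, tf in Counter(tokenize(doc)).items():
            scores[term] += tf * idf[term]
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def vectorize(doc, vocabulary, idf):
    """Map a document to a TF-IDF vector over the reduced vocabulary."""
    tf = Counter(tokenize(doc))
    return [tf[term] * idf[term] for term in vocabulary]


# Toy corpus (hypothetical, not taken from the paper's data set).
docs = [
    "insulin therapy in diabetic patients",
    "diabetic retinopathy screening in patients",
    "cardiac surgery outcomes in patients",
]
vocab = select_top_k(docs, k=3)
print(vocab)
print(vectorize(docs[0], vocab, idf_weights(docs)))
```

Terms that occur in every document, such as "in" and "patients" in this toy corpus, receive a weight of zero and are dropped, which is the sense in which a weighting scheme shrinks the feature space before classification.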

Index Terms

Computer Science
Information Sciences

Keywords

Document Classification, Feature Selection, Pubmed, Text mining