International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 70 - Number 20
Year of Publication: 2013
Authors: Anirban Chakrabarty
10.5120/12184-8240
Anirban Chakrabarty. A Framework for Medical Text Mining using a Novel Categorical Clustering Algorithm. International Journal of Computer Applications. 70, 20 (May 2013), 19-25. DOI=10.5120/12184-8240
The rapid growth of medical records creates new opportunities for meaningful information retrieval in clinical diagnosis and treatment. Although nursing and pathology records provide a complete account of a patient's condition, they are not fully utilized when major decisions about surgery or chemotherapy are made. This research proposes a minimum spanning tree algorithm that partitions training data on different liver diseases into k clusters, which are validated using the Silhouette coefficient. A text classification algorithm is then developed that uses the cluster centers as training samples and applies a similarity measure to classify the categorical data. Simulation results show that the proposed algorithm can lower computational complexity and improve accuracy compared with established text classification algorithms such as k-NN. This research can serve as a medical diagnosis tool for classifying patient records and can reveal important vocabulary that characterizes nursing and pathology records.
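The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes a simple-matching dissimilarity between categorical records, forms k clusters by stopping Kruskal's minimum spanning tree construction after n - k merges, takes per-attribute modes as cluster centers, and assigns a new record to the nearest center. The attribute values in `records` and all function names are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def mismatch(a, b):
    """Simple-matching dissimilarity: fraction of attributes that differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def mst_clusters(records, k):
    """Kruskal's algorithm stopped after n - k merges, which is equivalent
    to building the full MST and cutting its k - 1 heaviest edges."""
    n = len(records)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted((mismatch(records[i], records[j]), i, j)
                   for i, j in combinations(range(n), 2))
    merges = 0
    for _, i, j in edges:
        if merges == n - k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            merges += 1
    roots = [find(i) for i in range(n)]
    ids = {r: c for c, r in enumerate(dict.fromkeys(roots))}
    return [ids[r] for r in roots]

def cluster_modes(records, labels):
    """Cluster center = most frequent value of each attribute (the mode)."""
    centers = {}
    for c in set(labels):
        members = [r for r, l in zip(records, labels) if l == c]
        centers[c] = tuple(Counter(col).most_common(1)[0][0]
                           for col in zip(*members))
    return centers

def classify(record, centers):
    """Assign a record to the cluster whose center is most similar."""
    return min(centers, key=lambda c: mismatch(record, centers[c]))

def silhouette(records, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    scores = []
    for i, r in enumerate(records):
        dists = {}
        for j, s in enumerate(records):
            if i != j:
                dists.setdefault(labels[j], []).append(mismatch(r, s))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                            # cohesion
        b = min(sum(d) / len(d) for d in dists.values())   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy records: (jaundice, ascites, enzyme_level)
records = [
    ("yes", "no", "high"), ("yes", "no", "high"), ("yes", "yes", "high"),
    ("no", "no", "normal"), ("no", "no", "normal"), ("no", "no", "low"),
]
labels = mst_clusters(records, k=2)
centers = cluster_modes(records, labels)
quality = silhouette(records, labels)
predicted = classify(("yes", "yes", "high"), centers)
```

Because a new record is compared against only k cluster centers rather than every training record, classification in this sketch costs far fewer distance computations than a plain k-NN search, which matches the complexity argument made in the abstract.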