International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 16 - Number 5
Year of Publication: 2011
Authors: Dr.V.Anuradha, S.K.M.Habeeb, A.Praveena, AmalaPriya
DOI: 10.5120/2009-2710
Dr.V.Anuradha, S.K.M.Habeeb, A.Praveena, AmalaPriya. A New GC Based HMM Algorithm for Disease Classification. International Journal of Computer Applications. 16, 5 (February 2011), 19-22. DOI=10.5120/2009-2710
This paper presents a hidden Markov model (HMM) that classifies proteins into two classes: normal and diseased. We trained the model using the HMM functions available in MATLAB. Patterns are first extracted from each sequence using a 2-gram amino acid encoding, which yields 16 patterns that code for GC. The scores of these 16 patterns are then given as input to the HMM. The model was trained on the two protein classes using the known patterns, and the trained model was then used to classify the dataset. On a dataset of 50 protein sequences, the method classified the proteins with an accuracy of 81%. The results provide insights that can help biologists and computer scientists design high-performance protein classification systems.
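The pipeline described in the abstract (2-gram extraction, then scoring under one HMM per class) can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' MATLAB implementation. Several details are assumptions that the abstract does not state: the four-residue alphabet `GAPR` (chosen only because 4 x 4 gives 16 two-gram patterns), the encoding of each sequence as a series of pattern indices, the helper names, and classification by comparing log-likelihoods from the forward algorithm under two already-trained models.

```python
import math
from itertools import product

# Assumption: a four-residue alphabet, giving 4 x 4 = 16 two-gram patterns.
# The abstract does not name the residues; "GAPR" is purely illustrative.
RESIDUES = "GAPR"
PATTERNS = ["".join(p) for p in product(RESIDUES, repeat=2)]
PATTERN_INDEX = {p: i for i, p in enumerate(PATTERNS)}


def two_gram_counts(sequence):
    """Count overlapping 2-grams from PATTERNS in a protein sequence."""
    counts = dict.fromkeys(PATTERNS, 0)
    for i in range(len(sequence) - 1):
        gram = sequence[i:i + 2]
        if gram in counts:
            counts[gram] += 1
    return counts


def two_gram_symbols(sequence):
    """Encode a sequence as the indices of its recognised 2-grams, in order."""
    grams = (sequence[i:i + 2] for i in range(len(sequence) - 1))
    return [PATTERN_INDEX[g] for g in grams if g in PATTERN_INDEX]


def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: log P(symbols | discrete HMM)."""
    if not symbols:
        raise ValueError("sequence contains none of the 16 patterns")
    n = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][symbols[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(symbols)):
        if t > 0:
            # Propagate through the transition matrix, then emit symbol t.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][symbols[t]]
                for s in range(n)
            ]
        # Rescale at every step to avoid numerical underflow.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def classify(sequence, models):
    """Return the class label whose HMM gives the highest log-likelihood.

    `models` maps a label to a (start, trans, emit) tuple; the parameters
    are assumed to have been trained beforehand.
    """
    symbols = two_gram_symbols(sequence)
    return max(models, key=lambda label: log_likelihood(symbols, *models[label]))
```

A usage example with made-up, single-state model parameters (hypothetical values, not results from the paper):

```python
models = {
    "normal": ([1.0], [[1.0]], [[0.85] + [0.01] * 15]),
    "diseased": ([1.0], [[1.0]], [[1 / 16] * 16]),
}
print(classify("GGGG", models))
```

Comparing per-class likelihoods is one common way to turn HMMs into a classifier; the paper's actual scoring of the 16 patterns may differ.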