International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 20 - Number 3
Year of Publication: 2011
Authors: Manpreet Singh, Gurvinder Singh
DOI: 10.5120/2414-3226
Manpreet Singh, Gurvinder Singh. Cluster Analysis Technique based on Bipartite Graph for Human Protein Class Prediction. International Journal of Computer Applications. 20, 3 (April 2011), 22-27. DOI=10.5120/2414-3226
In this paper, cluster analysis, a form of unsupervised learning, is applied to human protein class prediction. Data on human proteins are taken from the Human Protein Reference Database (HPRD). Sequences belonging to ten molecular classes are obtained from HPRD, with five amino acid sequences for each molecular class. Sequence-derived features (SDFs) are then extracted for each sequence using various web-based tools. Priorities are assigned to the SDFs by analyzing the variation in their values. Because every sequence has a value for every SDF, the resulting data form a complete weighted bipartite graph with two independent sets of nodes: one containing all the sequences and the other containing all the SDFs. This bipartite graph is stored in memory as an adjacency weight matrix. Clusters of the data in the adjacency matrix are generated from the values of the input SDFs, taking the priority of each SDF into account. These clusters are then backtracked to predict the class of the entered sequence.
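The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the clustering rule, so the priority-weighted distance, the `threshold` cut-off, and the majority vote used for backtracking are assumptions, and all sequence names, SDF names, values, and priorities are hypothetical toy data.

```python
from collections import Counter

# Hypothetical toy data (not from HPRD): one node set of sequences, one of SDFs.
sequences = ["seq_a1", "seq_a2", "seq_b1", "seq_b2"]
classes = {"seq_a1": "class_A", "seq_a2": "class_A",
           "seq_b1": "class_B", "seq_b2": "class_B"}
sdfs = ["sdf_1", "sdf_2", "sdf_3"]

# Adjacency weight matrix of the complete bipartite graph:
# weights[i][j] is the value of SDF j for sequence i.
weights = [
    [0.10, 5.0, 0.30],
    [0.12, 5.2, 0.90],
    [0.80, 9.0, 0.35],
    [0.82, 9.1, 0.85],
]

# Assumed priorities, one per SDF (higher means more influential).
priorities = [3.0, 2.0, 1.0]

# Range of each SDF column, used to put all SDFs on a comparable scale.
spans = [(max(col) - min(col)) or 1.0 for col in zip(*weights)]


def weighted_distance(row, query):
    """Priority-weighted, range-scaled distance between two SDF vectors."""
    return sum(p * abs(r - q) / s
               for r, q, s, p in zip(row, query, spans, priorities))


def predict_class(query, threshold=1.0):
    """Cluster the stored sequences around the query, then backtrack to a class."""
    cluster = [sequences[i] for i, row in enumerate(weights)
               if weighted_distance(row, query) <= threshold]
    if not cluster:
        return None, cluster
    # Backtracking step (assumed): majority class among the cluster members.
    votes = Counter(classes[name] for name in cluster)
    return votes.most_common(1)[0][0], cluster


predicted, members = predict_class([0.11, 5.1, 0.5])
print(predicted, members)
```

In this sketch a high-priority SDF contributes more to the distance, so a mismatch on it pushes a stored sequence out of the query's cluster sooner than a mismatch on a low-priority SDF.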