International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 135 - Number 5
Year of Publication: 2016
Authors: Ashish Shah
DOI: 10.5120/ijca2016908385
Ashish Shah. Prediction of Malignant and Benign Tumor using Machine Learning. International Journal of Computer Applications. 135, 5 (February 2016), 19-23. DOI=10.5120/ijca2016908385
Machine Learning is a branch of Computer Science concerned with designing systems that can learn from the input they are given. In Supervised Machine Learning the system is first trained on data that has already been classified, whereas an unsupervised system requires no such labelled training data. Two common supervised learning techniques are Linear Regression, which predicts a continuous-valued output, and Logistic Regression, which is used for Classification and predicts a discrete-valued output. Classification is the task of identifying which of a set of categories a new observation belongs to. In this paper we aim to assess, by Classification, whether a lump in a breast is likely to be malignant (cancerous) or benign (non-cancerous). The two features under consideration are Clump Thickness and Marginal Adhesion. Clump Thickness helps detect cancerous cells because they are often grouped in multilayers, whereas benign cells tend to be grouped in monolayers. Normal cells tend to stick together, but cancerous cells tend to lose this ability, so loss of Marginal Adhesion is a sign of malignancy. With the help of the sigmoid function, we define the Cost function for our data and minimize the sum of the squared errors over the training set. Using Gradient Descent, we find the global minimum of the Cost function and obtain the parameters that fit our data. Finally, we estimate the probability of the patient's tumor being malignant or benign from the values of these two features and the fitted parameters.
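The pipeline described in the abstract (sigmoid hypothesis, cost function, Gradient Descent, probability estimate) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the training values are made up for demonstration and are not taken from the paper's data; the function names, learning rate, and iteration count are hypothetical choices; and the sketch swaps in the cross-entropy cost usually paired with Logistic Regression (it is convex, so Gradient Descent heads toward a global minimum) in place of the squared-error wording used in the abstract.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real z into (0, 1)."""
    # Branch on the sign of z so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(theta, x):
    """Estimated P(malignant) for x = (clump_thickness, marginal_adhesion)."""
    return sigmoid(theta[0] + theta[1] * x[0] + theta[2] * x[1])


def cost(theta, X, y):
    """Mean cross-entropy cost (an assumption; see the note above)."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for xi, yi in zip(X, y):
        h = min(max(predict_proba(theta, xi), eps), 1.0 - eps)
        total -= yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return total / len(X)


def gradient_descent(X, y, alpha=0.05, iterations=20000):
    """Batch Gradient Descent on the cost above, starting from theta = 0."""
    theta = [0.0, 0.0, 0.0]
    m = len(X)
    for _ in range(iterations):
        grad = [0.0, 0.0, 0.0]
        for xi, yi in zip(X, y):
            err = predict_proba(theta, xi) - yi
            grad[0] += err            # intercept term
            grad[1] += err * xi[0]    # clump thickness
            grad[2] += err * xi[1]    # marginal adhesion
        theta = [t - alpha * g / m for t, g in zip(theta, grad)]
    return theta


# Made-up illustrative samples: (clump thickness, marginal adhesion).
X_train = [(1, 1), (2, 1), (3, 2), (4, 1), (5, 3),
           (7, 6), (8, 4), (9, 8), (10, 7), (6, 9)]
y_train = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 0 = benign, 1 = malignant

initial_cost = cost([0.0, 0.0, 0.0], X_train, y_train)
theta = gradient_descent(X_train, y_train)
final_cost = cost(theta, X_train, y_train)

p_malignant_case = predict_proba(theta, (9, 8))
p_benign_case = predict_proba(theta, (1, 1))
print(f"cost: {initial_cost:.4f} -> {final_cost:.4f}")
print(f"P(malignant | 9, 8) = {p_malignant_case:.3f}")
print(f"P(malignant | 1, 1) = {p_benign_case:.3f}")
```

Calling predict_proba with the fitted parameters on a new pair of feature values gives the estimated probability of malignancy; one minus that value is the estimated probability that the tumor is benign.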