International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 22
Year of Publication: 2010
Authors: Yogesh Singh, Arvinder Kaur, Ruchika Malhotra
DOI: 10.5120/525-685
Yogesh Singh, Arvinder Kaur, Ruchika Malhotra. Prediction of Fault-Prone Software Modules using Statistical and Machine Learning Methods. International Journal of Computer Applications. 1, 22 (February 2010), 6-13. DOI=10.5120/525-685
Demand for quality software has increased rapidly over the last few years. This has led to increased development of machine learning methods for exploring data sets, which can be used to construct models for predicting quality attributes such as fault proneness, maintenance effort, testing effort, productivity and reliability. This paper examines and compares logistic regression and six machine learning methods (artificial neural network, decision tree, support vector machine, cascade correlation network, group method of data handling polynomial method, and gene expression programming). These methods are explored empirically to find the effect of static code metrics on the fault proneness of software modules. We use the publicly available AR1 data set to analyze and compare the regression and machine learning methods. The performance of the methods is compared by computing the area under the curve from Receiver Operating Characteristic (ROC) analysis. The results show that the model predicted using decision tree modeling has an area under the ROC curve of 0.865 and outperforms the models predicted using regression and the other machine learning methods. The study shows that machine learning methods are useful in constructing software quality models.
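To make the comparison criterion concrete, the following is a minimal sketch of how the area under the ROC curve can be computed for one model's predictions. It is not the authors' implementation: it uses the standard rank-based identity (the area equals the probability that a randomly chosen faulty module receives a higher score than a randomly chosen non-faulty one). The function name `roc_auc` and the toy labels and scores are assumptions made for illustration and are not taken from the AR1 data set.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise ranking identity.

    labels: 1 for a faulty module, 0 for a non-faulty module.
    scores: predicted fault proneness; higher means more likely faulty.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one faulty and one non-faulty module")

    # Count (faulty, non-faulty) pairs in which the faulty module is scored
    # higher; a tie counts as half a correctly ordered pair.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from AR1): four faulty and four non-faulty modules.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.35, 0.40, 0.45, 0.60, 0.80, 0.20, 0.30]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.4f}")
```

Under this measure, a value of 0.5 corresponds to random ranking and 1.0 to perfect separation of faulty from non-faulty modules, so competing models can be ranked by computing this value for each on the same modules.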