Research Article

Comparative Study Of Various Clustering Techniques For Software Quality Estimation System

Published on May 2012 by Sarul Suneja, Deepak Gupta
National Workshop-Cum-Conference on Recent Trends in Mathematics and Computing 2011
Foundation of Computer Science USA
RTMC - Number 7
May 2012
Authors: Sarul Suneja, Deepak Gupta

Sarul Suneja, Deepak Gupta. Comparative Study Of Various Clustering Techniques For Software Quality Estimation System. National Workshop-Cum-Conference on Recent Trends in Mathematics and Computing 2011. RTMC, 7 (May 2012), 31-35.

@article{
author = { Sarul Suneja and Deepak Gupta },
title = { Comparative Study Of Various Clustering Techniques For Software Quality Estimation System },
journal = { National Workshop-Cum-Conference on Recent Trends in Mathematics and Computing 2011 },
issue_date = { May 2012 },
volume = { RTMC },
number = { 7 },
month = { May },
year = { 2012 },
issn = {0975-8887},
pages = { 31-35 },
numpages = 5,
url = { /proceedings/rtmc/number7/6672-1055/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Workshop-Cum-Conference on Recent Trends in Mathematics and Computing 2011
%A Sarul Suneja
%A Deepak Gupta
%T Comparative Study Of Various Clustering Techniques For Software Quality Estimation System
%J National Workshop-Cum-Conference on Recent Trends in Mathematics and Computing 2011
%@ 0975-8887
%V RTMC
%N 7
%P 31-35
%D 2012
%I International Journal of Computer Applications
Abstract

Software metrics and fault data from a previous software version are commonly used to build a fault prediction model for the next release. Until now, various classification algorithms have been used to build such models. However, there are cases in which previous fault data are not available. Predicting the fault-proneness of program modules when their fault labels are unavailable is a challenging task that arises frequently in the software industry. Because fault data from the previous version are not available, supervised learning approaches cannot be applied, which creates a need for new methods, tools, or techniques. In particular, there is a need for fault prediction models based on unsupervised learning, which can predict the fault-proneness of program modules when fault labels are not present. One such method is the use of clustering techniques. This paper presents a case study of different clustering techniques and compares their performance.
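The idea of predicting fault-proneness without fault labels can be illustrated with a minimal k-means sketch in Python. This is not the authors' implementation: the module names, the two metrics (lines of code and cyclomatic complexity), the fixed initialization, and the rule that flags the cluster with the larger centroid as fault-prone are all assumptions made for illustration.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Deterministic start for the sketch: the first k points are the centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignment, centroids


# Hypothetical (lines of code, cyclomatic complexity) per module; not from the paper.
modules = {
    "parser.c": (120, 4),
    "lexer.c": (95, 3),
    "util.c": (60, 2),
    "scheduler.c": (900, 35),
    "net_io.c": (750, 28),
    "codegen.c": (820, 31),
}

names = list(modules)
points = [modules[n] for n in names]
assignment, centroids = kmeans(points, k=2)

# Assumed labeling rule: the cluster whose centroid has the larger metric sum
# is flagged fault-prone. In practice an expert or metric thresholds decide this.
risky = max(range(2), key=lambda c: sum(centroids[c]))
fault_prone = sorted(n for n, a in zip(names, assignment) if a == risky)
print(fault_prone)
```

On real data the metrics would usually be normalized first, since an unscaled metric such as lines of code would otherwise dominate the distance computation.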

References
  1. T. Menzies, J. Greenwald, and A. Frank, "Data mining static code attributes to learn defect predictors", IEEE Transactions on Software Engineering, vol. 32, no. 1, 2007, pp. 2-13.
  2. N. Seliya, T. M. Khoshgoftaar, "Software quality estimation with limited fault data: a semi-supervised learning perspective", Software Quality Journal, vol. 15, no. 3, 2007, pp. 327-344.
  3. C. Catal, B. Diri, "Investigating the effect of dataset size, metrics set, and feature selection techniques on software fault prediction problem", Information Sciences, vol. 179, no. 8, pp. 1040-1058, 2009.
  4. N. Seliya, "Software quality analysis with limited prior knowledge of faults", Graduate Seminar, Wayne State University, Department of Computer Science, 2006, Webpage: www.cs.wayne.edu/graduateseminars/gradsem_f06/Slides/seliya_wsu_talk.ppt
  5. C. Catal, B. Diri, "A systematic review of software fault predictions studies", Expert Systems with Applications, vol. 36, no. 4, pp. 7346-7354, 2009.
  6. S. Zhong, T. M. Khoshgoftaar, and N. Seliya, "Unsupervised learning for expert-based software quality estimation", Proc. of the 8th Intl. Symp. on High Assurance Systems Eng., Tampa, FL, 2004, pp. 149-155.
  7. C. Catal, U. Sevim, B. Diri, "Clustering and metrics thresholds based software fault prediction of unlabeled program modules", 6th Int'l. Conference on Information Technology: New Generations, IEEE Computer Society, Las Vegas, Nevada, 2009.
  8. N. Seliya, T. M. Khoshgoftaar, "Software quality analysis of unlabeled program modules with semi-supervised clustering", IEEE Transactions on Systems, Man and Cybernetics-Part A: Systems and Humans, vol. 37, no. 2, 2007, pp. 201-211.
  9. G. Gan, C. Ma, J. Wu, "Data clustering: theory, algorithms, and applications", Society for Industrial and Applied Mathematics, Philadelphia, 2007.
  10. R. Xu, D. Wunsch, "Survey of clustering algorithms", IEEE Transactions on Neural Networks, vol. 16, no. 3, 2005, pp. 645-678.
  11. P. Berkhin, "Survey of clustering data mining techniques", Technical Report, Accrue Software, San Jose, California, 2002, www.ee.ucr.edu/~barth/EE242/clustering_survey.pdf
  12. D. Pelleg, A. Moore, "X-means: extending k-means with efficient estimation of the number of clusters", Proceedings of the 17th International Conference on Machine Learning, pp. 727-734, 2000, Stanford University, Stanford, CA, USA.
Index Terms

Computer Science
Information Sciences

Keywords

S/w Quality, Clustering, Various Clustering Approaches