Research Article

Literature Survey of Clone Detection Techniques

by Sonam Gupta, P. C. Gupta
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 99 - Number 3
Year of Publication: 2014
Authors: Sonam Gupta, P. C. Gupta
10.5120/17355-7858

Sonam Gupta, P. C. Gupta. Literature Survey of Clone Detection Techniques. International Journal of Computer Applications. 99, 3 (August 2014), 41-44. DOI=10.5120/17355-7858

@article{ 10.5120/17355-7858,
author = { Sonam Gupta and P. C. Gupta },
title = { Literature Survey of Clone Detection Techniques },
journal = { International Journal of Computer Applications },
issue_date = { August 2014 },
volume = { 99 },
number = { 3 },
month = { August },
year = { 2014 },
issn = { 0975-8887 },
pages = { 41-44 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume99/number3/17355-7858/ },
doi = { 10.5120/17355-7858 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:27:15.125481+05:30
%A Sonam Gupta
%A P. C. Gupta
%T Literature Survey of Clone Detection Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 99
%N 3
%P 41-44
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Code clones are fragments of code that are duplicated within a system, which makes it difficult to locate every copy when a change has to be made. Researchers have reported that almost 70% of maintenance effort is attributable to the presence of clones in the system. A number of approaches have been proposed to detect various types of clones [39]. This paper presents a systematic literature review of the detection approaches researched so far. It also describes the advantages of each approach and the shortcomings that prevented it from detecting clones completely. Finally, it proposes a novel approach that detects clones automatically, regardless of whether the statements appear in the same order or whether statements have been inserted, deleted, or modified in the code fragment.
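
To make the idea of clones with inserted, deleted, or modified statements concrete, the following is a minimal illustrative sketch in Python. It is not the approach proposed in the paper: it is a simple line-based comparison built on the standard-library `difflib` module, and the helper names (`normalize`, `similarity`) and the sample fragments are hypothetical. Identifiers are replaced by a placeholder so that renamed copies compare equal, and a sequence matcher tolerates a small number of added or removed statements.

```python
import difflib
import keyword
import re

KEYWORDS = set(keyword.kwlist)

def normalize(fragment: str) -> list[str]:
    """Return the fragment as a list of normalized statements.

    Naive assumptions: one statement per line, '#' always starts a comment.
    """
    lines = []
    for raw in fragment.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # Replace every non-keyword identifier with the placeholder ID.
        line = re.sub(
            r"\b[A-Za-z_]\w*\b",
            lambda m: m.group(0) if m.group(0) in KEYWORDS else "ID",
            line,
        )
        lines.append(re.sub(r"\s+", " ", line))
    return lines

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of matching normalized statements between two fragments."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()

ORIGINAL = """
total = 0
for item in items:
    total += item.price
return total
"""

# Renamed identifiers plus one inserted statement.
NEAR_MISS = """
sum_ = 0
for p in products:
    log(p)
    sum_ += p.cost
return sum_
"""

UNRELATED = """
while queue:
    node = queue.pop()
    visit(node)
"""

if __name__ == "__main__":
    print(similarity(ORIGINAL, NEAR_MISS))
    print(similarity(ORIGINAL, UNRELATED))
```

In this sketch the renamed copy with one inserted statement still scores close to 1, while the unrelated fragment scores 0. A real detector would pick a threshold and would work on tokens, syntax trees, or dependence graphs rather than raw lines, as the techniques surveyed in the references do.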

References
  1. Brenda S. Baker. Finding Clones with Dup: Analysis of an Experiment. IEEE Transactions on Software Engineering, Vol. 33(9): 608-621, September 2007.
  2. Brenda S. Baker. A Program for Identifying Duplicated Code. In Proceedings of Computing Science and Statistics: 24th Symposium on the Interface, Vol. 24: 49-57, March 1992.
  3. Brenda S. Baker. Parameterized diff. In Proceedings of the 10th ACM-SIAM Symposium on Discrete Algorithms (SODA'99), pp. 854-855, Baltimore, Maryland, USA, January 1999.
  4. Brenda S. Baker. On Finding Duplication in Strings and Software. Journal of Algorithms, 1993.
  5. Brenda Baker. On Finding Duplication and Near-Duplication in Large Software Systems. In Proceedings of the Second Working Conference on Reverse Engineering (WCRE'95), pp. 86-95, Toronto, Ontario, Canada, July 1995.
  6. Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lague, Kostas Kontogiannis. Measuring Clone Based Reengineering Opportunities. In Proceedings of the 6th International Software Metrics Symposium (METRICS'99), pp. 292-303, Boca Raton, Florida, USA, November 1999.
  7. Hamid Basit, Simon Pugliesi, William Smyth, Andrei Turpin, and Stan Jarzabek. Efficient Token Based Clone Detection with Flexible Tokenization. In Proceedings of the Joint Meeting of the European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE'07), pp. 513-515, Dubrovnik, Croatia, September 2007.
  8. Ira Baxter, Andrew Yahin, Leonardo Moura, Marcelo Sant Anna. Clone Detection Using Abstract Syntax Trees. In Proceedings of the 14th International Conference on Software Maintenance (ICSM'98), pp. 368-377, Bethesda, Maryland, November 1998.
  9. James Cordy, Thomas Dean, Nikita Synytskyy. Practical Language-Independent Detection of Near-Miss Clones. In Proceedings of the 14th IBM Centre for Advanced Studies Conference (CASCON'04), pp. 1-12, Toronto, Ontario, Canada, October 2004.
  10. M. Datar, N. Immorlica, P. Indyk and V. S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the 20th annual symposium on Computational geometry (SoCG'04), pp. 253-262, Brooklyn, New York, USA, June 2004.
  11. G. A. Di Lucca, M. Di Penta, A. R. Fasolino, and P. Granato. Clone Analysis in the Web Era: an Approach to Identify Cloned Web Pages. In Proceedings of the 7th IEEE Workshop on Empirical Studies of Software Maintenance (WESS'99), pp. 107-113, Florence, Italy, November 2001.
  12. Stéphane Ducasse, Oscar Nierstrasz, and Matthias Rieger. Lightweight detection of duplicated code: a language-independent approach. Technical report, University of Bern, Institute of Computer Science and Applied Mathematics, Bern, Switzerland, February 2004.
  13. Stéphane Ducasse, Matthias Rieger, Serge Demeyer. A Language Independent Approach for Detecting Duplicated Code. In Proceedings of the 15th International Conference on Software Maintenance (ICSM'99), pp. 109-118, Oxford, England, September 1999.
  14. Susan T. Dumais. Latent Semantic Indexing (LSI) and TREC-2. In Proceedings of the 2nd Text Retrieval Conference (TREC'94), pp. 105-115, Gaithersburg, Maryland, March 1994.
  15. William Evans and Christopher Fraser. Clone Detection via Structural Abstraction. In Proceedings of the 14th Conference on Reverse Engineering (WCRE'07), Vancouver, BC, Canada, October 2007 (to appear, available as Technical Report since August 2005).
  16. Keith Gallagher, Lucas Layman. Are Decomposition Slices Clones? In Proceedings of the 11th IEEE International Workshop on Program Comprehension (IWPC'03), pp. 251-256, Portland, Oregon, USA, May 2003.
  17. Kevin Greenan. Method-Level Code Clone Detection on Transformed Abstract Syntax Trees using Sequence Matching Algorithms. Student Report, University of California -Santa Cruz, Winter 2005.
  18. Lingxiao Jiang, Ghassan Misherghi, Zhendong Su, and Stephane Glondu. DECKARD: Scalable and Accurate Tree-based Detection of Code Clones. In Proceedings of the 29th International Conference on Software Engineering (ICSE'07), pp. 96-105, Minnesota, USA, May 2007.
  19. J Howard Johnson. Identifying Redundancy in Source Code Using Fingerprints. In Proceedings of the 1993 Conference of the Centre for Advanced Studies (CASCON'93), pp. 171-183, Toronto, Canada, October 1993.
  20. Raghavan Komondoor and Susan Horwitz. Using Slicing to Identify Duplication in Source Code. In Proceedings of the 8th International Symposium on Static Analysis (SAS'01), Vol. LNCS 2126, pp. 40-56, Paris, France, July 2001.
  21. Raghavan Komondoor. Automated Duplicated-Code Detection and Procedure Extraction. Ph.D. Thesis, 2003.
  22. K. Kontogiannis, M. Galler, and R. DeMori. Detecting code similarity using patterns. In Working Notes of 3rd Workshop on AI and Software Engineering, 6 pp., Montreal, Canada, August 1995.
  23. Rainer Koschke, Raimar Falke and Pierre Frenzel. Clone Detection Using Abstract Syntax Suffix Trees. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE'06), pp. 253-262, Benevento, Italy, October 2006.
  24. Jens Krinke. Identifying Similar Code with Program Dependence Graphs. In Proceedings of the 8th Working Conference on Reverse Engineering (WCRE'01), pp. 301-309, Stuttgart, Germany, October 2001.
  25. Chao Liu, Chen Chen, Jiawei Han and Philip S. Yu. GPLAG: Detection of Software Plagiarism by Program Dependence Graph Analysis. In the Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06), pp. 872-881, Philadelphia, USA, August 2006.
  26. Zhenmin Li, Shan Lu, Suvda Myagmar, Yuanyuan Zhou. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code. In Proceedings of the 6th Symposium on Operating System Design and Implementation (OSDI'04), pp. 289-302, San Francisco, CA, USA, December 2004.
  27. Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code. In IEEE Transactions on Software Engineering, Vol. 32(3): 176-192, March 2006.
  28. Andrian Marcus and Jonathan I. Maletic. Identification of high-level concept clones in source code. In Proceedings of the 16th IEEE International Conference on Automated Software Engineering (ASE'01), pp. 107-114, San Diego, CA, USA, November 2001.
  29. Jean Mayrand, Claude Leblanc, Ettore Merlo. Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics. In Proceedings of the 12th International Conference on Software Maintenance (ICSM'96), pp. 244-253, Monterey, CA, USA, November 1996.
  30. J.-F. Patenaude, E. Merlo, M. Dagenais, and B. Lague. Extending software quality assessment techniques to Java systems. In Proceedings of the 7th International Workshop on Program Comprehension (IWPC'99), pp. 49-56, Pittsburgh, PA, USA, May 1999.
  31. Aoun Raza, Gunther Vogel, Erhard Plödereder. Bauhaus – A Tool Suite for Program Analysis and Reverse Engineering. In Proceedings of the 11th Ada-Europe International Conference on Reliable Software Technologies, LNCS 4006, pp. 71-82, Porto, Portugal, June 2006.
  32. Robert Tairas, Jeff Gray. Phoenix-Based Clone Detection Using Suffix Trees. In Proceedings of the 44th annual Southeast regional conference (ACM-SE'06), pp. 679-684, Melbourne, Florida, USA, March 2006.
  33. V. Wahler, D. Seipel, Jurgen Wolff von Gudenberg, and G. Fischer. Clone detection in source code by frequent itemset techniques. In Proceedings of the 4th IEEE International Workshop Source Code Analysis and Manipulation (SCAM'04), pp. 128-135, Chicago, IL, USA, September 2004.
  34. Wuu Yang. Identifying syntactic differences between two programs. In Software Practice and Experience, 21(7):739-755, July 1991.
  35. S. W. L. Yip and T. Lam. A software maintenance survey. In Proc. of the 1st Asia-Pacific Software Engineering Conference, pages 70–79, Dec 1994.
  36. ISO/IEC. Software Engineering - Software Maintenance. ISO/IEC 14764, 1999.
  37. L. Arthur. Software Evolution: The Software Maintenance Challenge. Wiley 1988.
  38. Duala-Ekoko, Ekwa, and Martin P. Robillard. "Clonetracker: tool support for code clone management." Proceedings of the 30th international conference on Software engineering. ACM, 2008.
  39. Sonam Gupta, Dr. P. C. Gupta, "Clones: A Survey", International Journal of Computer Science and Technology Vol. 3, Issue 3, July - Sept 2012.
  40. Juergens, Elmar, Florian Deissenboeck, and Benjamin Hummel. "CloneDetective - A workbench for clone detection research." Proceedings of the 31st International Conference on Software Engineering. IEEE Computer Society, 2009.
  41. Kawaguchi, Shinji, et al. "Shinobi: A tool for automatic code clone detection in the IDE." Reverse Engineering, 2009. WCRE'09. 16th Working Conference on. IEEE, 2009.
  42. De Wit, Michiel, Andy Zaidman, and Arie Van Deursen. "Managing code clones using dynamic change tracking and resolution." Software Maintenance, 2009. ICSM 2009. IEEE International Conference on. IEEE, 2009.
  43. Nguyen, Hoan Anh, et al. "Clone management for evolving software." Software Engineering, IEEE Transactions on 38.5 (2012): 1008-1026.
Index Terms

Computer Science
Information Sciences

Keywords

Clones, maintenance, program dependence graph, tree-based approach, false positives, hybrid approach