2020 IEEE 14th International Workshop on Software Clones (IWSC)


Anthology ID:
G20-41
Month:
Year:
2020
Address:
Venue:
GWF
SIG:
Publisher:
IEEE
URL:
https://gwf-uwaterloo.github.io/gwf-publications/G20-41
DOI:
Bib Export formats:
BibTeX MODS XML EndNote

pdf bib
SemanticCloneBench: A Semantic Code Clone Benchmark using Crowd-Source Knowledge
Farouq Al-Omari | Chanchal K. Roy | Tonghao Chen

Not only do newly proposed code clone detection techniques, but existing techniques and tools also need to be evaluated and compared. This evaluation process could be done by assessing the reported clones manually or by using benchmarks. The main limitations of available benchmarks include: they are restricted to one programming language; they have a limited number of clone pairs that are confined within the selected system(s); they require manual validation; they do not support all types of code clones. To overcome these limitations, we proposed a methodology to generate a wide range of semantic clone benchmark(s) for different programming languages with minimal human validation. Our technique is based on the knowledge provided by developers who participate in the crowd-sourced information website, Stack Overflow. We applied automatic filtering, selection and validation to the source code in Stack Overflow answers. Finally, we build a semantic code clone benchmark of 4000 clones pairs for the languages Java, C, C# and Python.

pdf bib
Clone Swarm: A Cloud Based Code-Clone Analysis Tool
Venkat Bandi | Chanchal K. Roy | Carl Gutwin

A code clone is defined as a pair of similar code fragments within a software system. While code clones are not always harmful, they can have a detrimental effect on the overall quality of a software system due to the propagation of bugs and other maintenance implications. Because of this, software developers need to analyse the code clones that exist in a software system. However, despite the availability of several clone detection systems, the adoption of such tools outside of the clone community remains low. A possible reason for this is the difficulty and complexity involved in setting up and using these tools. In this paper, we present Clone Swarm, a code clone analytics tool that identifies clones in a project and presents the information in an easily accessible manner. Clone Swarm is publicly available and can mine any open-sourced GIT repository. Clone Swarm internally uses NiCad, a popular clone detection tool in the cloud and lets users interactively explore code clones using a web-based interface at multiple granularity levels (Function and Block level). Clone results are visualized in multiple overviews, all the way from a high-level plot down to an individual line by line comparison view of cloned fragments. Also, to facilitate future research in the area of clone detection and analysis, users can directly download the clone detection results for their projects. Clone Swarm is available online at clone-swarm.usask.ca. The source code for Clone Swarm is freely available under the MIT license on GitHub.

pdf bib
Evaluating Performance of Clone Detection Tools in Detecting Cloned Cochange Candidates
Md Nadim | Manishankar Mondal | Chanchal K. Roy

Code reuse by copying and pasting from one place to another place in a codebase is a very common scenario in software development which is also one of the most typical reasons for introducing code clones. There is a huge availability of tools to detect such cloned fragments and a lot of studies have already been done for efficient clone detection. There are also several studies for evaluating those tools considering their clone detection effectiveness. Unfortunately, we find no study which compares different clone detection tools in the perspective of detecting cloned co-change candidates during software evolution. Detecting cloned co-change candidates is essential for clone tracking. In this study, we wanted to explore this dimension of code clone research. We used six promising clone detection tools to identify cloned and non-cloned co-change candidates from six $C$ and Java-based subject systems and evaluated the performance of those clone detection tools in detecting the cloned co-change fragments. Our findings show that a good clone detector may not perform well in detecting cloned co-change candidates. The amount of unique lines covered by a clone detector and the number of detected clone fragments plays an important role in its performance. The findings of this study can enrich a new dimension of code clone research.