This table compares the features offered by various tools that aid with the detection of source-code plagiarism.
| | JPlag | Moss | Sherlock | CodeMatch | Copy/Paste Detector (CPD) |
|---|---|---|---|---|---|
| System started in | 1997 | 1994 | 1994 | 2005 | 2003 |
| Language(s) supported | Java, C#, C, C++, Scheme and natural language text | C, C++, Java, C#, Python, Visual Basic, JavaScript, FORTRAN, ML, Haskell, Lisp, Scheme, Pascal, Modula2, Ada, Perl, TCL, Matlab, VHDL, Verilog, Spice, MIPS assembly, a8086 assembly, HCL2 | Java, C, C++, natural language text | BASIC, C, C++, C#, Delphi, Flash ActionScript, Java, JavaScript, MASM, Pascal, Perl, PHP, PowerBuilder, Ruby, SQL, Verilog, VHDL | Java, JSP, C, C++, Fortran and PHP code |
| Cost | Free but user must create an account | Free but user must create an account | Free and open source | Commercial tool, free when the files being examined total less than 1 megabyte | Free and open source |
| Service | Web service | Internet service | Standalone application | Standalone application | Standalone application |
| Interface | GUI | Web interface | GUI | GUI | GUI |
| Requirements | Web browser, Java Runtime Environment (JRE) 1.5 or higher | A submission script for either UNIX or Windows | JDK 1.4 or later | | JDK 1.4 or later |
| Security | User id and e-mail needed | User id and e-mail needed | Runs locally | Runs locally | Runs locally |
| Submission methods | Standalone Java software application that can be deployed over the network | Command line | Standalone Java application | Standalone application | Standalone application |
| Source data | Files, or a directory with or without subdirectories | Files, or a directory with or without subdirectories | A directory containing subdirectories, each containing one or more zip archives or submissions | Files, or a directory with or without subdirectories | Files, or a directory with or without subdirectories |
| Option for excluding files | Yes | Yes | Yes | Yes | No |
| Speed | Fast | Fast | Large sets of files take a long time to analyse | Large sets of files take a long time to analyse | Fast |
| Result | | | | | |
| Result output | HTML output for exploring the similar code fragments found | Both text and HTML reports | Interactive dialogues for comparing detected similarities and printing reports | HTML report, and spreadsheet showing statistical information about the files that were analysed | Text report, XML file, CSV file |
| Result storage | Local | Remote | Local | Local | Local |
| Overview results display method | Histogram, statistics about the files that were analysed | Ordered list | Matching pair tree | Ordered list | Ordered list |
| Display of results | Powerful graphical interface for presenting results | Powerful graphical interface for presenting results | Powerful graphical interface for presenting results | Graphical user interface presenting the results | Results displayed in a dialogue |
| Visualisation of results | Cross-linked listings, scroll bars | | 2/4 scroll bars, simultaneous scrolling | Table showing the similar lines of code detected among file pairs | List of results showing the detected code fragments between file pairs |
| Visual display of matched file pairs? | Yes | Yes | Yes | No | No |
| Display of suspicious code fragments | Highlights the suspicious source-code fragments | Highlights the suspicious source-code fragments | Highlights the suspicious source-code fragments | Returns suspicious lines of code; no option to view entire suspicious code fragments or the entire suspicious files | Returns duplicate lines of code |
| Metrics produced | Percentage similarity, token matches | Percentage similarity, token matches, lines matched | Percentage similarity, token matches | Percentage similarity, lines matched | Token matches, lines matched |
| Other | | | | | |
| Other features | | | Checkbox to mark the suspicious pairs | | |
| Miscellaneous | Several detection parameter settings, including sensitivity | Self-adjusting | Several detection parameter settings, including sensitivity | Some detection parameter settings | Some detection parameter settings |
| Algorithms | Greedy String Tiling | Winnowing algorithm | Token-based matching algorithm for source code, string matching for natural language texts | String matching | Greedy String Tiling |
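
To give a feel for what the "Winnowing algorithm" entry in the Algorithms row refers to, the following is a minimal Python sketch of winnowing as it is usually described: hash every k-character substring, slide a window of w consecutive hashes over them, and keep the smallest hash in each window (the rightmost one on ties) as a fingerprint. This is not Moss's actual implementation. The function names (`kgram_hashes`, `winnow`, `similarity`), the default values of `k` and `w`, the SHA-1 based hash, the whitespace and case normalisation, and the Jaccard-style score are all assumptions chosen for illustration.

```python
import hashlib


def kgram_hashes(text, k):
    """Return one hash per k-character substring of the normalised text."""
    # Assumption: normalise by removing whitespace and lower-casing.
    # Real tools typically tokenise the source code instead.
    normalised = "".join(text.split()).lower()
    return [
        # Assumption: first 4 bytes of SHA-1 stand in for a rolling hash.
        int.from_bytes(
            hashlib.sha1(normalised[i:i + k].encode("utf-8")).digest()[:4], "big"
        )
        for i in range(len(normalised) - k + 1)
    ]


def winnow(hashes, w):
    """Pick the minimum hash of every window of w hashes as a fingerprint.

    Ties are broken by taking the rightmost minimum. Each fingerprint is
    stored as (hash, position), so a minimum shared by overlapping windows
    is recorded only once.
    """
    if not hashes:
        return set()
    span = min(w, len(hashes))  # inputs shorter than w form a single window
    fingerprints = set()
    for start in range(len(hashes) - span + 1):
        window = hashes[start:start + span]
        smallest = min(window)
        # Index of the rightmost occurrence of the minimum inside the window.
        offset = len(window) - 1 - window[::-1].index(smallest)
        fingerprints.add((smallest, start + offset))
    return fingerprints


def similarity(a, b, k=5, w=4):
    """Jaccard overlap of the two fingerprint sets, between 0.0 and 1.0."""
    fa = {h for h, _ in winnow(kgram_hashes(a, k), w)}
    fb = {h for h, _ in winnow(kgram_hashes(b, k), w)}
    if not fa and not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


if __name__ == "__main__":
    original = "for (int i = 0; i < n; i++) { total += values[i]; }"
    reformatted = "for(int i=0;i<n;i++){\n    total += values[i];\n}"
    unrelated = "while (queue.size() > 0) { process(queue.pop()); }"
    print(similarity(original, reformatted))
    print(similarity(original, unrelated))
```

Because only a subset of the hashes is kept, two submissions can be compared through their fingerprint sets rather than character by character. Multiplying the returned score by 100 gives one possible "percentage similarity" figure, although each tool in the table computes its own metric in its own way.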