This is an Eclipse plug-in that allows running JDeodorant in batch mode to identify refactoring opportunities and apply them.

Running the headless mode within Eclipse

You can run this application from within Eclipse. Please follow these steps:

1. Download (or clone) jdeodorant-commandline and the JDeodorant plug-in, and import them as existing projects into your Eclipse workspace.
2. Right-click on the JDeodorant-Commandline project and select Run As > Run Configurations...
3. Click on Eclipse Application and then on the New launch configuration button. Give the newly created launch configuration a name.
4. In the Main tab:
   - Under Workspace Data, set the Location to point to the workspace containing the projects that you want to analyze in the headless mode. The projects to be analyzed will be opened in this workspace. There are two ways to open the Java projects that you are going to analyze in the workspace:
     - The workspace directory is created by Eclipse. In this case, create it by clicking on File > Switch Workspace and specifying a new workspace directory, and then create a new project in Eclipse (or import an existing one). You can import multiple projects that you want to analyze. When you are done, switch back to the original workspace where the JDeodorant and jdeodorant-commandline plug-ins are imported.
     - You can ask the tool to try importing an existing Eclipse project automatically. In this case, the tool creates the workspace in the given path and imports the project into it. You need to use the -pd switch to specify the path to the .project file of the project (see the table in the Command-line arguments section).

     Note that, in either case, Eclipse project files must exist for the Java project that you want to analyze.
   - Under Program to Run, select Run an application, and from the drop-down list select ca.concordia.jdeodorant.eclipse.commandline.application.
5. In the Arguments tab, specify the Program arguments (refer to the table in the Command-line arguments section).
6. Next, specify the VM arguments as -Xms128m -Xmx4096m -XX:PermSize=128m (you can increase the Xmx value if more memory is available).
7. In the Plug-ins tab, first select "plug-ins selected below only" in the Launch with: drop-down list. Then select ca.concordia.jdeodorant.eclipse.commandline (1.0.0.qualifier) and click on the Add Required Plug-ins button.
8. Apply the changes to save the new launch configuration. Click Run to test whether the headless plug-in works properly. If you get BundleExceptions, go back to the Plug-ins tab (step 7) and select Launch with: "all workspace and enabled target plug-ins". Apply the changes and run the headless plug-in again.

Running as a standalone command-line application

We have provided the necessary means for generating an Eclipse product that can be run from the OS command-line as a standalone executable, without the need for opening Eclipse for running. This is particularly useful if, for instance, one needs to integrate JDeodorant in their current development workflow (e.g., using continuous integration).

The Eclipse product is an executable file along with the necessary plug-in dependencies. The entire package can be generated by Eclipse, one for each platform. We have tested the product on Windows and Mac.

To generate the executable for your target platform, follow these steps:

1. Download (or clone) jdeodorant-commandline and the JDeodorant plug-in, and import them as existing projects into your Eclipse workspace (you need Eclipse only to generate the Eclipse product, which then runs from the OS command line).
2. In the commandline project, double-click on ProductConfiguration.product. The Product Configuration Editor should open (if it does not, your Eclipse installation might be missing necessary plug-ins; we tested on Eclipse IDE for Java EE Developers).
3. If you need to configure the generated product, you can use the Configuration and Launching tabs, which allow changing parameters of the generated product for different platforms. For instance, you might want to change the eclipse.ini file that the target product will use, or provide additional VM arguments.
4. From the Overview tab, under the Exporting section, choose Eclipse Product Export Wizard.
5. In the wizard, /JDeodorant-Commandline/ProductConfiguration.product should be selected as the Configuration. Specify a directory under the Destination section, and click Finish.
6. A folder containing the final Eclipse product will be created. Look for the file \eclipse\eclipse.exe or \MacOS\eclipse, which is the executable for the product.
7. Open a command line, switch to the folder containing the executable file (found in the previous step), and run the product's executable. Provide the necessary arguments, as listed in the table in the Command-line arguments section. For instance, you can run (on Windows):

eclipse.exe -pd "TestProject/.project" -x "clones.xls" -m PARSE_AND_ANALYZE ...
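
When the product is called from a build script or a continuous integration job, the argument list can be assembled programmatically. The following is a minimal Python sketch under stated assumptions: `build_command` and `run` are hypothetical helpers (not part of the tool), the executable name is whatever the export wizard produced, and only options listed in the command-line arguments table are used.

```python
import subprocess
from pathlib import Path


def build_command(executable, project_file, excel_file, mode, extra=()):
    """Assemble the argument list for the standalone product.

    Only options documented in the command-line arguments table are used.
    """
    return [
        str(Path(executable)),
        "-pd", str(project_file),  # path to the .project file to import
        "-x", str(excel_file),     # input (or output) .xls file
        "-m", mode,                # PARSE, ANALYZE_EXISTING or PARSE_AND_ANALYZE
        *extra,                    # any further documented options
    ]


def run(command):
    """Run the product and return the process exit code (not called here)."""
    return subprocess.run(command, check=False).returncode


# ANALYZE_EXISTING only needs the Excel file besides the project.
cmd = build_command("eclipse.exe", "TestProject/.project", "clones.xls",
                    "ANALYZE_EXISTING", extra=["-l"])
print(" ".join(cmd))
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with paths that contain spaces.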

Command-line arguments

These arguments can be passed in step 5 (the Arguments tab, when running the headless mode within Eclipse) or in step 7 (when running the product's executable in standalone mode).

Long option	Short option	Arguments	Description
--help	-?		Displays arguments and their explanations
--mode	-m	`analyze_existing` `parse_and_analyze` `parse`	Mode of operation. See below for more information
--project	-p	{project name}	Name of the project which currently exists in the Eclipse workspace
--project-description	-pd	{.project file}	Alternative to `-p`; Path to the `.project` file of the eclipse project to be imported to the workspace
--excelfile	-x	{path/to/the/xls/file}	Path to the input (output, in the `PARSE` mode) .xls file
--tool	-t	`clone_tool_ccfinder` `clone_tool_clonedr` `clone_tool_conqat` `clone_tool_deckard` `clone_tool_nicad`	Specifies the clone detection tool
--tooloutputfile	-i	{path/to/the/input/file}	Path to the main output file of the clone detection tool
--extra-args	-xargs	{arg1, arg2, ...}	Comma-separated list of extra arguments that some clone detection tools require. See below for more information.
--row-start-from	-r	{row}	Specifies the row number (starting from 2; row 1 is the header) from which the tool must start the analysis.
--append-results	-a		Specifies whether new results are appended to the existing outputs (Excel file, CSV files) or overwrite them.
--skip-groups	-s	{group_id1, group_id2, ...}	A comma-separated list of clone group IDs to be skipped in the analysis.
--test-packages	-testpkgs	{package1, package2, ...}	A comma-separated list of the fully-qualified names of the packages containing test code.
--test-source-folders	-testsrcs	{folder1,folder2,...}	A comma-separated list of the names of the source folders containing test code. This is similar to the previous argument.
--run-tests	-rt		Run tests after applying each refactoring.
--log-to-file	-l		Create a log file from console output.
--group-ids	-g	{id1, id2, id3, ...}	A comma-separated list of clone group IDs to be analyzed. Other clone groups in the file will be skipped
--debugging-enabled	-de		Prevents the Eclipse command-line tool from canceling jobs queued in the Eclipse JobManager (such as workbench jobs), so that debugging is possible in Eclipse
--mail-server-ip	-msrvr	{Mail server address} (default: 127.0.0.1)	Email server for sending emails after the analysis has finished
--mail-server-port	-mport	{Mail server port} (default: 25)	Email server port, see previous option
--mail-server-security-type	-msectype	`NONE` `SSL` `STARTLS`	Security type for mail server
--mail-server-authenticated	-mauth		Whether the SMTP server requires authentication
--mail-server-user-name	-muser	{Mail server user name}	SMTP user name
--mail-server-password	-mpass	{Mail server password}	SMTP password
--email-addresses	-em	{email1, email2, ...}	A comma-separated list of email addresses to which the analysis notifications should be sent

Note: The bold-faced options are mandatory. Italic arguments are default values.

Mode of Operation

The headless application works in three different modes, which are explained in the following table. To run the tool in one of these modes, pass the appropriate value to the --mode (or -m) argument.

Value for `--mode` argument	Description
`PARSE`	In this mode, the output file of a clone detection tool is parsed into an Excel file. You must give the path to the Excel file using the `--excelfile` (or `-x`) argument. You must also provide the name of the clone detection tool (using the `--tool` argument), the path to the input file (the output of the clone detection tool, using the `-i` argument), and, for some specific clone detection tools, extra arguments (using `-xargs`). See below for more info.
`ANALYZE_EXISTING`	In this mode, the tool analyzes an existing Excel file. Again, the path to the Excel file must be given using the `--excelfile` (or `-x`) argument. The results of the analysis are written to the same folder as the input Excel file.
`PARSE_AND_ANALYZE`	This mode first parses the output of the clone detection tool, and then analyzes the parsed Excel file. All the arguments required in the `PARSE` mode must also be provided in this mode.
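
The requirements in the table above can be checked before launching the tool. This is a small Python sketch, not the tool's own validation: `missing_arguments` is a hypothetical helper, the option names are the long options without the leading dashes, tool-specific extra arguments are left out, and upper-casing the mode for the lookup is a convenience of this sketch only.

```python
# Required long options per mode, as read from the mode table above.
PARSE_REQUIRED = {"excelfile", "tool", "tooloutputfile"}
REQUIRED_BY_MODE = {
    "PARSE": PARSE_REQUIRED,
    "ANALYZE_EXISTING": {"excelfile"},
    "PARSE_AND_ANALYZE": PARSE_REQUIRED,
}


def missing_arguments(mode, provided):
    """Return the sorted option names that `mode` needs but `provided` lacks."""
    try:
        required = REQUIRED_BY_MODE[mode.upper()]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
    return sorted(required - set(provided))


print(missing_arguments("PARSE", {"excelfile", "tool"}))     # ['tooloutputfile']
print(missing_arguments("analyze_existing", {"excelfile"}))  # []
```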

The input (and output) Excel files

The input Excel file must be in Excel 97-2003 (.xls) format; the tool cannot handle .xlsx files. The first row of the Excel file is used as the header row. For the analysis, the input Excel file must already contain values for some of the columns, while the cells of the other columns are filled in during the analysis.

In the Excel file, each row represents one clone. Each clone is a code fragment that is detected to be duplicated in another part of the system. Several clones in consecutive rows belong to one clone group. Hence, every possible pair of clones inside a clone group is a pair of duplicated code fragments. The row corresponding to the first clone of every clone group contains some information about the clone group, including values for the Clone Group Size, Clone Group Info and Connected columns.

Column	Description
Clone Group ID	An integer assigned to every clone group. For all the clones inside one clone group, the value of this cell is the same: the ID of the clone group to which these clones belong.
Source Folder	The source folder of the class file to which this clone belongs.
Package	Fully qualified path to the package of the class file to which this clone belongs.
Class	Name of the class file to which this clone belongs.
Method	Name of the method in which this clone exists. Note that there is currently no support for clones that lie outside the boundaries of methods.
Method Signature	Signature of the method in which this clone exists, in the Bytecode format.
Start Line, End Line, Start Offset, End Offset	Starting and ending lines and offsets of the clone fragment.
#PDG Nodes	Number of PDG nodes in the method in which this clone exists. This column will be filled after analysis on this clone is done.
#Statements	Number of statements in the clone fragment that is reported to be a clone. This column will be filled after analysis on this clone is done.
Line coverage	Percentage of the lines of the code fragment that are covered by unit tests.
Clone Group Size	Number of the clones in the clone group. This value only comes in the first row of the clone group.
Clone Group Info	Type of the clone group. It might be Repeated when the entire clone group is repeated, or Subclone when the clones in this clone group are sub-clones or super-clones of clones in another clone group. In these two cases, our tool will skip the clone group for analysis.
Connected	If the value of the previous cell is Subclone, this cell contains the clone group ID of the clone group of which this clone group is a sub-clone (or super-clone).
Clone Pair Location	Location of the clones in the clone group. Clones could be in the same method, in the same class, or in different classes.
#Refactorable Pairs	Number of refactorable pairs in the clone group, which is calculated after the analysis.
Details	Each pair of clones in every clone group is analyzed by the tool. When the analysis has finished, this column and the following columns in the same row contain hyperlinks to the HTML reports of the analysis of the clone pairs formed by the clone corresponding to this row and all other clones in the same clone group. The name of each hyperlink is in the format `{clone group ID}-{first clone number}-{second clone number}`. If the background color of a cell is `green`, the clone pair corresponding to this cell is refactorable; if it is `red`, the clone pair is not refactorable. A `white` background color shows that the clone is not analyzed. This happens when: (1) a clone is a class-level clone, meaning that the clone reported by the clone detection tool goes beyond the boundaries of a method; (2) a clone is a repeated clone; (3) the user has marked the clone group corresponding to this clone to be skipped (using `--skip-groups` (`-s`)); (4) no method was found in the code region reported by the clone detection tool; or (5) no common nesting structure was found for the clone pair.
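
Because every pair of clones in a group is analyzed, a group of n clones yields n(n-1)/2 report links. The Python sketch below enumerates the link names in the `{clone group ID}-{first clone number}-{second clone number}` format. It assumes that clone numbers start at 1, which the text above does not state; `pair_labels` is a hypothetical helper and `first_index` can be changed if the tool numbers clones differently.

```python
from itertools import combinations


def pair_labels(group_id, group_size, first_index=1):
    """List `{group}-{first}-{second}` labels for every clone pair in a group."""
    indices = range(first_index, first_index + group_size)
    return [f"{group_id}-{a}-{b}" for a, b in combinations(indices, 2)]


labels = pair_labels(group_id=7, group_size=3)
print(labels)  # ['7-1-2', '7-1-3', '7-2-3']
```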

A sample empty Excel file is provided here.

Using the output of clone detection tools

The output of a clone detection tool must be first converted to the desired Excel file. For convenience, we have provided parsers for the popular clone detection tools, as an internal feature in the command-line tool.

When the tool is executed in the PARSE or PARSE_AND_ANALYZE mode, the user has to provide the output file of the clone detection tool using the --tooloutputfile (-i) argument. The name of the clone detector must also be specified using the --tool (-t) argument. For example, the following arguments can be used to parse and analyze the CCFinder output for the Apache Ant project:

-p apache-ant-1.7.0
-x "apache-ant-1.7.0-ccfinder.xls"
-m PARSE_AND_ANALYZE
-t CLONE_TOOL_CCFINDER
-i "ccfinder.ccfxd"
-xargs "C:\Results\CCFinder\apache-ant-1.7.0\src\.ccfxprepdir",""
-testsrcs "src/tests/junit"

For the moment, the tool supports five different clone detection tools, as shown in the table below. The value for the --extra-args (-xargs) argument depends on the clone detection tool and provides the information necessary for parsing the input file. For instance, in the example above we have provided two additional strings through this argument, separated by a comma.

Clone Detection Tool	`--tool` (`-t`)	`--extra-args` (`-xargs`)
CCFinder	CLONE_TOOL_CCFINDER	Path to the special folder that CCFinder generates during analysis (named `.ccfxprepdir`). This folder is located in the examined directory. [optional] Path to the src folder of the project.
Deckard	CLONE_TOOL_DECKARD	Not needed
ConQAT	CLONE_TOOL_CONQAT	Not needed
CloneDR	CLONE_TOOL_CLONEDR	Path to the folder where the analyzed project was initially located (This is important because these tools save absolute paths to the analyzed Java files)
Nicad	CLONE_TOOL_NICAD

Output of the commandline tool

The commandline tool generates an Excel file containing the results of the analysis, with the same name as the input Excel file (with -analyze appended) and in the same path. The HTML reports of the analysis can be found in a folder named html.reports, which is located in the same folder as the input and output Excel files.

When the tool is used to parse the output of a clone detection tool, a folder named code-fragments is created in the same path as the input and output Excel files. It contains the actual code fragments as reported by the clone detection tool. The names of these files are in the format {ID}-{CLONE_NUMBER}, where {ID} is the ID of the clone group to which the clone belongs, and {CLONE_NUMBER} is the clone's index in that clone group. This helps in mapping Excel file rows (clones) to these files.
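
To map a code-fragment file back to its clone, the file name can be split at the hyphen. A minimal Python sketch, assuming both parts are plain integers and tolerating an optional file extension (the text does not say whether the files have one); `parse_fragment_name` is a hypothetical helper:

```python
from pathlib import PurePath


def parse_fragment_name(file_name):
    """Split a `{ID}-{CLONE_NUMBER}` file name into two integers."""
    stem = PurePath(file_name).stem  # drop a directory and an extension, if any
    group_id, _, clone_number = stem.partition("-")
    if not (group_id.isdigit() and clone_number.isdigit()):
        raise ValueError(f"unexpected fragment name: {file_name!r}")
    return int(group_id), int(clone_number)


print(parse_fragment_name("code-fragments/12-3"))  # (12, 3)
```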

For those who are interested in performing statistical analysis using tools such as R or Matlab, the tool generates CSV files containing information gathered during the analysis. These CSV files are explained below. Note that the separator in these files is the pipe ("|") character, and that the first row of each file is a header.
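
Since the separator is a pipe rather than a comma, CSV readers need the delimiter set explicitly. A minimal Python sketch using the standard `csv` module; the in-memory sample (three columns, two made-up rows) only stands in for a real file, and `read_report` is a hypothetical helper:

```python
import csv
import io


def read_report(stream):
    """Read one of the pipe-separated CSV files into a list of dicts."""
    return list(csv.DictReader(stream, delimiter="|"))


# In-memory stand-in for a real file; the two data rows are made up.
sample = io.StringIO("GroupID|PairID|Status\n1|1-2|2\n1|1-3|1\n")
rows = read_report(sample)
print(rows[0]["PairID"])  # 1-2
```

For a file on disk, open it with `open(path, newline="")` and pass the file object to `read_report`. All values are returned as strings and need converting before numeric analysis.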

{INPUT_EXCEL_FILE_NAME}.report.csv

Contains general information about the refactorability analysis results. Every row in this file corresponds to a single clone pair. The columns, in the order they appear in the CSV file, are:

Column Name	Description
GroupID	ID of the clone group of this clone pair
PairID	ID of the clone pair, created by appending clone indices with a hyphen between them
ClonePairLocation	Identifies the relative location of the clones. One of these values: 0: Clones are in the same method; 1: Clones are declared in the same class; 2: Clones are in the same java file; 3: Clones are in different classes having the same super class; 4: Clones are in different classes.
IsTestCode	Identifies whether the clones are test code or not. One of these values: 0: Both clones are production code; 1: First clone is test code and second one is production code; 2: First clone is production code and second one is test code; 3: Both clones are test code.
#StatementsInCloneFragment1 & #StatementsInCloneFragment2	Number of statements (AST nodes) in the clones that were analyzed. Note that this might differ from what was reported by the clone detection tool, as the tool applies filtering on the AST nodes, as discussed in the paper.
#NodeComparisons	Number of node comparisons that were done to assess the refactorability of the clone
#PDGNodesInMethod1 & #PDGNodesInMethod2	Number of PDG nodes in the analyzed method bodies
#RefactorableSubtrees	Number of subtrees in the analyzed methods that can be refactored
SubtreeMatchingWallNanoTime	Time spent in finding the common nesting structures between the compared methods (in nanoseconds)
Status	Identifies the status of the analysis. One of the following values: 0: At least one of the ASTs did not have any nodes, the tool could not find the first or second method in the reported regions, or the tool could not get the body of the first or second method for any reason; 1: The bottom-up subtree matching did not find any common nesting structure, so the mapping phase did not happen; 2: The analysis was done normally.
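
The numeric codes in this file are easier to work with once they are given names. The Python sketch below models the three coded columns as enumerations. The member names are labels chosen for this illustration (the tool itself only writes the numbers), and `decode_row` is a hypothetical helper that expects a row as produced by a CSV reader.

```python
from enum import IntEnum


class ClonePairLocation(IntEnum):
    SAME_METHOD = 0
    SAME_CLASS = 1
    SAME_JAVA_FILE = 2
    DIFFERENT_CLASSES_SAME_SUPERCLASS = 3
    DIFFERENT_CLASSES = 4


class IsTestCode(IntEnum):
    BOTH_PRODUCTION = 0
    FIRST_TEST_SECOND_PRODUCTION = 1
    FIRST_PRODUCTION_SECOND_TEST = 2
    BOTH_TEST = 3


class Status(IntEnum):
    NOT_ANALYZED = 0        # empty AST, method not found, or body unavailable
    NO_COMMON_NESTING = 1   # subtree matching found no common nesting structure
    ANALYZED_NORMALLY = 2


def decode_row(row):
    """Convert the three coded cells of a report row into enum members."""
    return (
        ClonePairLocation(int(row["ClonePairLocation"])),
        IsTestCode(int(row["IsTestCode"])),
        Status(int(row["Status"])),
    )


# Made-up row for illustration only.
example = {"ClonePairLocation": "4", "IsTestCode": "0", "Status": "2"}
location, test_code, status = decode_row(example)
print(location.name, test_code.name, status.name)
```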

{INPUT_EXCEL_FILE_NAME}.trees.csv

For every clone pair, more than one subtree may be found, each of which may or may not be refactorable. This file contains the information about every subtree. The columns, in the order they appear in the CSV file, are:

Column Name	Description
GroupID & PairID	Used to identify to which clone pair this subtree belongs
TreeID	Index of the subtree for this clone pair
CloneType	Type of the clone which could be 1, 2, 3 or Unknown (4)
PDGMappingWallNanoTime	Time spent to map PDG nodes (in nanoseconds)
#PreconditionViolations	Number of precondition violations
#MappedStatements	Number of mapped statements. If this value is more than zero and #PreconditionViolations is zero, the subtree is refactorable
#UnMappedStatements1 & #UnMappedStatements2	Number of unmapped statements in the first and second subtree
#Differences	Number of differences in the mapped statements
RefactoringWasOK	Was the refactoring successful?
TestsFailedAfterRefactoring	Did any tests fail after refactoring?
HadCompileErrorsAfterRefactoring	Did we have compile errors after refactoring?
CloneRefactoringType	Type of the refactoring. One of the following values: 0: Extract local method; 1: Pull up to existing superclass; 2: Pull up to new intermediate superclass extending common internal superclass; 3: Pull up to new intermediate superclass implementing common internal interface; 4: Pull up to new superclass extending common external superclass; 5: Pull up to new superclass implementing common external interface; 6: Pull up to new superclass extending object; 7: Extract static method to new utility class; 8: Infeasible
IsTemplateMethodApplicable	Is template method refactoring applicable for this refactoring?

{INPUT_EXCEL_FILE_NAME}.precondviolations.csv

This file contains information about precondition violations for each subtree, if the subtree was not found to be refactorable, using the traditional . The columns in the order they appear in the CSV files are:

Column Name	Description
GroupID, PairID & TreeID	Identifies to which subtree this precondition violation belongs
PreconditionViolationType	Type of the precondition violation, one of the following values:

0: Expression difference cannot be parameterized
1: Expression difference is field update
2: Expression difference is void method call
3: Expression difference is method call throwing exception within matched try block
4: Infeasible unification due to variable type mismatch
5: Infeasible unification due to missing members in the common superclass
6: Infeasible unification due to passed argument type mismatch
7: Unmatched statement cannot be moved before or after the extracted code
8: Unmatched statement cannot be moved before the extracted code due to control dependence
9: Unmatched break statement
10: Unmatched continue statement
11: Unmatched return statement
12: Unmatched throw statement
13: Unmatched exception throwing statement nested within matched try block
14: Multiple returned variables
15: Unequal number of returned variables
16: Single returned variable with different types
17: Break statement without loop
18: Continue statement without loop
19: Conditional return statement
20: Switch case statement without switch
21: Super constructor invocation statement
22: Super method invocation statement
23: Multiple unmatched statements update the same variable
24: Infeasible refactoring due to uncommon superclass
25: Infeasible refactoring due to zero matched statements
26: Not all possible execution flows end in return
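
A common first step with this file is to count how often each violation type occurs and how many violations each subtree has. The Python sketch below does both with `collections.Counter`. It assumes the header names match the column names above; the helper names are hypothetical, `LABELS` copies only a few entries from the list for brevity, and the sample rows are made up.

```python
from collections import Counter

# Excerpt of the type list above; extend as needed.
LABELS = {
    0: "Expression difference cannot be parameterized",
    9: "Unmatched break statement",
    14: "Multiple returned variables",
}


def count_by_type(rows):
    """Count violations per PreconditionViolationType code."""
    return Counter(int(row["PreconditionViolationType"]) for row in rows)


def count_by_subtree(rows):
    """Count violations per (GroupID, PairID, TreeID) subtree key."""
    return Counter((row["GroupID"], row["PairID"], row["TreeID"]) for row in rows)


# Made-up rows for illustration only.
sample_rows = [
    {"GroupID": "1", "PairID": "1-2", "TreeID": "0", "PreconditionViolationType": "9"},
    {"GroupID": "1", "PairID": "1-2", "TreeID": "0", "PreconditionViolationType": "0"},
    {"GroupID": "2", "PairID": "1-2", "TreeID": "0", "PreconditionViolationType": "9"},
]
by_type = count_by_type(sample_rows)
for code, count in by_type.most_common():
    print(count, LABELS.get(code, f"type {code}"))
```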

{INPUT_EXCEL_FILE_NAME}.compileerrors.csv

This file contains the compile errors found after refactoring is done on each subtree. The file has the following columns:

Column Name	Description
GroupID, PairID & TreeID	Identifies to which subtree this compile error belongs
FileHavingCompileError	Relative path to the file that has compile errors after refactoring

{INPUT_EXCEL_FILE_NAME}.testdifferences.csv

This file contains the tests that failed after refactoring is done on each subtree. The file has the following columns:

Column Name	Description
GroupID, PairID & TreeID	Identifies for which subtree this test difference exists
TestDifference	Name of the test case that is failing after refactoring

{INPUT_EXCEL_FILE_NAME}.exprgapsinfo.csv

This file contains information about the expression differences between the clone pairs for each subtree, i.e., the differences which lead to lambda expressions that have a single expression as their body. The file has the following columns:

Column Name	Description
GroupID, PairID & TreeID	Identifies to which subtree this expression gap belongs
#Params	Number of parameters for the created lambda expression
#ReturnType	Return type of the lambda expression
#ThrownExceptions	Number of the thrown exceptions by the lambda expression
#NonEffectiveFinalVars	Number of non-effectively final variables for which JDeodorant has to make final variables (so that they can be used inside the lambda expression)

{INPUT_EXCEL_FILE_NAME}.blockgapsinfo.csv

This file contains, for each subtree, information about the block gaps, i.e., the gaps for which JDeodorant has to make lambda expressions with a block of statements as their body. The file has the same columns as {INPUT_EXCEL_FILE_NAME}.exprgapsinfo.csv, plus two additional columns, namely #Statements1 and #Statements2, which give the number of statements inside the body of the created lambda expressions for the first and second clones, respectively.
