Script matching comments of a git repository according to a predefined set of patterns.

robertoverdecchia/code-comment-filter


Code comment filter

A script that filters the comments found in the source code of a git repository according to a predefined set of patterns.

Dependencies

This script relies on the following packages:

  • GitPython==2.1.5
  • comment-parser==1.0.3

To install the dependencies, run: pip install -r requirements.txt

Usage

From the root directory execute: python parse.py

Input

The script takes as input the file patterns.txt, in which the patterns to be matched are specified.
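
As an illustration of how such a file could be read, here is a minimal sketch. It assumes that patterns.txt holds one pattern per line and that blank lines can be ignored; the actual format expected by parse.py may differ. The function name load_patterns is hypothetical and is not taken from the script.

```python
from pathlib import Path


def load_patterns(path="patterns.txt"):
    """Return the non-empty lines of the patterns file, stripped of whitespace."""
    # Assumption: one pattern per line; blank lines carry no meaning.
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Calling load_patterns() from the root directory would then return the patterns as a list of strings.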

Output

The output of the script is stored in the file output_parsing.tsv, which contains the source code comments matching the predefined patterns. The three columns of the output file are:

  • File name: Location of the source code file in which the matched comment appears
  • Keyword: Pattern keyword(s) contained in the matched comment
  • Comment: Content of the matched source code comment
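
The three columns can be illustrated with a short sketch that builds such rows. This is not the code of parse.py: it assumes case-insensitive substring matching and joins multiple keywords with a comma, and the names matching_keywords and write_matches, as well as the sample file names and comments, are hypothetical.

```python
import csv
import io


def matching_keywords(comment, patterns):
    """Return the patterns that occur in the comment (case-insensitive substring match)."""
    lowered = comment.lower()
    return [p for p in patterns if p.lower() in lowered]


def write_matches(records, patterns, out):
    """Write one tab-separated row per comment that contains at least one pattern.

    records is an iterable of (file_name, comment) pairs; out is a writable text stream.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for file_name, comment in records:
        keywords = matching_keywords(comment, patterns)
        if keywords:
            # Assumption: several keywords share the Keyword column, joined with ", ".
            writer.writerow([file_name, ", ".join(keywords), comment])


# Hypothetical sample data; only the first comment contains a pattern.
buffer = io.StringIO()
write_matches(
    [("src/Main.java", "TODO: this is a hack"), ("src/Util.java", "Returns the sum")],
    ["hack", "fixme"],
    buffer,
)
print(buffer.getvalue(), end="")
```

With the sample data above, a single row is written for src/Main.java, with hack in the Keyword column. The second comment matches no pattern and is skipped.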

Notes

  • The git repository to be analyzed is currently hardcoded in the script. Change the variable git_repository_url to utilize a different repository.

  • The language of the repository has to be specified in the MIME type variable MIME. For the mapping of languages to MIME types refer to the documentation of the comment_parser package.

  • The extension(s) of the files to be considered during parsing have to be specified in the variable extensions.

  • Currently supported languages:

    • C
    • C++
    • Go
    • Java
    • JavaScript
    • Bash/Sh

Credits and license

Author:

Sample patterns were taken from the dataset of the research "An Exploratory Study on Self-Admitted Technical Debt" by Potdar et al., available here.

This project is licensed under the MIT License - see the file license.txt
