
Measure quality metrics on file triplets #520

Open
vmarkovtsev opened this issue Jan 10, 2019 · 3 comments
Labels: enhancement (New feature or request), large (Large size), refactor

vmarkovtsev (Collaborator) commented Jan 10, 2019

Given a modified file and a ground-truth fixed file, we apply the model to generate a predicted file, throw away everything except those three files, and measure quality metrics.
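The evaluation described above can be pictured as a small function over the three files. This is a hypothetical sketch, not the project's actual metric code; it assumes pure style fixes keep the line count unchanged, so the files can be compared line by line:

```python
def triplet_metrics(modified, predicted, ground_truth):
    # Hypothetical line-level metrics over a file triplet; assumes the
    # three files have the same number of lines (true for style-only fixes).
    m, p, g = (s.splitlines() for s in (modified, predicted, ground_truth))
    assert len(m) == len(p) == len(g)
    tp = fp = fn = 0
    for ml, pl, gl in zip(m, p, g):
        should_change = ml != gl
        did_change = ml != pl
        if did_change and pl == gl:
            tp += 1  # detected and correctly fixed
        elif did_change:
            fp += 1  # changed, but to the wrong contents
        elif should_change:
            fn += 1  # a needed fix was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Only the triplet is needed as input, which is the point of the issue: the model itself can be thrown away once the predicted file exists.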

@vmarkovtsev vmarkovtsev added enhancement New feature or request refactor labels Jan 10, 2019
@vmarkovtsev vmarkovtsev added this to the Refactoring January 2019 milestone Jan 10, 2019
@vmarkovtsev vmarkovtsev added the large Large size label Jan 10, 2019
EgorBu commented Jan 11, 2019

Some discussion is required.

A bit of context:
We have 3 types of reports:

  • quality report
    • measures how well the analyzer can reconstruct the original source code (a drawback of this approach: if the analyzer does nothing, the reconstruction is perfect). It could be used as a rough measurement of source code consistency.
  • quality report noisy
    • evaluates how well a given model fixes style mistakes randomly added to a repository; the noise is introduced by hand so far
    • counts how many errors were corrected by the analyzer
    • precision-recall curve with logic that should help to estimate the confidence cut-off for rules
  • quality report smoke - SmokeEvalFormatAnalyzer consists of 2 parts
    • generate_smoke.py - a module that generates several types of mutations in JS files using regexps
    • evaluate_smoke.py - classifies each fix as misdetection, undetected, detected_wrong_fix, or detected_correct_fix

Quality report noisy requires using the model itself (to check quality with a different number of rules and so on), so it looks like it can't be used with triplets.
The opposite holds for quality report smoke and quality report: they need the model only to generate new contents, so they can be refactored to use file triplets (with FileFix).

TODO list may look like this:

  1. refactor quality report smoke to utilize ReportAnalyzer(FormatAnalyzerSpy)
    1.1) OPTIONAL: utilize the tokenizer and a check for UAST-breaking changes to generate new mutants (example) instead of regexps - the variety of violations will be much bigger, and the number of mutants will be n1 * n2 * ... * nk (where ni is the number of mutations at place i that don't change the UAST - every Noop and every space|newline|tab|quote) - and that is huge.
  2. add a noise introduction step, or use [ptr_ground_truth, ptr_mutant] instead of the single ptr used right now:
# noise could be introduced before providing data_service
for file_fix in self.generate_file_fixes(data_service, changes):
    filepath = file_fix.head_file.path
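The noise introduction step in item 2 could be prototyped as a generator that rewrites the head files before the analyzer sees them. The File/Change types below are simplified stand-ins for lookout's actual data structures, and add_noise is a toy placeholder:

```python
from dataclasses import dataclass

# Simplified stand-ins for lookout's file/change types (names are assumptions).
@dataclass
class File:
    path: str
    content: str

@dataclass
class Change:
    head: File

def add_noise(content: str) -> str:
    # Toy noise function: drop spaces around "=", a typical style mutation.
    return content.replace(" = ", "=")

def with_noise(changes, noise_fn=add_noise):
    # Mutate each head file before it is handed to generate_file_fixes(),
    # so the (ground truth, mutant, prediction) triplet can be scored later.
    for change in changes:
        change.head = File(change.head.path, noise_fn(change.head.content))
        yield change
```

Keeping the original contents around as ptr_ground_truth alongside the mutant would give exactly the [ptr_ground_truth, ptr_mutant] pair mentioned above.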

WDYT @vmarkovtsev , @zurk ?


zurk commented Jan 17, 2019

> Quality report noisy requires usage of model ... so it looks like that it can't be used with triplets.

I think the noisy report should not require running the model, because all we need are the rule confidences, and those are stored in the model. To build the curves we should just use many file triplets instead of one. It is the easiest solution; it can be time-consuming, but since we do not run it frequently, it should be fine, I think.
And in case some information is missing from the model, we can consider adding it.
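Building the curve from stored rule confidences could look like the sketch below. The input format is an assumption: one (confidence, is_correct) pair per suggested fix, collected over many triplets:

```python
def precision_recall_curve(fixes):
    """fixes: (rule_confidence, is_correct) pairs for every suggested fix.

    Sweeping a confidence threshold over the sorted fixes yields one
    (threshold, precision, recall) point per fix, without re-running
    the model.
    """
    fixes = sorted(fixes, key=lambda f: -f[0])
    total_correct = sum(ok for _, ok in fixes)
    curve, tp = [], 0
    for rank, (confidence, ok) in enumerate(fixes, start=1):
        tp += ok
        precision = tp / rank
        recall = tp / total_correct if total_correct else 0.0
        curve.append((confidence, precision, recall))
    return curve
```

The threshold at the knee of this curve would then serve as the confidence cut-off for rules mentioned in the noisy report description.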

> refactor quality report smoke to utilize ReportAnalyzer(FormatAnalyzerSpy).

Yeah, I think this should be done. One issue with ReportAnalyzer that I discovered is that if you want an arbitrary number of reports, you have to modify a lot of code. So, first, I'd prefer to polish ReportAnalyzer itself and make it more usable.

> 1.1) OPTIONAL: utilize tokenizer and checking for UAST breaking changes to generate new mutants

The idea of a smoke dataset is to have a dataset of styles in the first place. If you do random style mutations, you lose that overall idea. Random style mutations are closer to the noisy dataset (even by name), or they can be used as an independent one. I have seen big potential in this idea from the beginning, but let's not mix it with the smoke dataset.

I suggest separating this issue (about creating a uniform tool/class to measure quality metrics on file triplets) from the noise introduction step, because they are about different things. The noise introduction step is a new feature, and we should avoid adding more functionality during the refactoring stage.
Let's create another issue for it and focus on the main problem here.


vmarkovtsev commented Jan 17, 2019

(no time to read for now)

If this grows into a huge task, we should do it after resolving the other, smaller issues which were planned.
