Measure quality metrics on file triplets #520
Some discussion is required. A bit of context — the TODO list may look like this:

```python
# noise could be introduced before providing data_service
for file_fix in self.generate_file_fixes(data_service, changes):
    filepath = file_fix.head_file.path
```

WDYT @vmarkovtsev, @zurk?
I think the noisy report should not require a model, because all we need are the rules' confidences, and we already have them in the model. To build the curves we should just use many file triplets instead of one. It is the easiest solution; it can be time-consuming, but since we do not run it frequently, it should be fine, I think.
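The curve-building step described above could be sketched as follows. This is a minimal sketch with hypothetical names — the real confidences would come from the model's rules, which are not reproduced here; we just assume each predicted fix has been pooled from many file triplets as a `(confidence, is_correct)` pair, and sweep a threshold to obtain precision/recall points:

```python
from typing import Iterable, List, Tuple


def confidence_curve(fixes: Iterable[Tuple[float, bool]],
                     thresholds: Iterable[float]) -> List[Tuple[float, float, float]]:
    """Build (threshold, precision, recall) points from pooled fixes.

    fixes: (rule_confidence, is_correct) pairs gathered from many file
    triplets. For each threshold we keep only fixes whose confidence is
    at least the threshold, then measure precision and recall against
    the total number of correct fixes in the pool.
    """
    fixes = list(fixes)
    total_correct = sum(1 for _, ok in fixes if ok)
    curve = []
    for t in thresholds:
        kept = [(c, ok) for c, ok in fixes if c >= t]
        tp = sum(1 for _, ok in kept if ok)
        precision = tp / len(kept) if kept else 1.0
        recall = tp / total_correct if total_correct else 1.0
        curve.append((t, precision, recall))
    return curve
```

Raising the threshold trades recall for precision, which is exactly the shape of curve the noisy report is meant to expose.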
Yeah, I think this should be done. One issue with
The idea of a smoke dataset is to have a dataset of styles in the first place. If you do random style mutations, you lose that overall idea. Random style mutations are closer to the noisy dataset (even by name), or they can be used as an independent one. I have seen big potential in this idea from the very beginning, but let's not mix it with the smoke dataset.
(no time to read for now) If this grows into a huge task, we should do it after resolving the other, smaller issues which were planned.
Given a modified file and a ground-truth fixed file, we apply the model to generate a predicted file, throw away everything except those three files, and measure quality metrics.
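The triplet evaluation described above could be sketched like this. It is a line-level approximation with hypothetical names, not the project's actual metric code: a predicted line counts as a true positive when the model changed it to exactly the content the ground truth has for that line.

```python
import difflib
from typing import Dict, Optional, Tuple


def _changed_lines(before: str, after: str) -> Dict[int, Optional[str]]:
    """Map each changed line index in `before` to its replacement in `after`
    (None when the line was deleted). Pure insertions are ignored."""
    a, b = before.splitlines(), after.splitlines()
    changes: Dict[int, Optional[str]] = {}
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=a, b=b).get_opcodes():
        if tag in ("replace", "delete"):
            for k, i in enumerate(range(i1, i2)):
                changes[i] = b[j1 + k] if tag == "replace" and j1 + k < j2 else None
    return changes


def triplet_metrics(modified: str, predicted: str,
                    ground_truth: str) -> Tuple[float, float]:
    """Line-level precision and recall for one (modified, predicted,
    ground-truth) file triplet."""
    truth = _changed_lines(modified, ground_truth)
    pred = _changed_lines(modified, predicted)
    tp = sum(1 for i, line in pred.items() if i in truth and truth[i] == line)
    precision = tp / len(pred) if pred else 1.0
    recall = tp / len(truth) if truth else 1.0
    return precision, recall
```

Averaging these per-triplet numbers over many triplets gives the aggregate report; only the three files are needed, exactly as the issue requires.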