-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify storage readers #8422
Simplify storage readers #8422
Conversation
4fb53b4
to
31a6b7a
Compare
* of a package using the same version name are used (e.g. multiple scans of a "1.0-SNAPSHOT" version during | ||
* development). | ||
*/ | ||
fun read(packages: Collection<Package>, scannerMatcher: ScannerMatcher): Result<Map<Identifier, List<ScanResult>>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any comment on why this function is actually unused? I mean, as we're pretty much always interested in results of multiple packages, it would seem more efficient to only use this function in order to not apply the scanner matcher for every package we request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially did not use it in the new scanner implementation for simplicity. Also, the new scanner converts package scan results into provenance scan results and reuses them if another package has the same provenance, this approach conflicts with downloading results in bulk. And since the switch to the new scanner two years ago the downloading of stored scan results never seemed to be a performance problem so no one looked into optimizing the performance, e.g. by taking this function into use again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the details. I'll leave it up to you whether you think it's worth putting some of that into the commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added some of it to the commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI @mnonnenmacher, we might have a use-case to add back the API for reading multiple packages at once: Is there is a general problem with some scanner backend, there is no point in continuing reading from it if the first read already fails, and we'd like to be able to fail all of the reads at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking it back into use might require a bigger refactoring of the Scanner
, losing some of the performance optimizations along the way, e.g. not downloading scan results multiple times for packages that share the same provenance.
If the goal is to disable a storage if a certain error occurred, that could be implemented also without introducing such a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that could be implemented also without introducing such a function.
What do you have in mind @mnonnenmacher? Like a special Result
value / exception being throw that signals the caller that no further packages should be tried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, something like that. I think the tricky part would be to decide which kind of error should trigger this behavior. In most cases it might be better to fix the storage than to rescan everything, but that depends on the specific setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the tricky part would be to decide which kind of error should trigger this behavior.
I believe this should simply be signaled via a new special exception that implementations may throw...
Remove obsolete docs from the companion object of `ScanResultsStorage`. This is a fixup for 1ad5357. Signed-off-by: Martin Nonnenmacher <[email protected]>
Signed-off-by: Martin Nonnenmacher <[email protected]>
Merge the read functions of the `PackageBasedScanStorageReader` into one by making the `scannerMatcher` argument nullable. This makes the `read` function without `scannerMatcher` argument redundant. The same change is applied to the `ScanResultsStorage` class where the `read` function implements the filtering, so that classes implementing the `readInternal` function only have to implement the filtering if they can do so more efficiently. Signed-off-by: Martin Nonnenmacher <[email protected]>
31a6b7a
to
af4a747
Compare
Remove the unused `read` function that takes multiple packages as argument. The function is not used anymore since c71628d. The current scanner implementation does other optimizations when using package based storages to avoid downloading multiple results for the same provenance, so it is unlikely that the function in the current form would be taken into use again. Signed-off-by: Martin Nonnenmacher <[email protected]>
…reader Merge the read functions of the `ProvenanceBasedScanStorageReader` into one by making the `scannerMatcher` argument nullable. This makes the `read` function without `scannerMatcher` argument redundant. Signed-off-by: Martin Nonnenmacher <[email protected]>
The `ScanResultsStorage.EMPTY_RESULT` constant was only used in a single place. Signed-off-by: Martin Nonnenmacher <[email protected]>
Move the abstract base class for package based scan storages to the storage package, next to the classes that inherit from it. Signed-off-by: Martin Nonnenmacher <[email protected]>
Rename `ScanResultsStorage` to `AbstractPackageBasedScanStorage`. The old name was introduced before the new scan storage interface, so replace it with a name that better the purpose of the class now. While at it, also fix some comment to reference the base interfaces instead. Signed-off-by: Martin Nonnenmacher <[email protected]>
Rename the abstract storage test classes to better match the names of the related storage interfaces. Signed-off-by: Martin Nonnenmacher <[email protected]>
Rename the package based file and Postgres storage to match the names of their provenance based counterparts. Signed-off-by: Martin Nonnenmacher <[email protected]>
af4a747
to
293a806
Compare
Simplify the storage reader interfaces by merging the read functions, and apply several refactorings as clean up after switching to the new scanner implementation a while ago. This is a preparation for turning scan storages into plugins. See the commit messages for details.