Add tests for the propagation of vulnerabilities within git repositories #2140
base: master
/gcbrun
Thank you for this contribution, and apologies for the slow initial review. My review is still ongoing; here's some initial feedback.
I'd like @oliverchang to also review this, as he's much more intimately familiar with the code being tested here, but I will give this my best shot in the meantime.
osv/test_tools/test_repository.py
""" Utilitary class to create a test repository for the git tests
"""

class VulnerabilityType(Enum):
For future-proofing and maintenance, could you try to use `osv.vulnerability_pb2` and `vulnerability_pb2.Event.DESCRIPTOR.fields_by_name` to introspect the OSV record definition, rather than having this standalone definition that could go out of sync with what's defined in `osv.dev/osv/vulnerability.proto`?
Lines 67 to 77 in ad0eed1:
message Event {
  // The earliest version/commit where this vulnerability
  // was introduced in.
  string introduced = 1;
  // The version/commit that this vulnerability was fixed in.
  string fixed = 2;
  // The limit to apply to the range.
  string limit = 3;
  // The last affected version.
  string last_affected = 4;
}
See also https://googleapis.dev/python/protobuf/latest/google/protobuf/descriptor.html
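A minimal sketch of what this suggestion could look like, assuming the generated `Event` class exposes the standard protobuf `DESCRIPTOR.fields_by_name` mapping. The helper name `event_type_enum` and the enum name `EventType` are hypothetical and are not part of the PR or of osv.dev.

```python
from enum import Enum


def event_type_enum(message_cls, name="EventType"):
    """Build an Enum whose members mirror the fields of a protobuf message.

    `message_cls` is expected to be a generated protobuf class; only its
    `DESCRIPTOR.fields_by_name` mapping (field name -> FieldDescriptor) is used.
    """
    descriptors = message_cls.DESCRIPTOR.fields_by_name.values()
    # Sort by field number so member order follows the .proto definition.
    ordered = sorted(descriptors, key=lambda field: field.number)
    return Enum(name, {field.name.upper(): field.name for field in ordered})


# Hypothetical usage inside the osv package (not executed here):
#   from osv import vulnerability_pb2
#   EventType = event_type_enum(vulnerability_pb2.Event)
#   EventType.LAST_AFFECTED.value == "last_affected"
```

Deriving the members this way means a field added to `vulnerability.proto` would show up in the test helper without a second definition to keep in sync.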
osv/test_tools/test_repository.py
class TestRepository:
  """ Utilitary class to create a test repository for the git tests
Fix spelling mistake
- """ Utilitary class to create a test repository for the git tests
+ """ Utility class to create a test repository for the git tests
osv/test_tools/test_repository.py
shutil.rmtree(f"osv/testdata/test_repositories/{name}")
self.repo = pygit2.init_repository(
    f"osv/testdata/test_repositories/{name}", bare=False)
#empty initial commit usefull for the creation of the repository
Fix spelling mistake
- #empty initial commit usefull for the creation of the repository
+ #empty initial commit useful for the creation of the repository
osv/test_tools/test_repository.py
@@ -0,0 +1,158 @@
"""test_repository""" |
Please expand this with more details, see https://google.github.io/styleguide/pyguide.html#381-docstrings
I note that this introduces breakage, so that will need to be resolved (perhaps the point of this PR was to drive that discussion) before this can be merged successfully...
## Refactoring - `test_repository.py`
* Make the repo generation cleaner; remove unnecessary branches that were generated
* Use protobuf type for events

## Refactoring - `impact_git_test.py`
* Move to a template architecture
* Remove inconsistent tests
* Support cherry-picking test

Note: `impact.py` has been modified to handle repositories that do not support caching. We are still trying to find a solution to enable the cache for synthetic repositories created with pygit2.
Thank you very much for the review, and I apologize for the delay in responding. Since the initial PR commit, we have done some refactoring, mainly to make the code easier to understand, implement your proposed changes, and support setting the cherry-pick parameter to true when using

The aim of this PR is indeed to start the discussion about possible discrepancies between the osv.dev implementation and the "Open Source Vulnerability format" specification. For a research project, we implemented a test suite for the vulnerable-commit identification algorithm to test our own implementation, and we thought that osv.dev would benefit from the same tests. We identified discrepancies while using OSV.dev for a research project and created the associated issues:
These test cases were implemented based on our interpretation of the specification. The specification seems to be ambiguous in certain situations, such as merge propagation and multiple ranges. We hope that the tests we implemented might help to make the specification less ambiguous.
The objective of this PR was indeed to drive the discussion about how to test the detection of "affected git commits" and to avoid ambiguity in the specification. Please do not hesitate to share any feedback.
This pull request has not had any activity for 60 days and will be automatically closed in two weeks.
We want to get around to spending some time on this; it's just been a case of insufficient time and higher priorities :-(
I understand that the Git vulnerability algorithm is not a top priority, as most OSV records rely on Semantic Versioning or an ecosystem-specific version format. However, I'm convinced that pushing for broader adoption of GIT ranges would enable many analyses that are not possible with Semantic Versioning ranges, and would also increase the quality of vulnerable commit/release/version identification. For a research project, we developed an algorithm that labels the Software Heritage Graph, using OSV records that have a Git range as input.
As part of research we are conducting on security in the Open Source ecosystem, we investigated how vulnerabilities propagate within repositories. While reviewing the output of the vulnerability propagation analysis, we noticed some unexpected behaviors, as described in @RomainLefeuvre's issues.
We documented the expected behavior for the test repositories; you can find it below.
We followed our interpretation of the OSV schema documentation, and we would like to know whether this documentation matches it and, more precisely, whether the test cases make sense to you.
This PR contains the implementation of a new test class, impact_git_test, and a tool to facilitate the programmatic creation of git repositories. The repositories created are dummy repositories containing empty commits.
The implemented tests exercise the repoanalyzer class without using cherry-pick detection.
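To make the kind of case these tests cover more concrete, here is a toy sketch of one possible reading of an introduced/fixed pair on a commit graph with a merge. It is an illustrative assumption only: it is not the osv.dev implementation, not the wording of the specification, and not code from this PR. The graph, commit names, and function names are all hypothetical.

```python
def ancestors(parents, commit):
    """Return `commit` plus every commit reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def affected_commits(parents, introduced, fixed=None):
    """Assumed rule: a commit is affected if its history contains the
    `introduced` commit and does not contain the `fixed` commit."""
    result = set()
    for commit in parents:
        history = ancestors(parents, commit)
        if introduced in history and (fixed is None or fixed not in history):
            result.add(commit)
    return result


# Toy history: A -> B -> C -> D on the main line, E branches off C,
# and M merges D and E. Each key maps a commit to its parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "E": ["C"],
    "M": ["D", "E"],
}

print(sorted(affected_commits(PARENTS, introduced="B", fixed="D")))
```

Under this assumed rule, with `B` introduced and `D` fixed, the side branch commit `E` is reported as affected while the merge commit `M` is not, because `M` has the fix in its history. Whether that is the intended outcome for merges is exactly the kind of question the test cases are meant to settle.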