Integrate with google/oss-fuzz for continuous fuzz-testing? #1030
Ah, it looks like there is already some fuzzing, or something similar, done through littlefs-fuse? (See littlefs/.github/workflows/test.yml, line 570 at d01280e.)
This looks a little different to the kind of fuzzing that I'm accustomed to.
Hi @silvergasp, thanks for creating an issue and volunteering on this. I think it would be quite neat to see something like this integrated eventually. Unfortunately the timing is a bit rough. There's a large body of unreleased work in progress, and until that lands I'm not sure it makes sense prioritization-wise to aggressively search out bugs in the current codebase. That is, unless you're also volunteering to fix them :) That being said, I think this is a good idea eventually, that eventually just not being now, unfortunately...
Though there are still interesting things to discuss related to fuzz testing.
This seems very useful, eventually, when time can be allocated to bug fixing.
So this is quite an interesting topic. Fuzz tests on PRs seem like a good idea initially, but are actually a nightmare for contributors in practice. Quite a bit of effort actually went into making sure the current test runner/tests are strictly deterministic, so any CI failure is a result of, and only a result of, changes in the given PR. This doesn't rule out randomness testing (I've been using quite a bit of it in the above-mentioned body of work), but any randomness testing on PRs needs to be derived from a reproducible seed that was also tested on master.
Ah, not really. Describing that as "fuzz-like" is probably inaccurate; it's not fuzz testing. It's just a very high-level test (compiling/simulating littlefs on littlefs via fuse) that is high-level enough you no longer know exactly what is being tested, which is the only sense in which it resembles fuzz testing. Still, it did find a number of unknown bugs during initial development.
What are your thoughts on me re-targeting this fuzzing effort onto the unreleased work? I imagine it'd be useful in stabilising that work. I've personally found fuzzing pretty useful in day-to-day development, as it will often quickly find edge cases I hadn't thought about. I've often found security vulnerabilities, e.g. buffer overflows, in my own work before I released it. I'm happy to contribute fixes, although I only have a surface-level familiarity with the littlefs codebase.
Yeah, that's a reasonable stance; the PR-based CI integration is entirely optional and not a prerequisite for integration into oss-fuzz.
It is possible to configure libfuzzer to be deterministic via the `-seed` command line option. It's worth noting that I would be using ClusterFuzzLite for PR-based fuzzing, which does upload the artifacts to GitHub, i.e. the fuzzing input that triggered a bug. So it is easy enough to reproduce a bug even if the CI isn't deterministic. E.g. it would look something like this:

```sh
# Download the fuzz input from the GitHub Actions artifacts, then run the
# fuzz harness with that single input. No actual fuzzing is being
# performed here, just a replay.
./fuzz_mount downloaded-crash-input
```

After a bit of a look into your existing random/fuzz testing, it looks like you are doing something like open-loop fuzzing. It looks like it would require fairly minimal modifications to port your fuzz tests over to libfuzzer, which would significantly improve the speed and performance of fuzzing. Libfuzzer starts off passing a bunch of random data into your fuzz harness, each time measuring the code coverage. Then it picks the input that produced the most code coverage and randomly mutates that input to create new inputs. This process is repeated in such a way that the "goal" of the fuzzer becomes to maximise code coverage in a closed-loop system.

Black-box fuzzing (existing):

```mermaid
flowchart LR
    A[Random Input] --> B[Fuzz Harness]
    B --> A
```

With black-box fuzzing the probability of finding a bug does not change over time.

Grey-box fuzzing (libfuzzer):

```mermaid
flowchart LR
    A[Random Input] --> B[Fuzz Harness]
    B --> C[Collect code coverage information]
    C --> D[Mutate original input to maximize coverage]
    D --> B
```

The probability of finding a bug in a closed-loop system like this increases over time.
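For concreteness, here is a minimal sketch of what such a libfuzzer harness could look like. This is an illustration under my own assumptions, not littlefs's actual harness: the file name, block geometry, and the idea of treating the fuzz input as a raw disk image are all mine; only the `lfs_mount`/`lfs_unmount` API and the `LLVMFuzzerTestOneInput` entry point are real.

```c
// fuzz_mount.c -- sketch of a libFuzzer harness for littlefs.
// The fuzz input is treated as a raw on-disk image, so lfs_mount() has to
// parse whatever (possibly corrupt) metadata the fuzzer generates.
// The block geometry below is an illustrative assumption.
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include "lfs.h"

#define BLOCK_SIZE  512
#define BLOCK_COUNT 32
#define DISK_SIZE   (BLOCK_SIZE * BLOCK_COUNT)

static uint8_t disk[DISK_SIZE];

static int bd_read(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, void *buffer, lfs_size_t size) {
    (void)c;
    memcpy(buffer, &disk[block*BLOCK_SIZE + off], size);
    return 0;
}

static int bd_prog(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, const void *buffer, lfs_size_t size) {
    (void)c;
    memcpy(&disk[block*BLOCK_SIZE + off], buffer, size);
    return 0;
}

static int bd_erase(const struct lfs_config *c, lfs_block_t block) {
    (void)c;
    memset(&disk[block*BLOCK_SIZE], 0xff, BLOCK_SIZE);
    return 0;
}

static int bd_sync(const struct lfs_config *c) {
    (void)c;
    return 0;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Seed the simulated disk with the fuzzer-generated image.
    memset(disk, 0xff, DISK_SIZE);
    memcpy(disk, data, size < DISK_SIZE ? size : DISK_SIZE);

    struct lfs_config cfg = {
        .read = bd_read, .prog = bd_prog,
        .erase = bd_erase, .sync = bd_sync,
        .read_size = 16, .prog_size = 16,
        .block_size = BLOCK_SIZE, .block_count = BLOCK_COUNT,
        .block_cycles = 500,
        .cache_size = 16, .lookahead_size = 16,
    };

    // Mounting a corrupt image should fail cleanly, never crash.
    lfs_t lfs;
    if (lfs_mount(&lfs, &cfg) == 0) {
        lfs_unmount(&lfs);
    }
    return 0;
}
```

The coverage feedback then steers the mutated disk images toward the deeper metadata-parsing paths, which is exactly where the closed-loop approach pays off.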
Oh, that's not a bad idea. I guess the only problem is that this work is still in flux and likely to change. I don't think I will have the bandwidth to respond to bug reports until this work enters the stabilization phase. Though at that point fuzz testing work like this would be very useful.
It's not so much the reproducibility, though that is neat, as it is trying to prevent new contributors/PRs from finding bugs they are not responsible for and are unlikely to know how to deal with.
Ah yes, let me be the first to admit that the fuzz testing in the current test framework is fairly naive. But that doesn't prevent having additional testing to run afterwards. It would be interesting if clusterfuzz/oss-fuzz could generate a test case in littlefs's test toml format, so we could easily compile previously found bugs into a suite of small regression-preventing tests.
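As a rough illustration of the idea (the case name, the body, and the exact schema are hypothetical; littlefs's actual toml test format may differ), a generated regression case might look something like:

```toml
# Hypothetical regression case generated from a fuzzer-found crash input.
# A real generator would also need to seed the simulated disk with the
# offending image before mounting.
[cases.test_fuzz_regression_0001]
code = '''
    // mounting the corrupt image found by the fuzzer should fail
    // cleanly rather than crash
    lfs_mount(&lfs, cfg) => LFS_ERR_CORRUPT;
'''
```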
No worries, it seems like now isn't necessarily great timing. But if down the line you are nearing stabilisation and you want to go "hey, please fuzz-test this part of the API", feel free to ping me and I'll have a crack at it.
Yeah, that seems reasonable. Perhaps this might be more useful once oss-fuzz hasn't found a new bug in a month or two. Then the probability of the PR fuzzer finding something that a contributor isn't responsible for would be fairly low, especially given that oss-fuzz runs every night for a few hours on a distributed cluster, whereas the CI fuzzer would be running on a single core for 5-10min in github actions.
I think the fuzz testing harnesses themselves seem to be quite extensive and pretty far from naive. But perhaps the method of driving the harnesses could be improved by switching to a coverage-driven fuzzing framework like libfuzzer. It's worth noting that libfuzzer is shipped as a component of llvm/clang. So while it is true that it does add a dependency, it's largely covered by just switching from gcc to clang.
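To give a sense of how small that switch is, building and running a harness is roughly a one-liner. The file names below are assumptions; the `-fsanitize=fuzzer,address` flags and the `-seed`/`-max_total_time` options are real clang/libfuzzer flags:

```sh
# Build the littlefs sources plus a harness with libFuzzer and ASan
# instrumentation (requires clang; gcc has no libFuzzer runtime).
clang -fsanitize=fuzzer,address -g -O1 -I. \
    lfs.c lfs_util.c fuzz_mount.c -o fuzz_mount

# Fuzz for 10 minutes with a fixed seed for reproducibility.
./fuzz_mount -seed=42 -max_total_time=600 corpus/
```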
Hmm, I think it would be possible, although I'd have to do a bit more of a deep dive into your existing testing infrastructure to be sure. There is some work going into structure-aware fuzzing, but that does require additional dependencies, on protobuf for example. It might look something like toml->protobuf->fuzzing_seed_corpus. So it seems possible, but maybe a little complicated for a first pass. Another option might be to use google/fuzztest for the harnesses, which will generate googletest regression harnesses when it finds a failure. google/fuzztest also has compatibility layers for libfuzzer and integrates nicely into oss-fuzz. But it does add a couple of third-party dependencies and additional build complexity.
Hey, I'd like to suggest adding littlefs to google/oss-fuzz. If you aren't familiar with fuzz testing, here is a bit of a rundown (from Wikipedia):

> Fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks.
Google offers a free continuous fuzzing service called OSS-Fuzz. If littlefs is integrated into oss-fuzz, the fuzz tests under littlefs will be built and then run once a day to search for bugs and vulnerabilities in littlefs. This service can also be integrated with the CI for littlefs, so that the fuzz tests are run for 10min or so on every pull request, preventing buggy code from being merged.
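For a sense of scale, an oss-fuzz integration is mostly a small build script checked into the oss-fuzz repo. The sketch below assumes a harness file named fuzz_mount.c; the `$CC`, `$CXX`, `$CFLAGS`, `$LIB_FUZZING_ENGINE`, and `$OUT` variables are the standard ones provided by the oss-fuzz build environment:

```sh
# projects/littlefs/build.sh -- rough sketch of an oss-fuzz build script.
# fuzz_mount.c is an assumed harness name; the rest follows oss-fuzz's
# standard build conventions.
$CC $CFLAGS -I. -c lfs.c -o lfs.o
$CC $CFLAGS -I. -c lfs_util.c -o lfs_util.o
$CC $CFLAGS -I. -c fuzz_mount.c -o fuzz_mount.o
$CXX $CXXFLAGS $LIB_FUZZING_ENGINE lfs.o lfs_util.o fuzz_mount.o \
    -o $OUT/fuzz_mount
```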
I've opened up a pull request adding a basic fuzz-testing harness here: #1029. If you are keen on adding littlefs to oss-fuzz, I'd be happy to champion the integration :)