This repository has been archived by the owner on Oct 11, 2023. It is now read-only.

Commit

Minimal fixes in DESIGN.md
Some typos, broken local links, MD formatting.

PiperOrigin-RevId: 520211289
ussuri authored and copybara-github committed Mar 29, 2023
1 parent 4738ad5 commit 7a20b4e
Showing 1 changed file with 34 additions and 32 deletions.
66 changes: 34 additions & 32 deletions doc/DESIGN.md
@@ -1,6 +1,6 @@
<!--*
# Document freshness: For more information, see go/fresh-source.
freshness: { owner: 'sergeygs' reviewed: '2022-09-29' }
freshness: { owner: 'sergeygs' reviewed: '2023-03-28' }
*-->

[TOC]
@@ -14,20 +14,21 @@ See [README](../README.md) for user documentation and most of the terminology.

## Execution Features

Centipede reasons about execution feedback in terms of *features*. A feature is
some unique behavior of the target on a given input. So, executing an input is a
way to compute the input's features.
Centipede reasons about the execution feedback in terms of *features*. A feature
is some unique behavior of the target on a given input. So, executing an input
is a way to compute the input's features.

The currently supported features (see [feature.h](feature.h) for details) are:
The currently supported features (see [feature.h](../feature.h) for details)
are:

* Control flow edges with
[8-bit counters](https://clang.llvm.org/docs/SanitizerCoverage.html#inline-8bit-counters)
.
* Simplified data flow edges: either {store-PC, load-PC} or {global-address,
load-PC}.
* Bounded control flow paths.
* Instrumented CMP instructions.
* ... more coming.
* Control flow edges with
[8-bit counters](https://clang.llvm.org/docs/SanitizerCoverage.html#inline-8bit-counters)
.
* Simplified data flow edges: either {store-PC, load-PC} or {global-address,
load-PC}.
* Bounded control flow paths.
* Instrumented CMP instructions.
* ... more coming.

However, the target may generate features of its own type, without having
Centipede to support it explicitly.
@@ -46,13 +47,13 @@ So, we try hard to eliminate all redundant executions.

Centipede state consists of the following:

* Corpus. A set of inputs. The corpus is a property of a group of fuzz targets
sharing the same input data format.
* Feature sets. For every corpus element we preserve the set of its features.
Features are a property of a specific target binary. Different binary (e.g.
from a different revision, or different build options, or from different
code) will have its own persistent feature set. A feature set is associated
with an input via the input's hash.
* Corpus. A set of inputs. The corpus is a property of a group of fuzz targets
sharing the same input data format.
* Feature sets. For every corpus element we preserve the set of its features.
Features are a property of a specific target binary. Different binary (e.g.
from a different revision, or different build options, or from different
code) will have its own persistent feature set. A feature set is associated
with an input via the input's hash.

On startup, Centipede loads the corpus, and checks which corpus elements have
their corresponding feature sets. Only when the feature set is not present for
@@ -62,16 +63,16 @@ an input in the corpus, Centipede will recompute it.

Centipede jobs run concurrently (in separate processes, potentially on different
machines). They peek at each other's corpus (and feature sets) periodically.
Every jobs writes only to its own persistent state, but can read any other job's
Every job writes only to its own persistent state, but can read any other job's
state concurrently with that job writing to it.

Centipede implements this via appendable storage format.

## Storage format

Very simple and inefficient homebrewed appendable data format is currently used,
see `PackBytesForAppendFile()` in [util.h](util.h). We may need to replace it
with something more efficient when this one stops scaling.
Very simple and inefficient home-brewed appendable data format is currently
used, see `PackBytesForAppendFile()` in [util.h](../util.h). We may need to
replace it with something more efficient when this one stops scaling.
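
As a rough illustration of why an appendable format suits jobs that read each other's state while it is being written, here is a minimal sketch. It is not the actual `PackBytesForAppendFile()` layout: the names `PackRecord` and `UnpackRecords`, the `ByteArray` alias, and the 8-byte length prefix are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical alias for this sketch.
using ByteArray = std::vector<uint8_t>;

// Hypothetical helper: frames `data` as [8-byte length][payload] so that it
// can be appended to a file. The length is stored in native byte order,
// which is enough for a single-machine illustration.
ByteArray PackRecord(const ByteArray &data) {
  const uint64_t size = data.size();
  ByteArray out(sizeof(size) + data.size());
  std::memcpy(out.data(), &size, sizeof(size));
  if (!data.empty()) {
    std::memcpy(out.data() + sizeof(size), data.data(), data.size());
  }
  return out;
}

// Hypothetical helper: returns every complete record found in `file`.
// Parsing stops at the first incomplete record, which is what a reader sees
// when the writer is in the middle of an append.
std::vector<ByteArray> UnpackRecords(const ByteArray &file) {
  std::vector<ByteArray> records;
  size_t pos = 0;
  while (file.size() - pos >= sizeof(uint64_t)) {
    uint64_t size = 0;
    std::memcpy(&size, file.data() + pos, sizeof(size));
    const size_t payload_pos = pos + sizeof(size);
    if (size > file.size() - payload_pos) break;  // Partial trailing record.
    const auto first = file.begin() + static_cast<std::ptrdiff_t>(payload_pos);
    const auto last = first + static_cast<std::ptrdiff_t>(size);
    records.emplace_back(first, last);
    pos = payload_pos + static_cast<size_t>(size);
  }
  return records;
}
```

Because a reader in this sketch simply ignores an incomplete tail, it can re-read a file that another job is still appending to and pick up the remaining records on a later pass.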

## Out-of-process target execution

@@ -81,7 +82,7 @@ in batches as well.

The specific mechanism of execution and passing the data between processes can
be overridden. The default implementation is `CentipedeCallbacks::Execute()` in
[centipede_interface.h](centipede_interface.h).
[centipede_interface.h](../centipede_interface.h).

It is possible to override the execution to do it in-process, but this way
Centipede will lose the ability to set RAM and time limit, and it will not
@@ -92,7 +93,8 @@ tolerate crashes in the target.
Centipede is decoupled from the mechanism that collects the execution feedback.
Any source of feedback can be used: compiler instrumentation, run-time
instrumentation, simulation, hardware-based tracing, etc. The default
implementation in [runner_main.cc](runner_main.cc) and other `runner_*.cc` files
implementation in [runner_main.cc](../runner_main.cc) and other `runner_*.cc`
files
relies on
[SanitizerCoverage](https://clang.llvm.org/docs/SanitizerCoverage.html)

@@ -110,12 +112,12 @@ mutate, and which to discard.

Centipede tries to mutate only those corpus elements that have *rare* features.

TODO(kcc): [design] explain how this works.
TODO(kcc): explain how this works.

## Related reading

* [Entropic] Boosting fuzzer efficiency: an information theoretic perspective.
https://dl.acm.org/doi/abs/10.1145/3368089.3409748
* [Nezha]: Efficient Domain-Independent Differential Testing
https://www.cs.columbia.edu/~suman/docs/nezha.pdf
* centipede doesn't do most of that, but aspires to do more than that :)
* [Entropic] Boosting fuzzer efficiency: an information theoretic perspective.
https://dl.acm.org/doi/abs/10.1145/3368089.3409748
* [Nezha]: Efficient Domain-Independent Differential Testing
https://www.cs.columbia.edu/~suman/docs/nezha.pdf
* Centipede doesn't do most of that, but aspires to do more than that :)
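
The persistent-state description in this document (a feature set is associated with an input via the input's hash, and is recomputed on startup only when it is missing) can be sketched as follows. This is an illustration only, not Centipede's code: `FeatureStore`, `HashOf`, and `LoadAndFillMissing` are hypothetical names, and `std::hash` stands in for whatever hash Centipede actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical types for this sketch.
using Input = std::string;
using FeatureSet = std::vector<uint64_t>;
// Maps an input's hash to the features observed for that input.
using FeatureStore = std::unordered_map<std::string, FeatureSet>;

// Stand-in hash for illustration; not the hash Centipede uses.
std::string HashOf(const Input &input) {
  return std::to_string(std::hash<std::string>{}(input));
}

// Executes only those corpus elements whose feature set is absent from
// `store`, and records the result. Returns the number of executions, i.e.
// the work that could not be skipped.
size_t LoadAndFillMissing(
    const std::vector<Input> &corpus, FeatureStore &store,
    const std::function<FeatureSet(const Input &)> &execute) {
  size_t executions = 0;
  for (const Input &input : corpus) {
    const std::string key = HashOf(input);
    if (store.count(key) != 0) continue;  // Features already known.
    store[key] = execute(input);
    ++executions;
  }
  return executions;
}
```

Keying by hash rather than by position in the corpus means that, in this sketch, the same store can be reused when the corpus is reordered or merged with another job's corpus.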
