list rfc v0

carolynzech · Aug 27, 2024 · 6c35d23 · 6c35d23
1 parent 01e5ad6
commit 6c35d23
Show file tree

Hide file tree

Showing 2 changed files with 98 additions and 52 deletions.
diff --git a/rfc/src/SUMMARY.md b/rfc/src/SUMMARY.md
@@ -16,3 +16,4 @@
 - [0008-line-coverage](rfcs/0008-line-coverage.md)
 - [0009-function-contracts](rfcs/0009-function-contracts.md)
 - [0010-quantifiers](rfcs/0010-quantifiers.md)
+- [0011-list](rfcs/0011-list.md)
diff --git a/rfc/src/rfcs/0011-list.md b/rfc/src/rfcs/0011-list.md
@@ -1,5 +1,5 @@
-- **Feature Name:** List (`list`)
-- **Feature Request Issue:** *Link to issue*
+- **Feature Name:** List
+- **Feature Request Issue:** [#2573](https://github.com/model-checking/kani/issues/2573)
 - **RFC PR:** *Link to original PR*
 - **Status:** Under Review
 - **Version:** 0
@@ -8,89 +8,128 @@
 
 ## Summary
 
-Add a subcommand `kani list` that lists metadata about Kani's application to a project.
+Add a subcommand `kani list` that lists the locations of Kani attributes in a project.
 
 ## User Impact
 
-Currently, there is no automated way for a user to gather metadata about Kani's integration with their project. If, for example, a user wants a list of verified function contracts, they must:
-    - Search for the relevant contract attributes (currently `#[requires]`, `#[ensures]`, or `#[modifies]`),
-    - Compile a list of functions under contract based on the results of the above, then
-    - Search for all harnesses marked (`#[kani::proof_for_contract]`).
+Currently, there is no automated way for a user to gather metadata about Kani's integration with their project. If, for example, a user wants a list of contracts for their project, they must search for all the relevant contract attributes (currently `#[requires]`, `#[ensures]`, or `#[modifies]`) themselves. If done manually, this process is tedious, especially for large projects. Even with a short shell script, it is error-prone--if, for example, we introduce a new type of contract attribute, users would have to account for it when searching their project.
 
-If done manually, this process is tedious, especially for large projects. It is also error-prone--if, for example, we introduce a new type of contract attribute, users would have to account for it when searching their project.
-
-Internally, this feature will be useful for tracking our customers' use of Kani and our progress with standard library verification.
-
-Externally, users can leverage this feature to get a high-level view of which areas of their projects have Kani attributes (and, by extension, which areas are still in need of verification). This feature can act as a complement to our existing line-by-line coverage feature--one can imagine, for example, using this feature to *informally* identify which areas seem to have few to no harnesses, then using the coverage feature to *formally* identify which areas remain unverified.
+Internally, this feature will be useful for tracking our customers' use of Kani and our progress with standard library verification. Externally, users can leverage this feature to get a high-level view of which areas of their projects have Kani attributes (and, by extension, which areas are still in need of verification). This feature can act as a complement to our existing line-by-line coverage feature--one can imagine, for example, using this feature to *informally* identify which areas seem to have few to no harnesses, then using the coverage feature to *formally* identify which areas remain unverified.
 
 This feature will not cause any regressions for exisiting users.
 
 ## User Experience
 
-Users run a `kani list` subcommand, which prints to the terminal metadata about the harnesses and contracts, and stubs in each reachable module.
+Users run a `kani list` subcommand, which prints metadata about the harnesses, contracts, and stubs in each module.
 For `cargo kani`, this will be every module in the package; for `kani`, it is every module in the crate.
 This subcommand will not run verification; it will exit after printing the information.
 
-Kani will print the package's modules as a tree. The leaves of the tree are functions with Kani attributes. 
-After printing the tree, Kani will print the total count of certain attributes.
+There will be two options:
+- `--filter=[harnesses|contracts]`: Only output the information specified. In the example above, `--filter harnesses` would output the `mod verify` entry and `--filter contracts` would output the `mod implementation` entry. In the case where harnesses and contracts are in the same module, Kani will output only the functions that are harnesses or have contracts, depending on the filter.
+- `--message-format=[human|json]`: Specify the output format. The default is `human`, which prints to the terminal. The `json` option creates and writes to a JSON file instead.
+
+This subcommand will not fail. In the case that it does not find any Kani attributes, it will print a message informing the user of that fact.
+
+### Output Format
+
+The default format, `human`, will print the package's modules as a tree. The leaves of the tree are functions with Kani attributes. After printing the tree, Kani will print the total count of harnesses, stubs, and contracts.
 For example:
 
 ```
 crate example
-├── mod implementation:
+├── mod implementation
 │   └── fn foo 
-|       └── requires-contracts: #[requires::ub_checks::can_dereference(x)], #[requires::ub_checks::can_write(x)]
-|       └── ensures-contracts: #[ensures(|result| *result == *x + 1)]
-|       └── modifies-contracts: #[modifies(x)]
-|       └── proofs-for-contracts: verify_foo::check_foo_u32, verify_foo::check_foo_u64
-├── mod verify:
-|   └── fn check_foo_u32: HARNESS
-|       └── kani-attributes: #[kani::proof_for_contract(foo)]
-|   └── fn check_foo_u64: HARNESS
-|       └── kani-attributes: #[kani::proof_for_contract(foo)]
-│   └── fn check_func: HARNESS
-|       └── kani-attributes: #[kani::proof], #[kani::should_panic], #[kani::stub(rand::random, mock_random)]
-|   └── fn check_bar: HARNESS (BOLERO)
-|       └── kani-attributes: #[kani::proof], #[kani::unwind(5)]
+|       - requires-contracts:
+|          - ub_checks::can_dereference(x)
+|          - ub_checks::can_write(x)
+|       - ensures-contracts: 
+|           - |result| *result == *x + 1
+|       - modifies-contracts: 
+|           - x
+|       - proofs-for-contracts: 
+|           - verify::check_foo_u32, 
+|           - verify::check_foo_u64
+├── mod verify
+|   └── fn check_foo_u32
+|       - kani::proof_for_contract(foo)
+|   └── fn check_foo_u64
+|       - kani::proof_for_contract(foo)
+│   └── fn check_func
+|       - kani::proof
+|       - kani::should_panic
+|       - kani::stub(rand::random, mock_random)
+|   └── fn check_bar
+|       - kani::proof
+|       - kani::unwind(5)
 
 Totals:
-    - Harnesses (including Bolero): 4
-    - Bolero Harnesses: 1
+    - Harnesses: 4
     - Requires Contracts: 2
     - Ensures Contracts: 1
     - Modifies Contracts: 1
     - Stubs: 1
 ```
 
-There will be two options:
-- `--filter=[harnesses|contracts]`: Only output the information specified. In the example above, `--filter harnesses` would output the `mod verify` entry and `--filter contracts` would output the `mod implementation` entry. In the case where harnesses and contracts are in the same module, Kani will output only the functions that are harnesses or have contracts, depending on the filter.
-- `--message-format=[human|json]`: Specify the output format. The default is `human`, which prints to the terminal. The `json` option creates and writes to a JSON file instead.
+As the name implies, the goal of the `human` output is to be friendly for human readers. If the user wants an output format that's more easily parsed by a script, they can opt for the `json` option, which will output the same information in a more verbose, machine-readable format. By "more verbose" we mean that it will parse the attributes, so that instead of outputting `kani::stub(rand::random, mock_random)`, it would output:
 
-This subcommand will not fail. In the case that it does not find any Kani attributes, it will print a message informing the user of that fact.
+```json
+stub : {
+    original: rand::random,
+    target: mock_random
+}
+```
+and so on for other attributes.
 
 ## Software Design
 
-This is the beginning of the technical portion of the RFC.
-From now on, your main audience is Kani developers, so it's OK to assume readers know Kani architecture.
+We will add a new subcommand to `kani-driver` which follows the normal verification workflow of invoking `kani-compiler` and constructing a `Project` from the result. Then, instead of verifying the project, Kani will iterate over the `KaniMetadata` in the `Project` to construct the output.
 
-Please provide a high level description your design.
+### Contracts
 
-We will add a new subcommand to `kani-driver`, which, when invoked, follows the normal verification workflow of invoking `kani-compiler` and constructing a `Project` from the result. Then, instead of verifying the project, Kani will iterate over the `KaniMetadata` in the `Project` to construct the output.
+The `KaniMetadata` struct has all of the information that `list` requires, except for contracts information--it has the CBMC representation of the contracts, while this feature requires the unexpanded, raw text.
 
-For each function with contracts, I would like to store the fully qualified path and the contract text (e.g `core::num::add -> [requires(x == 1), ensures(result == 2)]`). I want the fully qualified path to uniquely identify the function--if I just stored `add`, then I wouldn't be able to tell multiple `add` functions apart. However, macro expansion happens before name resolution, so AFAIK by the time the fully qualified function name is available, the contract has already been expanded and the text the user wrote is gone. Does anyone have ideas for how to achieve what I want?
+To store the contracts information, we will change the contracts code generation library (in `kani_macros::sysroot::contracts`). Contracts are defined as attribute macros, e.g.:
+```rust
+#[proc_macro_attribute]
+pub fn requires(attr: TokenStream, item: TokenStream) -> TokenStream {
+    attr_impl::requires(attr, item)
+}
+```
 
-### Constructing the Tree
+The `item` `TokenStream` gives us the function declaration to which the attribute is applied, e.g. `fn foo() {}`, but that is insufficient to uniquely identify the function or construct the module tree. Instead, we will change the macro expansion handler (`ContractConditionsHandler`) to insert the contracts text as a separate attribute, e.g.:
 
-just iterate through the pretty names
+```rust
+// these are wrappers we already generate for contracts
+#[kanitool::recursion_check = #recursion_name]
+#[kanitool::checked_with = #check_name]
+...
+#[kanitool::contract_text(requires(x == 1))] // this is new
+```
 
+Then, in `kani-compiler`, we add a new branch of the code generation process for the `list` subcommand. When `codegen_crate` encounters this reachability type, it will:
+- Construct a `CodegenUnits` object.
+- Call a new `CodegenUnits` method that filters local crate instances to those with the `contract_text` attribute, then stores the following in a new `contracted_functions` field:
+    - the type of contract (`Requires`, `Ensures`, or `Modifies`), 
+    - the text of the contract, and 
+    - the fully qualified path to it.
+- Call `write_metadata` on the `CodegenUnits` object, which will construct a `KaniMetadata` object with a new field containing the contracted function information, then write it to a file.
 
+### Constructing the Tree
 
-- What are the main components that will be modified? (E.g.: changes to `kani-compiler`, `kani-driver`, metadata, proc-macros, installation...)
-- Will there be changes to the components interface?
-- Any changes to how these components communicate?
-- What corner cases do you anticipate?
+Once we've made the above updates to `KaniMetadata`, that struct will have all of the information we need for `kani list`. 
 
+To print the tree, Kani will read the metadata file and construct an intermediate data structure that maps each module to a list of its metadata. (This list will be sorted so that the output is stable across runs). Then, we'll iterate through that structure to print the tree. If the user passes a `--filter` option, Kani will only construct the map for the relevant metadata--e.g., if the user passes `--filter harnesses`, we will only map each module to its harnesses and ignore the collected contracts.
 
+Kani will parse the module paths to print the tree, inserting a new level for each delimiter `::`. For example, `example::implementation::verify` would parse to:
+
+```
+crate example
+├── mod implementation
+│   └── mod verify
+```
+
+### Calculating Totals
+Since we already have the information about each harness, contract, etc., calculating totals is as simple as taking the length of those data structures.
 
 **We recommend you to leave this empty for the first version of your RFC**.
 
@@ -113,13 +152,19 @@ Example:
 
 Make sure all open questions are addressed before stabilization.
 
+TODO bolero
+
 ## Out of scope / Future Improvements
 
-*Optional Section*: List of extensions and possible improvements that you predict for this feature that is out of
-the scope of this RFC.
+It would be nice to differentiate between regular Kani harnesses and Bolero harnesses. Bolero harnesses invoke Kani using conditional compilation, e.g.:
+
+```rust
+#[cfg_attr(kani, kani::proof)]
+fn check() {
+    bolero::check!()...
+}
+```
 
-Feel free to add as many items as you want, but please refrain from adding too much detail.
-If you want to capture your thoughts or start a discussion, please create a feature request.
-You are welcome to add a link to the new issue here.
+See [this blog post](https://model-checking.github.io/kani-verifier-blog/2022/10/27/using-kani-with-the-bolero-property-testing-framework.html) for more information.
 
-[^unstable_feature]: This unique ident should be used to enable features proposed in the RFC using `-Z <ident>` until the feature has been stabilized.
+There's no easy way for us to know whether a harness comes from Bolero, since Bolero takes care of rewriting the test to use Kani syntax and invoking the Kani engine. By the time the harness gets to Kani, there's no way for us to tell it apart from a regular harness. Fixing this would require some changes to our Bolero integration.