docs: Add guidelines for e2e tests #1466
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,182 @@ | ||
## Updating the trace format and tests when adjusting the framework | ||
# End-to-end testing | ||
|
||
This folder contains end-to-end tests and related infrastructure. | ||
The goal of end-to-end tests is to test our application completely. | ||
End-to-end in this context means tests | ||
where the whole app (including command line interfaces, interfaces to external components like | ||
relayers, CometBFT, etc) is in scope for testing. | ||
|
||
End-to-end tests should typically be used only to perform basic sanity checks that the provided | ||
APIs and protocols work as expected. | ||
|
||
For more detailed tests, like exhaustively testing edge cases, | ||
utilize in-memory tests. | ||
The reasoning behind this is that end-to-end tests | ||
make it harder to locate errors, are more brittle, and might need extensive changes | ||
when unrelated components change. | ||
End-to-end tests can still be useful, and we need them, | ||
but when possible, we should prefer more local tests. | ||
|
||
At a high level, every test case consists of the following steps. | ||
* The test starts a docker container, see [the startup script](testnet-scripts/start-docker.sh) | ||
* We run a defined sequence of actions and expected states, see as an example the steps for [testing the democracy module](steps_democracy.go) | ||
* Actions are any event that might meaningfully modify the system state, such as submitting transactions to a node, making nodes double-sign, starting a relayer, starting a new chain, etc. | ||
* Expected states define the state we expect after the action was taken. | ||
We might specify what we expect the balances of validators to be, | ||
which chains are running, etc. | ||
We don't have to specify a complete state after each step, but instead we only | ||
specify the parts of the state we care about. | ||
For example, after an action that sends tokens from one validator to another, | ||
we would usually just specify the expected, new validator balances, | ||
but not check what chains are currently running. | ||
* After each action, the state of the system is queried and compared against the expected state. | ||
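
As a rough illustration of the loop described above, here is a minimal, self-contained sketch. It is not the framework's code: `System`, `sendTokens`, and `runSteps` are hypothetical names, and the "system" is just an in-memory map instead of a docker container.

```go
package main

import "fmt"

// System is a stand-in for the dockerized testnet; here it is only a map
// from validator name to balance (a hypothetical simplification).
type System struct {
	Balances map[string]uint
}

// Action is anything that may modify the system state.
type Action interface {
	Apply(s *System) error
}

// sendTokens is a hypothetical action that moves tokens between validators.
type sendTokens struct {
	From, To string
	Amount   uint
}

func (a sendTokens) Apply(s *System) error {
	if s.Balances[a.From] < a.Amount {
		return fmt.Errorf("%s has insufficient funds", a.From)
	}
	s.Balances[a.From] -= a.Amount
	s.Balances[a.To] += a.Amount
	return nil
}

// Step pairs an action with the part of the state we expect afterwards.
type Step struct {
	Action   Action
	Expected map[string]uint // only the listed balances are checked
}

// runSteps executes each action and then compares the listed balances
// against the state of the system.
func runSteps(s *System, steps []Step) error {
	for i, step := range steps {
		if err := step.Action.Apply(s); err != nil {
			return fmt.Errorf("step %d: action failed: %w", i, err)
		}
		for val, want := range step.Expected {
			if got := s.Balances[val]; got != want {
				return fmt.Errorf("step %d: balance of %s is %d, expected %d", i, val, got, want)
			}
		}
	}
	return nil
}

func main() {
	sys := &System{Balances: map[string]uint{"alice": 100, "bob": 50, "carol": 10}}
	steps := []Step{
		{
			Action:   sendTokens{From: "alice", To: "bob", Amount: 30},
			Expected: map[string]uint{"alice": 70, "bob": 80}, // carol is not checked
		},
	}
	if err := runSteps(sys, steps); err != nil {
		fmt.Println("test failed:", err)
		return
	}
	fmt.Println("all steps passed")
}
```

Returning an error that names the failing step keeps the failure close to the action that caused it.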
|
||
## Defining a new test case | ||
|
||
This section explains how to define a new test case. For now, let's assume that | ||
all actions and state checks we want to perform already exist (see [actions.go](actions.go) | ||
for possible actions and [state.go](state.go) for possible state checks). | ||
Then what we need to do is the following: | ||
|
||
* Create a new test config (or decide on an existing one to reuse), see [config.go](config.go). | ||
The test config governs the config parameters of validators and chains that can be run in the test, | ||
for example we can set the genesis parameters of a chain using `ChainConfig.GenesisChanges`. | ||
* Define a sequence of actions and state checks to perform for our test case. | ||
* Add the new test case to the main file [main.go](main.go). | ||
|
||
|
||
For example, a short sequence of actions and state checks could look like this: | ||
```
steps := []Step{
	{
		// alice delegates 11000000 tokens to herself
		Action: delegateTokensAction{
			Chain:  ChainID("provi"),
			From:   ValidatorID("alice"),
			To:     ValidatorID("alice"),
			Amount: 11000000,
		},
		State: State{
			ChainID("provi"): ChainState{
				ValPowers: &map[ValidatorID]uint{
					ValidatorID("alice"): 511,
					ValidatorID("bob"):   500,
					ValidatorID("carol"): 500,
				},
			},
		},
	},
	{
		// alice delegates 99000000 tokens to bob
		Action: delegateTokensAction{
			Chain:  ChainID("provi"),
			From:   ValidatorID("alice"),
			To:     ValidatorID("bob"),
			Amount: 99000000,
		},
		State: State{
			ChainID("provi"): ChainState{
				ValPowers: &map[ValidatorID]uint{
					ValidatorID("alice"): 511,
					ValidatorID("bob"):   599,
					ValidatorID("carol"): 500,
				},
			},
		},
	},
}
```
In this sequence, we first use the `delegateTokensAction` | ||
to delegate 11000000 tokens from alice to herself, | ||
then check that alice's voting power was set to 511 (note that 11000000 tokens correspond to 11 voting power here). | ||
Afterwards, we use the same action again, this time to delegate 99000000 | ||
tokens from alice to bob, and again check the voting powers, with the change that | ||
bob's voting power should now be 599. | ||
|
||
For most steps, we can reuse existing code, for example | ||
the actions necessary to start a provider and multiple consumer chains | ||
are already "packaged together" and available as | ||
`stepsStartChains` in [steps_start_chains.go](steps_start_chains.go). | ||
|
||
**Note:** The parts of the state that are *not* defined are just *not checked*. | ||
For example, if the balance of a validator is not listed in a state, it means we | ||
do not care about the balance of that validator in this particular state. | ||
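
One way to picture "not defined means not checked" is with pointer fields, where a nil pointer marks a part of the state to skip. The sketch below is a simplified assumption, not the real `ChainState`: the two fields and the `matches` helper are hypothetical, and a map that is listed is compared as a whole here.

```go
package main

import (
	"fmt"
	"reflect"
)

// ChainState is a simplified, hypothetical version of the real struct:
// every field is a pointer, and nil means "this part is not checked".
type ChainState struct {
	ValPowers   *map[string]uint
	ValBalances *map[string]uint
}

// matches reports whether every non-nil field of expected equals the
// corresponding field of actual. Nil fields of expected are skipped.
func matches(expected, actual ChainState) bool {
	if expected.ValPowers != nil {
		if actual.ValPowers == nil || !reflect.DeepEqual(*expected.ValPowers, *actual.ValPowers) {
			return false
		}
	}
	if expected.ValBalances != nil {
		if actual.ValBalances == nil || !reflect.DeepEqual(*expected.ValBalances, *actual.ValBalances) {
			return false
		}
	}
	return true
}

func main() {
	actual := ChainState{
		ValPowers:   &map[string]uint{"alice": 511, "bob": 500},
		ValBalances: &map[string]uint{"alice": 42},
	}
	// Only voting powers are specified, so balances are ignored.
	expected := ChainState{ValPowers: &map[string]uint{"alice": 511, "bob": 500}}
	fmt.Println("state matches:", matches(expected, actual))
}
```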
|
||
### How to use relayers | ||
|
||
For relayers, there are currently several actions defined. | ||
There are some subtleties here. | ||
|
||
* `addChainToRelayer` gives the relayer the config for the chain and should be called before the relayer is expected to interact with the chain for the first time. | ||
* `createIbcClientsAction`, `addIbcConnectionAction` and `addIbcChannel` are used to create clients, connections and channels. | ||
* `relayPackets` relays all *currently queued* packets and their acknowledgements. | ||
* `startRelayer` starts the relayer, which means from now on, it will *automatically relay all packets*. | ||
Before `startRelayer`, it's possible to let packets queue up and not clear them. | ||
Afterwards, *all packets* will be relayed immediately, so it's no longer possible | ||
to e.g. delay the relaying of a packet. | ||
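
The difference between `relayPackets` and `startRelayer` can be illustrated with a toy model. This is only a sketch of the semantics described above, not the relayer's implementation: `relayerModel` and its methods are hypothetical, and the model assumes that starting the relayer also clears packets that were already queued.

```go
package main

import "fmt"

// relayerModel is a toy model of the relaying semantics, not the real relayer.
type relayerModel struct {
	started   bool
	queued    []string // packets sent but not yet relayed
	delivered []string // packets relayed to the counterparty chain
}

// send models a chain emitting a packet.
func (r *relayerModel) send(packet string) {
	if r.started {
		// Once the relayer runs, packets can no longer be delayed.
		r.delivered = append(r.delivered, packet)
		return
	}
	r.queued = append(r.queued, packet)
}

// relayPackets models the relayPackets action: it clears only what is
// queued at the moment it is called.
func (r *relayerModel) relayPackets() {
	r.delivered = append(r.delivered, r.queued...)
	r.queued = nil
}

// start models startRelayer. Assumption: the backlog is cleared as well.
func (r *relayerModel) start() {
	r.relayPackets()
	r.started = true
}

func main() {
	r := &relayerModel{}
	r.send("p1")
	r.send("p2")
	fmt.Println("queued:", len(r.queued), "delivered:", len(r.delivered))

	r.relayPackets()
	r.send("p3") // still delayed, since the relayer is not started
	fmt.Println("queued:", len(r.queued), "delivered:", len(r.delivered))

	r.start()
	r.send("p4") // relayed immediately
	fmt.Println("queued:", len(r.queued), "delivered:", len(r.delivered))
}
```

A test that needs a delayed packet therefore has to do its delaying before `startRelayer` appears in the step sequence.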
|
||
|
||
## Defining new actions | ||
|
||
It is necessary to define new actions when we want to test something that does not yet have the corresponding actions defined. | ||
For example, a new feature may introduce new transactions, and | ||
there is likely no existing action to submit these transactions to the chain. | ||
|
||
You can see the basic template for how to do this by looking at the actions in | ||
[actions.go](actions.go). | ||
The basic principle is to use `exec.Command` to execute a command inside the docker container. | ||
The pattern for this looks something like this: | ||
```
// requires the "log" and "os/exec" imports
cmd := exec.Command("docker", "exec",
	tr.containerConfig.InstanceName,
	tr.chainConfigs[action.Chain].BinaryName,
	"the command to execute, for example tx",
	"argument 1",
	"argument 2")
// CombinedOutput returns stdout and stderr together
output, err := cmd.CombinedOutput()
if err != nil {
	log.Fatal(err, "\n", string(output))
}

// potentially check something in the output, or log something, ...
```
|
||
**Note:** Actions don't need to check that the state was modified correctly, | ||
since we have the state checks for this. | ||
Still, it's generally a good idea to do a basic check for errors, | ||
since in case the action encounters an error, | ||
we'd rather fail inside the action and immediately know something went wrong, | ||
rather than just getting a non-matching state during the state check that happens after the action. | ||
|
||
### Gas | ||
When submitting transactions, generally either specify a large enough amount of gas manually, | ||
or use `--gas=auto` with a large `--gas-adjustment`. | ||
Avoid situations where transactions non-deterministically | ||
work sometimes and fail at other times due to gas, as can happen with `--gas=auto` and no `--gas-adjustment`. | ||
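
To apply this consistently, the gas flags can be appended in one place. The sketch below uses a hypothetical helper, `withGasFlags`, and the adjustment value 3 is an arbitrary assumption, not a value recommended by the framework.

```go
package main

import (
	"fmt"
	"strconv"
)

// withGasFlags returns a copy of args with --gas=auto and the given
// --gas-adjustment appended.
func withGasFlags(args []string, adjustment float64) []string {
	out := append([]string{}, args...) // copy, so the caller's slice is untouched
	return append(out,
		"--gas=auto",
		"--gas-adjustment="+strconv.FormatFloat(adjustment, 'f', -1, 64),
	)
}

func main() {
	// The transaction arguments are placeholders for illustration.
	args := withGasFlags([]string{"tx", "bank", "send"}, 3)
	fmt.Println(args)
}
```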
|
||
Why is gas consumption non-deterministic? This seems like an important aspect of our e2e testing framework (and probably a reason for our tests sometimes failing). Could you please expand on this section?

I don't know why, but you can verify it - run a testnet, then try submitting the same tx again and again with `--gas=auto` and `--gas-adjustment=1`. Occasionally, the tx will run out of gas. I used this script to run a tx again and again: Here is the relevant output:

I expanded the section a bit, please take another look. I'm not sure if there is anything specifically you would like to see here, if so, let me know

The section is good. @MSalopek do you have more context on the non-determinism. It's really intriguing
||
### Waiting for blocks/time | ||
|
||
To wait for blocks to be produced or for time to pass, | ||
you should avoid writing your own logic, and reuse the existing | ||
utility functions `testConfig.waitBlocks` and `testConfig.WaitTime`. | ||
These already take care of subtleties, like `WaitTime` working with | ||
[CometMock](https://github.com/informalsystems/CometMock) to | ||
advance the time instead of sleeping. | ||
|
||
## Defining new state checks | ||
|
||
When we want to check a part of the state that was never accessed before, it | ||
might be necessary to define a new state check. | ||
This is done by adding new fields to `ChainState` in [state.go](state.go). | ||
|
||
|
||
We also need to populate the newly added fields by querying the actual system state, | ||
which is done in `getChainState`. | ||
Typically, this is ultimately done, like actions, by issuing commands to the chain binary | ||
inside the docker container. See how this is done e.g. for `getBalance`. | ||
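
The two parts of a new state check (a new field, plus code that populates it from a query) could look roughly like the sketch below. Everything here is an assumption for illustration: the `NewParam` field, the `queryFunc` type, the query arguments, and the JSON shape `{"value": ...}` are all hypothetical, and filling in only the fields that the expected state specifies is one plausible design, not a description of `getChainState`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChainState is a simplified, hypothetical version of the real struct.
type ChainState struct {
	ValBalances *map[string]uint
	NewParam    *string // the newly added field
}

// queryFunc stands in for running a query command inside the container.
type queryFunc func(args ...string) ([]byte, error)

// getNewParam parses a hypothetical JSON response of the form {"value": "..."}.
func getNewParam(query queryFunc) (string, error) {
	out, err := query("query", "hypothetical-module", "param")
	if err != nil {
		return "", err
	}
	var resp struct {
		Value string `json:"value"`
	}
	if err := json.Unmarshal(out, &resp); err != nil {
		return "", fmt.Errorf("parsing query output: %w", err)
	}
	return resp.Value, nil
}

// populate fills in only the fields that the expected state specifies.
func populate(expected ChainState, query queryFunc) (ChainState, error) {
	var actual ChainState
	if expected.NewParam != nil {
		v, err := getNewParam(query)
		if err != nil {
			return actual, err
		}
		actual.NewParam = &v
	}
	return actual, nil
}

func main() {
	// A fake query, so that the sketch runs without a container.
	fake := func(args ...string) ([]byte, error) {
		return []byte(`{"value": "42"}`), nil
	}
	want := "42"
	actual, err := populate(ChainState{NewParam: &want}, fake)
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	fmt.Println("NewParam:", *actual.NewParam)
}
```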
|
||
## Traces | ||
|
||
|
||
It is possible to dump the test cases (in the form of actions+state checks) | ||
to files and read them from files, instead of just having them defined inside of Go code. | ||
The reasoning behind this is that | ||
with this, it becomes possible to generate test traces e.g. outside of | ||
the Go code via other tools. | ||
You should not need to write these JSON files by hand. | ||
If you want to just write an end-to-end test, write it inside the | ||
Go files, as outlined above. | ||
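
Because actions are values of different Go types, writing them to JSON and reading them back requires recording which type each action has. The sketch below shows one common way to do this with a type tag. It is an assumption for illustration, not the trace format used here: `taggedAction`, `marshalAction`, `unmarshalAction`, and the simplified `delegateTokens` action are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// delegateTokens is a hypothetical, simplified action.
type delegateTokens struct {
	From   string
	To     string
	Amount uint
}

// taggedAction stores an action together with its type name, so that the
// concrete Go type can be recovered when the trace is read back.
type taggedAction struct {
	ActionType string
	Action     json.RawMessage
}

// marshalAction wraps the action in a taggedAction and encodes it.
func marshalAction(a interface{}) ([]byte, error) {
	var name string
	switch a.(type) {
	case delegateTokens:
		name = "delegateTokens"
	default:
		return nil, fmt.Errorf("unknown action type %T", a)
	}
	raw, err := json.Marshal(a)
	if err != nil {
		return nil, err
	}
	return json.Marshal(taggedAction{ActionType: name, Action: raw})
}

// unmarshalAction reads the tag first, then decodes into the matching type.
func unmarshalAction(data []byte) (interface{}, error) {
	var t taggedAction
	if err := json.Unmarshal(data, &t); err != nil {
		return nil, err
	}
	switch t.ActionType {
	case "delegateTokens":
		var a delegateTokens
		if err := json.Unmarshal(t.Action, &a); err != nil {
			return nil, err
		}
		return a, nil
	default:
		return nil, fmt.Errorf("unknown action type %q", t.ActionType)
	}
}

func main() {
	data, err := marshalAction(delegateTokens{From: "alice", To: "bob", Amount: 99000000})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	action, err := unmarshalAction(data)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Printf("%+v\n", action)
}
```

With this approach, every new action type needs a case in both switches, which is the kind of bookkeeping that a trace format tends to require.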
|
||
### Updating the trace format and tests when adjusting the framework | ||
|
||
Some things in the test framework should stay consistent, in particular with respect to the trace format. | ||
When adding or modifying actions, please follow these guidelines: | ||
|
@@ -12,13 +190,26 @@ When adding a new proposal type: | |
* add a case for your proposal type to `json_utils.go/UnmarshalProposalWithType` | ||
* add a generator for your proposal type in `state_rapid_test.go` and add it to the `GetProposalGen` function | ||
|
||
## Regenerating Traces | ||
### Regenerating Traces | ||
|
||
The traces in `tracehandler_testdata` are generated by the test `trace_handlers_test.go/TestWriteExamples`. | ||
|
||
You can regenerate them by running `make e2e-traces` in the root of the repo. | ||
|
||
## Running against traces | ||
### Running against traces | ||
|
||
To run a test trace in the e2e tests, run `go run . --test-file $TRACEFILE::$TESTCONFIG`. | ||
See `--help` for more details. | ||
|
||
## Debugging and using the e2e infrastructure to set up a testnet | ||
|
||
|
||
When something in the tests goes wrong, a nice thing about the tests is that the | ||
docker container doesn't get killed. | ||
You can sh into the docker container via e.g. `docker exec -it "testinstance" sh` and manually look around. | ||
|
||
Useful pointers are: | ||
* Look at logs in the nodes' home folder, i.e., `/$CHAIN_ID/validator$VAL_ID` | ||
* Query/Run txs on the running apps (find out the relevant addresses and node homes to use e.g. by running `htop "binaryname"`) | ||
|
||
To debug an action, | ||
you can temporarily add a very long `time.Sleep` inside the action you are interested in, then sh into the docker container | ||
and e.g. try running the commands from the action yourself to see what happens. |
Would be nice to have a tl;dr or cheat sheet here, about running different kinds of tests & traces with different config.

Do you think `go run ./tests/e2e/... --help` explains it properly? I added a note to run that. If you think it's not helpful and there should be more here, please lmk, otherwise just resolve this :)