docs: improve documentation (ethereum#422)

This is basically a rewrite of the hive documentation. I've split up the docs into multiple files. README was getting a bit long and contained a mix of tutorials for different things and the API reference. In the new version, each major concept has its own page.

Co-authored-by: rene <[email protected]>
ethereum-optimism · Feb 4, 2021 · 6cadf25
1 parent 170dac0
commit 6cadf25
Showing 7 changed files with 1,071 additions and 366 deletions.
diff --git a/README.md b/README.md
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1 @@
+Start here: [overview.md](./overview.md).
diff --git a/docs/clients.md b/docs/clients.md
@@ -0,0 +1,126 @@
+[Overview] | [Hive Commands] | [Simulators] | [Clients]
+
+## Hive Clients
+
+This page explains how client containers work in Hive.
+
+Clients are docker images which can be instantiated by a simulation. A client definition
+consists of a Dockerfile and associated resources. Client definitions live in
+subdirectories of `clients/` in the hive repository.
+
+When hive runs a simulation, it first builds all client docker images using their
+Dockerfile, i.e. it basically runs `docker build .` in the client directory. Since most
+client definitions wrap an existing Ethereum client, and building the client from source
+may take a long time, it is usually best to base the hive client wrapper on a pre-built
+docker image from Docker Hub.
+
+Client Dockerfiles should support an optional argument named `branch`, which specifies the
+requested client version. This argument can be set by users by appending it to the client
+name like:
+
+    ./hive --sim my-simulation --client go-ethereum_v1.9.23,go-ethereum_v1.9.22
+
+See the [go-ethereum client definition][geth-docker] for an example of a client
+Dockerfile.
+
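As a rough illustration of the naming convention above, the following sketch splits a client spec at the first `_` into a client name and a requested version. `split_client_spec` is a hypothetical helper written for this example; it is not how hive itself parses the `--client` list, which is not described here.

```shell
# Split a client spec such as "go-ethereum_v1.9.23" into name and branch.
# split_client_spec is a hypothetical helper, not part of hive.
split_client_spec() {
    spec=$1
    case "$spec" in
        *_*)
            client_name=${spec%%_*}   # text before the first "_"
            client_branch=${spec#*_}  # text after the first "_"
            ;;
        *)
            client_name=$spec
            client_branch=""          # no version requested
            ;;
    esac
}

split_client_spec "go-ethereum_v1.9.23"
echo "client=$client_name branch=$client_branch"
```

A Dockerfile that declares a `branch` argument could then receive the value through something like `docker build --build-arg branch="$client_branch" .`, though the exact build invocation hive uses is an assumption here.
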
+## Client Lifecycle
+
+When the simulation requests a client instance, hive creates a docker container from the
+client image. The simulator can customize the container by passing environment variables
+with prefix `HIVE_`. It may also upload files into the container before it starts. Once
+the container is created, hive simply runs the entry point defined in the `Dockerfile`.
+
+For all client containers, hive waits for TCP port 8545 to open before considering the
+client ready for use by the simulator. If the client container does not open this port
+within a certain timeout, hive assumes the client has failed to start.
+
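The readiness check described above amounts to polling with a deadline. The sketch below shows that pattern in plain shell; `wait_until` is a hypothetical helper and the probe is a stand-in, so this is not hive's actual implementation.

```shell
# Retry a probe command once per second until it succeeds or the
# timeout (in seconds) expires. wait_until is a hypothetical helper.
wait_until() {
    timeout=$1
    shift
    deadline=$(( $(date +%s) + timeout ))
    until "$@"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1   # probe never succeeded: treat the client as failed
        fi
        sleep 1
    done
    return 0
}

# Example probe: succeeds once a file exists (stand-in for "port 8545 is open").
marker=$(mktemp)
if wait_until 5 test -e "$marker"; then
    echo "client ready"
else
    echo "client failed to start"
fi
rm -f "$marker"
```

With a netcat that supports `-z` installed, a port probe could look like `wait_until 180 nc -z "$client_ip" 8545`, where `client_ip` is a placeholder for the container address.
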
+Environment variables and files interpreted by the entry point define a 'protocol'
+between the simulator and client. While hive itself does not require support for any
+specific variables or files, simulators usually expect client containers to be
+configurable in certain ways. In order to run tests against multiple Ethereum clients, for
+example, the simulator needs to be able to configure all clients for a specific blockchain
+and make them join the peer-to-peer network used for testing.
+
+## Eth1 Client Requirements
+
+This section describes the requirements for Ethereum 1.x client wrappers in hive. Client
+entry point scripts must support this interface in order to be tested by existing Ethereum
+1.x-specific simulators.
+
+Clients must provide JSON-RPC over HTTP on TCP port 8545. They may also support JSON-RPC
+over WebSocket on port 8546, but this is not strictly required.
+
+### Files
+
+The simulator may customize client startup by placing these files into the client container:
+
+- `/genesis.json` contains Ethereum genesis state in the JSON format used by Geth. This
+  file is mandatory.
+- `/chain.rlp` contains RLP-encoded blocks to import before startup.
+- `/blocks/` directory containing `.rlp` files.
+
+On startup, client entry point scripts must first load the genesis block and state into
+the client implementation from `/genesis.json`. To do this, the script needs to translate
+from Geth genesis format into a format appropriate for the specific client implementation.
+The translation is usually done using a jq script. See the [openethereum genesis
+translator][oe-genesis-jq], for example.
+
+After the genesis state, the client should import the blocks from `/chain.rlp` if it is
+present, and finally import the individual blocks from `/blocks` in file name order. Two
+block sources are needed because importing a single chain file is more efficient, but
+tests that involve forked chains cannot express their blocks as a single chain. The client
+should start even if the blocks are invalid, i.e. after the import, the client's 'best
+block' should be the last valid, imported block.
+
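The startup order above (genesis first, then `chain.rlp` if present, then the files in `blocks/` by name) can be sketched as follows. `plan_imports` is a hypothetical helper that only prints the steps; a real entry point script would call the wrapped client's own import commands at each step, and those commands differ per client.

```shell
# Print the import steps in the order described above, for a given root
# directory. plan_imports is a hypothetical helper; a real entry point
# script would invoke the client's own import commands instead of echo.
plan_imports() {
    root=$1
    if [ ! -f "$root/genesis.json" ]; then
        echo "missing $root/genesis.json" >&2
        return 1                      # the genesis file is mandatory
    fi
    echo "genesis $root/genesis.json"
    if [ -f "$root/chain.rlp" ]; then
        echo "chain $root/chain.rlp"  # optional single chain
    fi
    # Shell globs expand in sorted order, which gives file name order.
    for block in "$root"/blocks/*.rlp; do
        [ -f "$block" ] || continue   # glob matched nothing
        echo "block $block"
    done
}

# Demo against a throwaway directory standing in for the container root.
demo=$(mktemp -d)
mkdir "$demo/blocks"
touch "$demo/genesis.json" "$demo/blocks/0002.rlp" "$demo/blocks/0001.rlp"
plan_imports "$demo"
rm -rf "$demo"
```

Inside a client container the root argument would be empty, so the paths resolve to `/genesis.json`, `/chain.rlp` and `/blocks/`. A failed block import should not abort the script, since the client is expected to start on the last valid block.
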
+### Environment
+
+Clients must support the following environment variables. The client's entry point script
+may map these to command line flags or use them to generate a config file, for example.
+
+| Variable                   | Value                | Description                                    |
+|----------------------------|----------------------|------------------------------------------------|
+| `HIVE_LOGLEVEL`            | 0 - 5                | configures log level of client                 |
+| `HIVE_NODETYPE`            | archive, full, light | sets sync algorithm                            |
+| `HIVE_BOOTNODE`            | enode URL            | makes client connect to another node           |
+| `HIVE_GRAPHQL_ENABLED`     | 0 - 1                | if set, GraphQL is enabled on port 8545        |
+| `HIVE_MINER`               | address              | if set, mining is enabled. value is coinbase   |
+| `HIVE_MINER_EXTRA`         | hex                  | extradata for mined blocks                     |
+| `HIVE_CLIQUE_PERIOD`       | decimal              | enables clique PoA. value is target block time |
+| `HIVE_CLIQUE_PRIVATEKEY`   | hex                  | private key for signing of clique blocks       |
+| `HIVE_SKIP_POW`            | 0 - 1                | disables PoW check during block import         |
+| `HIVE_NETWORK_ID`          | decimal              | p2p network ID                                 |
+| `HIVE_CHAIN_ID`            | decimal              | [EIP-155] chain ID                             |
+| `HIVE_FORK_HOMESTEAD`      | decimal              | [Homestead][EIP-606] transition block          |
+| `HIVE_FORK_DAO_BLOCK`      | decimal              | [DAO fork][EIP-779] transition block           |
+| `HIVE_FORK_TANGERINE`      | decimal              | [Tangerine Whistle][EIP-608] transition block  |
+| `HIVE_FORK_SPURIOUS`       | decimal              | [Spurious Dragon][EIP-607] transition block    |
+| `HIVE_FORK_BYZANTIUM`      | decimal              | [Byzantium][EIP-609] transition block          |
+| `HIVE_FORK_CONSTANTINOPLE` | decimal              | [Constantinople][EIP-1013] transition block    |
+| `HIVE_FORK_PETERSBURG`     | decimal              | [Petersburg][EIP-1716] transition block        |
+| `HIVE_FORK_ISTANBUL`       | decimal              | [Istanbul][EIP-1679] transition block          |
+| `HIVE_FORK_MUIRGLACIER`    | decimal              | [Muir Glacier][EIP-2387] transition block      |
+| `HIVE_FORK_BERLIN`         | decimal              | [Berlin][EIP-2070] transition block            |
+
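To illustrate how an entry point script might map these variables to command line flags, here is a minimal sketch covering four of them. The flag names (`--log-level`, `--network-id`, `--bootnode`, `--mine`, `--coinbase`) are placeholders invented for this example and do not belong to any particular client; `build_flags` is likewise hypothetical.

```shell
# Translate a few HIVE_ variables into command line flags.
# The flag names are placeholders, not the options of any real client.
build_flags() {
    flags=""
    if [ -n "${HIVE_LOGLEVEL:-}" ]; then
        flags="$flags --log-level $HIVE_LOGLEVEL"
    fi
    if [ -n "${HIVE_NETWORK_ID:-}" ]; then
        flags="$flags --network-id $HIVE_NETWORK_ID"
    fi
    if [ -n "${HIVE_BOOTNODE:-}" ]; then
        flags="$flags --bootnode $HIVE_BOOTNODE"
    fi
    if [ -n "${HIVE_MINER:-}" ]; then
        # Per the table, a set HIVE_MINER both enables mining and names the coinbase.
        flags="$flags --mine --coinbase $HIVE_MINER"
    fi
    echo "${flags# }"   # drop the leading space
}

HIVE_NETWORK_ID=7
HIVE_LOGLEVEL=3
build_flags
```

Variables that are unset or empty contribute no flag, so the wrapped client falls back to its own defaults for them.
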
+### Enode script
+
+Some tests require peer-to-peer node information of the client instance. The client
+container must contain an `/enode.sh` script that echoes the enode of the running
+instance. This script is executed by the Hive host in order to retrieve the enode URL.
+
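One possible shape for such a script is sketched below, assuming the client answers a node-info request over JSON-RPC on port 8545 and returns the URL in an `enode` field. `admin_nodeInfo` is the method Geth exposes for this; other clients may need a different call, and both helper names are hypothetical.

```shell
# Extract the enode URL from a JSON-RPC response read on stdin.
# extract_enode is a hypothetical helper; it assumes the response carries
# the URL in an "enode" string field.
extract_enode() {
    sed -n 's/.*"enode"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Ask the local node for its info over JSON-RPC on port 8545.
# admin_nodeInfo is the method Geth exposes; other clients may differ.
query_node_info() {
    curl -s -X POST -H 'Content-Type: application/json' \
        --data '{"jsonrpc":"2.0","method":"admin_nodeInfo","params":[],"id":1}' \
        http://127.0.0.1:8545
}

# A real /enode.sh could then be just:  query_node_info | extract_enode
echo '{"jsonrpc":"2.0","id":1,"result":{"enode":"enode://abc123@127.0.0.1:30303"}}' | extract_enode
```

The `sed` expression keeps the sketch free of extra dependencies; a script that already ships `jq` for genesis translation could use it for the extraction instead.
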
+[geth-docker]: ../clients/go-ethereum/Dockerfile
+[oe-genesis-jq]: ../clients/openethereum/mapper.jq
+[EIP-155]: https://eips.ethereum.org/EIPS/eip-155
+[EIP-606]: https://eips.ethereum.org/EIPS/eip-606
+[EIP-607]: https://eips.ethereum.org/EIPS/eip-607
+[EIP-608]: https://eips.ethereum.org/EIPS/eip-608
+[EIP-609]: https://eips.ethereum.org/EIPS/eip-609
+[EIP-779]: https://eips.ethereum.org/EIPS/eip-779
+[EIP-1013]: https://eips.ethereum.org/EIPS/eip-1013
+[EIP-1679]: https://eips.ethereum.org/EIPS/eip-1679
+[EIP-1716]: https://eips.ethereum.org/EIPS/eip-1716
+[EIP-2387]: https://eips.ethereum.org/EIPS/eip-2387
+[EIP-2070]: https://eips.ethereum.org/EIPS/eip-2070
+[Overview]: ./overview.md
+[Hive Commands]: ./commandline.md
+[Simulators]: ./simulators.md
+[Clients]: ./clients.md
diff --git a/docs/commandline.md b/docs/commandline.md
@@ -0,0 +1,101 @@
+[Overview] | [Hive Commands] | [Simulators] | [Clients]
+
+## Running Hive
+
+The hive project is implemented in Go. You need to install Go version 1.13 or later to use
+hive. To run simulations, you also need a working Docker setup, and hive needs to be run on the
+same machine as dockerd. Using docker remotely is not supported at this time. We have also
+not tested hive extensively on any OS but Linux, so you must run Linux to use hive.
+
+To get hive, you first need to clone the repository to any location, then build the hive
+executable.
+
+    git clone https://github.com/ethereum/hive
+    cd ./hive
+    go build .
+
+All hive commands should be run from within the root of the repository. To run a
+simulation, use the following command:
+
+    ./hive --sim <simulation> --client <client(s) you want to test against>
+
+For example, if you want to run the `discv4` test against geth and openethereum, here is
+how the command would look:
+
+    ./hive --sim devp2p/discv4 --client go-ethereum,openethereum
+
+The client list may contain any number of clients. You can select a specific client
+version by appending it to the client name with `_`, for example:
+
+    ./hive --sim devp2p/discv4 --client go-ethereum_v1.9.22,go-ethereum_v1.9.23
+
+Simulation runs can be customized in many ways. Here's an overview of the available
+command-line options.
+
+`--client.checktimelimit <timeout>`: How long hive waits for a client to open TCP port
+8545. If a very long chain is imported, this timeout may need to be quite long. A lower
+value means that hive gives up sooner when the node crashes and never opens the RPC port.
+Defaults to 3 minutes.
+
+`--docker.pull`: Setting this option makes hive re-pull the base images of all built
+docker containers.
+
+`--docker.output`: This enables printing of all docker container output to stderr.
+
+`--docker.nocache <expression>`: Regular expression selecting docker images to forcibly
+rebuild. You can use this option during simulator development to ensure a new image is
+built even when there are no changes to the simulator code.
+
+`--sim.timelimit <timeout>`: Simulation timeout. Hive aborts the simulator if it exceeds
+this time. There is no default timeout.
+
+`--sim.loglevel <level>`: Selects log level of client instances. Supports values 0-5,
+defaults to 3. Note that this value may be overridden by simulators for specific clients.
+This sets the default value of `HIVE_LOGLEVEL` in client containers.
+
+`--sim.parallelism <number>`: Sets max number of parallel clients/containers. This is
+interpreted by simulators. It sets the `HIVE_PARALLELISM` environment variable. Defaults
+to 1.
+
+`--sim.testlimit <number>`: Max number of tests to execute per client. This is interpreted
+by simulators. It sets the `HIVE_SIMLIMIT` environment variable.
+
+## Viewing simulation results (hiveview)
+
+The results of hive simulation runs are stored in JSON files containing test results, and
+hive also creates several log files containing the output of the simulator and clients. To
+view test results and logs in a web browser, you can use the `hiveview` tool. Build it
+with:
+
+    go build ./cmd/hiveview
+
+Run it like this to start the HTTP server:
+
+    ./hiveview --serve --logdir ./workspace/logs
+
+This command runs a web interface on <http://127.0.0.1:8080>. The interface shows
+information about all simulation runs for which information was collected.
+
+## Generating Ethereum 1.x test chains (hivechain)
+
+The `hivechain` tool allows you to create RLP-encoded blockchains for inclusion into
+simulations. Build it with:
+
+    go build ./cmd/hivechain
+
+To generate a chain of a desired length, run the following command:
+
+    ./hivechain generate -genesis ./genesis.json -length 200
+
+hivechain generates empty blocks by default. The chain will contain non-empty blocks if
+the following accounts have balance in genesis state. You can find the corresponding
+private keys in the hivechain source code.
+
+- `0x71562b71999873DB5b286dF957af199Ec94617F7`
+- `0x703c4b2bD70c169f5717101CaeE543299Fc946C7`
+- `0x0D3ab14BBaD3D99F4203bd7a11aCB94882050E7e`
+
+[Overview]: ./overview.md
+[Hive Commands]: ./commandline.md
+[Simulators]: ./simulators.md
+[Clients]: ./clients.md