Update journalling documentation
In particular, mention the word "resilience", so that it is easy to Google/grep/search for.
Kobzol committed Oct 14, 2024
1 parent 11d1604 commit 0055166
Showing 1 changed file with 27 additions and 13 deletions.
40 changes: 27 additions & 13 deletions docs/deployment/server.md
@@ -76,34 +76,48 @@ or using a terminal multiplexer like [tmux](https://en.wikipedia.org/wiki/Tmux).

## Resuming stopped/crashed server

When a server is started with a journal, it can be resumed even after it crashes.
The journal is a file to which the server writes a series of events.

You can start the server as follows:
The server supports resilience, which allows it to restore its state after it is stopped or if it crashes. To enable resilience, you can tell the server to log events into a *journal* file, using the `--journal` flag:

```bash
$ hq server start --journal /path/to/journal
```

If the server is stopped or crashes and you use the same command to start the server,
it will continue from the last point:
If the server is stopped or it crashes, and you use the same command to start the server (using the same journal file path), it will continue from the last point:

```bash
$ hq server start --journal /path/to/journal
```

This functionality restores the state of jobs and automatic allocation queues.
However, it does not restore worker connections; in the current version, new workers
have to be connected to the server after it restarts.
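
Because resuming only requires re-running the same command with the same journal path, the restart can be automated by a small wrapper. The following is a minimal sketch, not part of HyperQueue: the helper name `run_with_restart`, the restart limit, and the delay are hypothetical, and it assumes that a zero exit code means the server was stopped intentionally.

```python
import subprocess
import time


def run_with_restart(command, max_restarts=3, delay_seconds=1.0):
    """Run `command` and re-run the same command whenever it exits with an error.

    Returns the number of restarts that were performed.
    """
    restarts = 0
    while True:
        # Blocks until the process exits.
        returncode = subprocess.run(command).returncode
        # Assumption: exit code 0 means an intentional stop, so do not restart.
        if returncode == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay_seconds)


# Intended usage (hypothetical): always pass the same journal path, so the
# server can restore its state from it.
# run_with_restart(["hq", "server", "start", "--journal", "/path/to/journal"])
```

Workers are not covered by such a wrapper; they still have to be connected again after each restart.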

!!! warning

This functionality resumes the state of jobs and auto allocation queues,
not worker connections.
In the current version, new workers have to be connected to the server
when a new server is started.
If the server crashes, the last few seconds of progress may be lost. For example,
when a task is finished and the server crashes before the journal is written, then
after the server is resumed, the task will appear as not computed.

### Exporting journal events
If you'd like to programmatically analyze events that are stored in the journal file, you can
export them to JSON using the following command:

```bash
$ hq journal export <journal-path>
```

The events will be read from the provided journal and printed to `stdout` encoded in JSON, one
event per line (this corresponds to line-delimited JSON, i.e. [NDJSON](http://ndjson.org/)).
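
Since the output is line-delimited JSON, it can be processed one line at a time. The sketch below shows one possible way to do that in Python. The helper names and the sample lines are hypothetical, and no specific event fields are assumed, because the event format is not stable. It only counts which top-level keys occur.

```python
import json
from collections import Counter


def parse_ndjson(lines):
    """Yield one decoded JSON value per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_top_level_keys(events):
    """Count how often each top-level key occurs across object-shaped events."""
    counts = Counter()
    for event in events:
        if isinstance(event, dict):
            counts.update(event.keys())
    return counts


# Hypothetical sample lines; real journal events will look different.
sample = [
    '{"time": "2024-10-14T12:00:00Z", "event": {"type": "example-a"}}',
    '',
    '{"time": "2024-10-14T12:00:01Z", "event": {"type": "example-b"}}',
]
key_counts = count_top_level_keys(parse_ndjson(sample))
print(dict(key_counts))
```

To use it on real data, the export can be redirected into a file (for example `hq journal export <journal-path> > events.jsonl`, where the output file name is arbitrary) and the opened file object passed to `parse_ndjson`.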

You can also stream events directly from the server in real time using the following command:
```bash
$ hq journal stream
```
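
A streamed feed can be consumed from a script by reading the command's standard output line by line. The following is a minimal sketch under two assumptions: that `hq journal stream` prints one JSON event per line, like the export, and that the command is run somewhere it can reach the server. The helper name `stream_events` is hypothetical.

```python
import json
import subprocess


def stream_events(command=("hq", "journal", "stream")):
    """Run `command` and yield one decoded JSON value per line of its stdout."""
    process = subprocess.Popen(list(command), stdout=subprocess.PIPE, text=True)
    try:
        for line in process.stdout:
            line = line.strip()
            if line:
                yield json.loads(line)
    finally:
        # Stop the child process if the consumer stops iterating early.
        process.terminate()
        process.stdout.close()
        process.wait()


# Intended usage (hypothetical):
# for event in stream_events():
#     print(event)
```

Taking the command as a parameter makes it possible to try the function with any program that prints JSON lines, without a running server.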

!!! warning

If the server crashes, the last few seconds of progress may be lost. For example,
when a task is finished and the server crashes before the journal is written, then
after resuming the server, it will appear as not computed.
The JSON format of the journal events and their definition is currently unstable and can change
with a new HyperQueue version.

## Stopping server
