Rework about the full solution
+ Add an example which can be used to test
+ Fix most of typo mistakes
+ Remove dyn timeout, fix commit filter, modify term construction
+ Test and fix script calls
+ Merge term management with log_entries
+ Simplify status management
+ Remove remote follower acknowledgement logic
+ Remove the aggregation mechanism for now

It also stops leading when the quorum is unreachable in this version.
adrien-zinger committed Dec 2, 2023
1 parent 26093af commit 531c846
Showing 67 changed files with 3,029 additions and 1,224 deletions.
418 changes: 342 additions & 76 deletions Cargo.lock

Large diffs are not rendered by default.

197 changes: 110 additions & 87 deletions README.md
@@ -1,138 +1,161 @@
# Hook / Raft Library

Hook is the _Proof Of Concept_ of an implementation of the raft algorithm. It aims to separate the work of a node from the raft logic (inspired by hooks in git).
Hook is an implementation of the raft algorithm. It aims to extract the core of
the algorithm, handling only the election and the log transfers between
nodes.

The implementation follows the rules of the Raft algorithm during its
execution except for the connection of new nodes. The connection
process is described in [the init specification](./specs/initial flow charts/init.md).
execution.

## _Principia raftica_

The raft algorithm works with terms, and that's what hook does. By default the terms are empty because their content doesn't matter at first. Even though you can hack this project and fill the terms, the main goal of hook is to know who the leader of the network is.

So Hook will connect all nodes of the network (accepting new nodes with a bootstrap strategy) and send basic pings as terms for the raft.
The raft algorithm works with terms, and that's what hook does.
Hook's goal is to commit as many consensually approved terms as possible.
By default, terms are empty because their content doesn't matter at first.
But you can choose what's inside a term by using a *hook*.

## Hook-Raft? 🪝

That's all? Noooo, I called the project Hook because it implements logic similar to the hooks in a git repository.
Hook implements logic similar to the hooks in a git repository. You can
implement behaviors that interact with the basic Raft algorithm.

First, it allows any user to define the content of terms. But it can also
interact with the default behaviors. That lets anyone hack on or try out a
configuration easily.

There is a trait named *Hook* that you can implement however you want. That trait is given to the hook library through the entry point that is:
There is a trait named *Hook*. That trait is given to the hook library through
the entry point that is:

```rust
/// Start a new node
fn main() {
    let rt = tokio::runtime::Runtime::new().expect("Runtime expected to start but failed");
    match Node::new(DefaultHook {}).start(rt) {
        Ok(_) => println!("Successfully exit"),
        Err(err) => eprintln!("Node crash with error: {:?}", err),
    }
pub trait Hook: Send + Sync {
    fn update_node(&self) -> bool;
    fn pre_append_term(&self, term: &Term) -> Option<usize>;
    fn append_term(&self, term: &Term) -> bool;
    fn commit_term(&self, term: &Term) -> bool;
    fn prepare_term(&self) -> String;
    fn retreive_term(&self, index: usize) -> Option<Term>;
    fn retreive_terms(&self, from: usize, to: usize) -> Option<Vec<Term>>;
    fn switch_status(&self, status: EStatus);
}
```
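
As a rough illustration of what an implementation of the trait above might look like, here is a minimal in-memory sketch. `Term`, `EStatus` and the trait itself are redefined locally as stand-ins so that the example compiles on its own; their fields and variants are assumptions, not the library's actual definitions. `MemoryHook` is a hypothetical name, and the inclusive range in `retreive_terms` is also an assumption.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the library's term type (fields are assumed).
#[derive(Clone, Debug, PartialEq)]
pub struct Term {
    pub id: usize,
    pub content: String,
}

// Hypothetical stand-in for the library's status enum (variants are assumed).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum EStatus {
    Follower,
    Candidate,
    Leader,
}

// Local copy of the trait shown above, so the sketch needs no dependency.
pub trait Hook: Send + Sync {
    fn update_node(&self) -> bool;
    fn pre_append_term(&self, term: &Term) -> Option<usize>;
    fn append_term(&self, term: &Term) -> bool;
    fn commit_term(&self, term: &Term) -> bool;
    fn prepare_term(&self) -> String;
    fn retreive_term(&self, index: usize) -> Option<Term>;
    fn retreive_terms(&self, from: usize, to: usize) -> Option<Vec<Term>>;
    fn switch_status(&self, status: EStatus);
}

/// Keeps committed terms in memory and looks them up by id.
#[derive(Default)]
pub struct MemoryHook {
    committed: Mutex<Vec<Term>>,
}

impl Hook for MemoryHook {
    fn update_node(&self) -> bool {
        true // accept every node
    }
    fn pre_append_term(&self, term: &Term) -> Option<usize> {
        Some(term.id) // echo the incoming id, i.e. accept gaps
    }
    fn append_term(&self, _term: &Term) -> bool {
        true // nothing to apply in this sketch
    }
    fn commit_term(&self, term: &Term) -> bool {
        self.committed.lock().unwrap().push(term.clone());
        true
    }
    fn prepare_term(&self) -> String {
        String::from("default") // arbitrary content chosen for the sketch
    }
    fn retreive_term(&self, index: usize) -> Option<Term> {
        let committed = self.committed.lock().unwrap();
        committed.iter().find(|t| t.id == index).cloned()
    }
    fn retreive_terms(&self, from: usize, to: usize) -> Option<Vec<Term>> {
        // Assumption: the range is inclusive; one missing term yields None.
        (from..=to).map(|id| self.retreive_term(id)).collect()
    }
    fn switch_status(&self, status: EStatus) {
        println!("status is now {:?}", status);
    }
}

fn main() {
    let hook = MemoryHook::default();
    let term = Term { id: 1, content: hook.prepare_term() };
    if hook.pre_append_term(&term).is_some() && hook.append_term(&term) {
        hook.commit_term(&term);
    }
    hook.switch_status(EStatus::Leader);
    println!("{:?}", hook.retreive_term(1));
}
```

The `Mutex` is there because every method takes `&self` and the trait requires `Send + Sync`, so any mutable state has to live behind interior mutability.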

You can also define the settings manually with the function
`new_with_settings`; otherwise the library looks for a file named
`settings.toml` in the root folder. The settings are described below.


The trait `Hook` can be a default **VOID** with the `DefaultHook`
object but can be whatever you want. This object is basically an observer
that the *Raft* algorithm will trigger on change.
that the *Raft* algorithm will trigger whenever it needs to.

## Default Hook

Let's look at the available hooks that are triggered:
The default hook binary reacts through the following scripts or executables.
All of these scripts are optional: add a '.sample' extension to a script, or
remove it, to enable the internal default behavior.

```bash
└── hooks
├── pre-append-term
├── append-term
├── apply-term # todo
├── commit-term
├── update-node
├── leader-change # todo
├── pre-add-connection # todo
├── post-add-connection # todo
├── remove-connection # todo
├── lost-connection # todo
├── request-vote # todo
├── receive-vote # todo
├── prepare-term
└── send-term # todo
└── hook
├── hook.bin
├── append_term
├── commit_term
├── pre_append_term
├── prepare_term
├── retreive_n_term
├── retreive_term
└── switch_status
```

- _pre-append-term_: A term is appended by the leader; the result of the script has to be "true", otherwise the term is rejected (Hook accepts all terms by default)
- _apply-term_: You have a term with content and you want to apply it to a state machine or something similar? This is the place
- _append-term_: A term has just passed all the tests and is stored in cache, waiting to be committed. The same term can be appended several times
- _commit-term_: The term is considered as definitive by the current leader
- _leader-change_: A new leader won the election
- _pre-add-connection_: A new node wants to join the network; the script has to return "true" to accept the connection. Any other response is considered a rejection. (Look at [default hook requirement](#default-hook-node-requirements))
- _post-add-connection_: A connection has been added in the network
- _remove-connection_: A node has been removed from the network
- _lost-connection_: A node has failed to answer, or the leader failed to send a heartbeat
- _request-vote_: Vote requested for a new leader
- _receive-vote_: Received a response for your personal vote request if you are a candidate
- _prepare-term_: If you are the leader, you can fill the content of `terms` with this script; the returned value is considered as the full raw content. This script is called `PREPARING_OFFSET` ms before each heartbeat.
- _send-term_: It's called when you're a leader and you send a term; you can check here whether you successfully prepared the last term
- _update-node_: A node requires to connect or to change its status (follower/not follower). You can dismiss every connection if you want or limit the number of nodes

The status of the implementation is watched with the `config` crates and is updated on save in an acceptable time.
- _append_term_: A new term has to be applied. This might be volatile and you
may apply the same term multiple times. It's up to the user to handle that
behavior in their own logic. It takes 2 arguments, the term id and the
content. It doesn't have to print anything on the standard output. In case of
failure, if you're a follower, the remote leader will receive an error; if
you're a leader, you'll go idle and start a candidacy.
- _commit_term_: The term is considered as definitive by the current leader.
Appended once. It takes 2 arguments, the term id and its content.
- _pre_append_term_: A term is received from a potential leader, but it has to pass the user checks.
It takes 2 arguments, the id of the term and the content. To avoid gaps, the user should print
the `latest term id + 1` on the standard output. The default behavior is to accept gaps and
always print the first argument.
- _prepare_term_: If you are the leader, you can fill the terms by writing
their content on the standard output. Hook takes care of the id and the
replication. As a leader, don't append the term now; wait for the
`append_term` call. Called every `prepare_term_period`.
- _retrieve_term_: If you're a leader, this hook serves to rebuild a term which
isn't in cache anymore. The terms to rebuild are supposed to have been
committed previously. It takes 1 argument, the term id. It expects to read
the content of the term on the standard output. If the hook fails, the node
goes idle until the next election. The default behavior is to create
a new "default" term (a term with default written in the content).
- _retrieve_n_term_: If Hook needs more than one term to rebuild, it will first
try to use this one instead of the *retrieve_term* hook. It takes 2
arguments, the begin and the end id. It expects to read on the
standard output a JSON formatted list of terms
with the format `[{'id':12,'content':'hello world'}]`.
- _switch_status_: Notification of every status change over time; it
takes one argument "candidate"|"follower"|"leader". It doesn't expect any
output.
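
To make the two `pre_append_term` strategies described above more concrete, here is a small sketch of such a hook written as a Rust executable. It assumes the term id arrives as the first command-line argument, as stated in the list. How the script learns the latest known term id is not specified, so the `LATEST_TERM_ID` environment variable is purely hypothetical, as are the function name `reply` and the exit code used on failure.

```rust
use std::env;

/// Computes the id to print on the standard output.
/// - `latest_id == None`: default behaviour, echo the incoming id (gaps accepted).
/// - `latest_id == Some(n)`: gap-free behaviour, answer `n + 1`.
fn reply(args: &[String], latest_id: Option<usize>) -> Result<usize, String> {
    // The first argument is the term id, the second one (unused here) its content.
    let incoming = args
        .first()
        .ok_or("missing term id")?
        .parse::<usize>()
        .map_err(|e| format!("bad term id: {e}"))?;
    Ok(match latest_id {
        Some(latest) => latest + 1,
        None => incoming,
    })
}

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    // Hypothetical way to learn the latest stored id; a real script would
    // more likely read it from its own storage.
    let latest = env::var("LATEST_TERM_ID")
        .ok()
        .and_then(|v| v.parse::<usize>().ok());
    match reply(&args, latest) {
        Ok(id) => println!("{id}"),
        Err(err) => {
            eprintln!("{err}");
            std::process::exit(1); // assumption: non-zero exit signals failure
        }
    }
}
```

Keeping the decision in a pure function makes the gap policy easy to test without spawning the executable.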

### Raft settings

When starting a new node, you can target a settings file. Note that you shouldn't run a node on a network with random settings because you may fail to connect.
When you start a node, you can target a settings file.

```toml
timeout_min = 150
timeout_max = 300
# Min and max value in milliseconds of the election timeout. Timeout is randomly chosen between these two values

max_timeout_value = 300
# Maximum value in milliseconds of the heartbeat timeout if you're a potential candidate
min_inc_timeout = 150
min_inc_timeout = 300
# Min and max value in milliseconds of the random increment of our timeout each time we receive a new term.
# Min and max value in milliseconds of the election timeout. Timeout is randomly
# chosen between these two values.
timeout_min = 500
timeout_max = 800

prepare_term_period = 80
# Value in milliseconds that separates term preparations, default 80
# If this time is too short to finish the term preparation, an empty heartbeat
# will be sent and the content will be used for the next term. Hook doesn't
# implement any problem management if you fail multiple times to send a term.
# You can manage it yourself with the `send-term` script
prepare_term_period = 80

nodes = ['12.158.20.36']
# Optional list of publicly known nodes in a network. If this list is empty, the node won't connect to anything and will be the current leader of its own network.

# todo followers = ['15.126.208.72']
# Optional list of known followers
# List of public known nodes in the network.
nodes = ['12.13.14.15:8080']

addr = "127.0.0.1"
# Server local address, default "127.0.0.1"
port = "3000"
addr = "127.0.0.1"
# Port used between nodes to communicate, default "3000"
port = "3000"

follower = true
# If true, the current node will never ask for an election and will never be able to vote. Nevertheless you will receive all heartbeats and all information like a normal node. Some hooks will obviously never be called, but you are a part of the network. If false, you will be considered as a potential candidate after a successful bootstrap and will be able to vote.
# default true
# If true, the current node will never ask for an election. Nevertheless you
# will receive all heartbeats and all information like a normal node. Some hooks
# will obviously never be called, but you are a part of the network. If false,
# you will be considered as a potential candidate.
#
# default false
follower = false

response_timeout = 20
# Value in milliseconds before considering that a node will never respond
# Value in milliseconds before considering that a remote node will never respond
response_timeout = 200
```
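
The comment on `timeout_min` and `timeout_max` says that the election timeout is picked randomly between the two values. A minimal sketch of that idea, using only the standard library, could look like the following. The function name `election_timeout`, the inclusive upper bound and the clock-based entropy source are assumptions made for the example; they are not taken from the library.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Maps an arbitrary `entropy` value into the inclusive range
/// [min_ms, max_ms] and returns it as a duration.
fn election_timeout(min_ms: u64, max_ms: u64, entropy: u64) -> Duration {
    assert!(min_ms <= max_ms, "timeout_min must not exceed timeout_max");
    let span = max_ms - min_ms + 1; // number of candidate values
    Duration::from_millis(min_ms + entropy % span)
}

fn main() {
    // Cheap entropy for the sketch only: the sub-second part of the clock.
    // A real node would more likely use a proper random number generator.
    let entropy = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| u64::from(d.subsec_nanos()))
        .unwrap_or(0);
    // 500 and 800 are the `timeout_min` and `timeout_max` values shown above.
    let timeout = election_timeout(500, 800, entropy);
    println!("next election timeout: {timeout:?}");
}
```

Randomizing the timeout per node is what keeps several followers from starting an election at the same instant.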

## Run The node!

_todo: can use hook raft as a library, or use a ready to use executable that is in a subproject._
## Run The node

## Default Hook Node Requirements
That repository contains a Rust library with all the tools to make a private
implementation. The basic implementation is as simple as:

Hook accepts all connections by default if requirements are ok.
```Rust
use hook_raft::*;

Rejected causes:

- The node wants to be a candidate but failed to communicate with another randomly chosen node in the follower list (I don't know if it's really useful)
- One of the nodes rejected your connection

## Bootstrapping

The bootstrap system isn't managed here. If you want to implement a bootstrap strategy, you can develop your own server that uses hook and check in the `update-node` script whether the node attempting to connect meets the requirements.
/// Start a new node
fn main() {
    let rt = tokio::runtime::Runtime::new().expect("Runtime expected to start but failed");
    match Node::new(DefaultHook {}).start(rt) {
        Ok(_) => println!("Successfully exit"),
        Err(err) => eprintln!("Node crash with error: {:?}", err),
    }
}
```

However, the _Hook-Raft_ implementation is bootstrap friendly and will send by default all logs committed since the latest log known by the connecting node. That part is, after all, likely to change soon with the next _PRs_! The idea is to extract that log retrieval process from here. So in the future, the connecting node will probably receive only the latest committed log.
## Some information

- Hook nodes communicate over HTTP.
- Hook scripts have to be executable by the local user to work properly.
- The default binary is agnostic to the content of terms. How the content is
  distributed, why it is distributed, and how it is used are left to the user.
- Bootstrapping isn't managed, nor are cluster membership changes or
  log compaction.

4 changes: 4 additions & 0 deletions example/hook/.gitignore
@@ -0,0 +1,4 @@
/target
term_*
*.log
test/hook