
Distributed in-memory key/value store for mapreduce #181

Merged
mikkeldenker merged 12 commits into main from mapreduce-dht on Mar 17, 2024
Conversation

mikkeldenker
Member

This is the first step in moving the mapreduce code to an AMPC model, which will allow us to implement some distributed graph algorithms more efficiently. The main difference between the MPC and AMPC models is that we use a distributed key/value store for synchronization and to store the results of each round. This lets an algorithm dynamically choose which segments of the previous round's results it needs, which can reduce the overall number of rounds.

The key/value store is a distributed BTreeMap<Vec<u8>, Vec<u8>> where each key is routed to the correct shard based on md5(key) % num_shards. Each shard uses Raft-based consensus for high availability. Shard rebalancing is not yet implemented, so adding or removing a shard will cause keys to be routed to incorrect shards.
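As a minimal sketch of that routing rule (assuming the `md5` crate, and treating the first 8 digest bytes as the integer to reduce, which is an illustrative choice rather than the actual implementation):

```rust
/// Route a key to its shard: md5(key) % num_shards, as described above.
/// Uses the `md5` crate; the byte-to-integer conversion is an assumption
/// made for this sketch.
fn shard_for_key(key: &[u8], num_shards: u64) -> u64 {
    let digest = md5::compute(key); // 16-byte MD5 digest
    // Interpret the first 8 digest bytes as a big-endian integer and reduce
    // modulo the shard count.
    let prefix = u64::from_be_bytes(digest.0[..8].try_into().unwrap());
    prefix % num_shards
}

fn main() {
    let num_shards = 4;
    for key in ["node-1", "node-2", "node-3"] {
        println!("{key} -> shard {}", shard_for_key(key.as_bytes(), num_shards));
    }
}
```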

The paper uses remote direct memory access (RDMA), which we don't use here, but we should be able to get close to the same average latency by batching requests from the algorithm implementation, which will hopefully amortize the per-request overhead enough.
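A rough sketch of that batching idea, with hypothetical `DhtClient` and `batch_get` names rather than the actual API: group the keys a round needs by owning shard and issue one request per shard instead of one per key.

```rust
use std::collections::HashMap;

/// Illustrative client handle; the real client would hold sonic connections
/// to the Raft leader of each shard.
struct DhtClient {
    num_shards: u64,
}

impl DhtClient {
    /// Placeholder for the md5-based routing sketched earlier.
    fn shard_for_key(&self, key: &[u8]) -> u64 {
        (key.len() as u64) % self.num_shards
    }

    /// Fetch many keys with one request per shard rather than one per key,
    /// amortizing the per-request network overhead across the whole batch.
    fn batch_get(&self, keys: &[Vec<u8>]) -> HashMap<Vec<u8>, Option<Vec<u8>>> {
        // Group the requested keys by the shard that owns them.
        let mut by_shard: HashMap<u64, Vec<Vec<u8>>> = HashMap::new();
        for key in keys {
            by_shard
                .entry(self.shard_for_key(key))
                .or_default()
                .push(key.clone());
        }

        // One (hypothetical) request per shard, results merged for the caller.
        let mut results = HashMap::new();
        for (_shard, shard_keys) in by_shard {
            // In the real system this would be a single sonic request carrying
            // all of `shard_keys`; here the lookups are just recorded as misses.
            for key in shard_keys {
                results.insert(key, None);
            }
        }
        results
    }
}
```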

also improve sonic error handling. there is no need for handle to return a sonic::Result; it's better for each specific message to have a Result<...> as its response, since that can then be properly handled on the caller side (a rough sketch of this shape follows the commit notes below)
… and was therefore a bit misleading. remove it and add a send_with_timeout_retry method to the normal connection, with sane defaults in the .send method
…ky Response::Set(Ok(())) for internal raft entries
in raft, writes are committed to a majority quorum. if we have a cluster of 3 nodes, this means we can only be sure that 2 of the nodes got the data, so the test might fail if we are unlucky and check the node that hasn't received the data yet. by having a cluster of 2 nodes instead, the majority is both nodes, so we can be sure that both nodes always receive all writes.
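To make the sonic error-handling shape mentioned above concrete, here is a hedged sketch of per-message results: the response for a write carries its own application-level Result (e.g. the Response::Set(Ok(())) referenced in the commits), so the caller can distinguish a transport failure from an operation-level rejection. The SetError type and its variants are hypothetical, not the actual error type.

```rust
/// Hypothetical request/response shapes for the store; only the shape matters.
enum Request {
    Set { key: Vec<u8>, value: Vec<u8> },
    Get { key: Vec<u8> },
}

/// Hypothetical application-level error, distinct from transport errors.
#[derive(Debug)]
enum SetError {
    NotLeader,
    Unavailable,
}

enum Response {
    // The operation's outcome travels inside the response itself, so the
    // caller can distinguish "the RPC failed" from "the operation failed".
    Set(Result<(), SetError>),
    Get(Option<Vec<u8>>),
}

/// Caller-side handling: match on the message-specific Result.
fn handle_response(resp: Response) {
    match resp {
        Response::Set(Ok(())) => println!("write applied"),
        Response::Set(Err(e)) => eprintln!("write rejected: {e:?}"),
        Response::Get(value) => println!("got {value:?}"),
    }
}
```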
@mikkeldenker mikkeldenker merged commit 37b6c7d into main Mar 17, 2024
3 checks passed
@mikkeldenker mikkeldenker deleted the mapreduce-dht branch March 19, 2024 08:32