
[Question] What does this do about churn (& related madness)? #6

Open
faddat opened this issue Feb 1, 2016 · 3 comments

Comments


faddat commented Feb 1, 2016

churn (as pertains to p2p networks) - a node entering or leaving the network.

The only piece that I don't see here is that one. I'm going to describe the solution we have planned for dealing with it below, and I'd love to hear your thoughts on the extent to which this solution:

A) Scales
B) Works alongside what you've got now overall

My initial thought was to centralize storage, but centralization (blah blah blah)

My new thought is to use something like Syncthing to ensure that all of the nodes always have the same registry of images and container diffs. That way, when a node goes down and something (might want to have a look at the unfinished github.com/superordinate/kdaemon) brings its containers back up, things are exactly as they were. For the use cases I'm pursuing to work out, I have to create a situation where containers are relaunched when their host goes down, and I also have to account for the fact that we won't be able to predict nodes' departure from the network.
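The relaunch-after-churn step described above can be sketched as a reconciliation between the shared registry and what is actually running. This is a minimal, hypothetical sketch; the `Container` type and `plan_relaunches` function are illustrative names, not part of any project mentioned here.

```python
# Hypothetical sketch: diff a synced container registry against what is
# actually running, to decide what a churn-recovery daemon should relaunch.

from dataclasses import dataclass

@dataclass(frozen=True)
class Container:
    name: str
    image: str

def plan_relaunches(registry, running_names):
    """Return containers listed in the shared registry but not currently running."""
    return [c for c in registry if c.name not in running_names]

# Example: the synced registry says two containers should exist; only one is up
# (as if its host just left the network).
registry = [Container("web", "nginx:1.9"), Container("db", "postgres:9.4")]
running = {"web"}
missing = plan_relaunches(registry, running)
print([c.name for c in missing])  # -> ['db']
```

A real version would feed `running` from the container runtime and hand `missing` to whatever respawns containers on a surviving host.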

(I should mention that I do not consider this a great or full solution)

@faddat faddat changed the title [Question] What does this do about churn & health checks? [Question] What does this do about churn (& related madness)? Feb 1, 2016

stlalpha commented Feb 3, 2016

We agree, and we've been looking at different techniques to keep the info blorbs synchronized (both purposefully executed saves and diffs, as well as continuously updating CPU/mem state deltas for shadow nodes), including something like Syncthing, but possibly mechanized (or via another appropriate mechanism) through the externalized cnvm tools, so that all necessary states and bits are tracked and flagged for push as appropriate. Really, it's all about state maintenance.

So I think using something like Syncthing as a transport isn't a bad idea, but we may want/need a way to do effective QoS or prioritization of the different data types across it. One of those data types could certainly be node/cluster state (for the "re-animation" post shut-down that you describe above).
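The prioritization idea above can be sketched as a simple priority queue in front of the transport: node/cluster state ships before container diffs, which ship before bulk image layers. The priority ordering and type names here are illustrative assumptions, not anything defined in this thread.

```python
# Hypothetical sketch: prioritize data types queued for a Syncthing-like
# transport. Smaller priority number = ships first; a counter keeps FIFO
# order within the same priority.

import heapq
import itertools

PRIORITY = {"node-state": 0, "container-diff": 1, "image-layer": 2}
_counter = itertools.count()
queue = []

def enqueue(kind, payload):
    heapq.heappush(queue, (PRIORITY[kind], next(_counter), kind, payload))

def drain():
    """Yield queued items in priority order, emptying the queue."""
    while queue:
        _, _, kind, payload = heapq.heappop(queue)
        yield kind, payload

# Items arrive in arbitrary order...
enqueue("image-layer", "layer-abc")
enqueue("node-state", "node-7 heartbeat")
enqueue("container-diff", "web rw-layer delta")

# ...but cluster state jumps the line.
print([kind for kind, _ in drain()])
# -> ['node-state', 'container-diff', 'image-layer']
```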

Have you tried just shipping the filesystem structures across with it as docker stands? Does it work?


faddat commented Feb 3, 2016

Actually, I am doing that right now. Here's the arch:

GlusterFS (surprisingly easy!) running on super-peers, from which we're assuming the highest level of reliability; not 100%, mind you, but high. The state bit does get tough, doesn't it? Where is state stored in a typical deploy? I've seen CRIU in action before, but I have no idea of its internals. Anyway, our leaf nodes will use the GlusterFS volume driver to store their Docker data. The super-peer is a health-check runner, and if it finds that a node or container is missing, it respawns that node's Docker containers. I'll let you know how the test goes when it's finished :).
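The super-peer health-check runner described above can be sketched as: probe each leaf node, and collect the containers of unreachable nodes for respawning elsewhere. Everything here is a hypothetical illustration; `find_orphans` and the fake `probe` are assumed names, and a real probe might be a TCP connect, a ping, or a Docker API call.

```python
# Hypothetical sketch of a super-peer health-check pass. probe(node) -> bool
# answers "is this leaf node alive?"; here it is stubbed with a set lookup.

def find_orphans(nodes, probe):
    """nodes: {node_name: [container names]}. Returns the containers whose
    host failed the health check, in deterministic (sorted-node) order."""
    orphans = []
    for node, containers in sorted(nodes.items()):
        if not probe(node):
            orphans.extend(containers)
    return orphans

# Example with a fake probe: leaf-2 has dropped out of the network.
nodes = {"leaf-1": ["web"], "leaf-2": ["db", "cache"]}
alive = {"leaf-1"}
orphans = find_orphans(nodes, lambda n: n in alive)
print(orphans)  # -> ['db', 'cache']
```

In the architecture above, the respawn step would then recreate each orphaned container from the image and diffs stored on the shared GlusterFS volume.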


faddat commented Feb 18, 2016

Looks like we are on to a ZFS pool strategy. I'll let you know how it pans out in a global deployment.
