
# Mnesia

Start with CACHE_STRATEGY=mnesia.

Uses the Mnesia database to store shortened URL data. The moving parts are:

- local cache, which is simply access to the data stored in Mnesia, see `Abbr.Mnesia.Local`
- cache synchronization service that reacts to Mnesia topology changes, see `Abbr.Mnesia.Sync`
- entry point to the cache interface, see `Abbr.Mnesia`
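The local cache part can be sketched roughly as follows. This is a hedged sketch, not the actual `Abbr.Mnesia.Local` implementation: the `save/1`/`lookup/1` names and the `{Url, short, long}` record layout are assumptions.

```elixir
defmodule LocalSketch do
  # Assumed table: records of the shape {Url, short, long}.
  @table Url

  # Write inside a transaction; Mnesia replicates the write
  # to every node holding a copy of the table.
  def save(%{short: short, long: long}) do
    {:atomic, :ok} =
      :mnesia.transaction(fn -> :mnesia.write({@table, short, long}) end)

    :ok
  end

  # Read inside a transaction; returns :error when the key is absent.
  def lookup(short) do
    case :mnesia.transaction(fn -> :mnesia.read({@table, short}) end) do
      {:atomic, [{@table, ^short, long}]} -> {:ok, long}
      {:atomic, []} -> :error
    end
  end
end
```

Running this requires a node with `:mnesia.start/0` called and the table created beforehand.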

Mnesia automatically replicates the stored data across the cluster. However, you need to handle network splits manually; Mnesia offers no out-of-the-box solution. The standard way of healing a Mnesia cluster is to restart Mnesia on the nodes with obsolete data, or to declare a master node. Both approaches lose the data on the affected nodes; see, for example, how Pow does it.

So, we needed a way to reconcile the data across the cluster. There are a few libraries out there that can help.

Unfortunately, issues prevented their direct use. The final solution in `Abbr.Mnesia.Sync` was largely adapted from those libraries. The gist of it:

- subscribe to the `:inconsistent_database` Mnesia event
- merge data between the affected nodes

Here we take advantage of the data itself: it is easily mergeable, as described in Working assumptions. Merging is facilitated by the undocumented `:mnesia_controller.connect_nodes/2`, which exposes a hook where a reconciliation function can do its work.
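The event handling could look roughly like this. This is a sketch under assumptions: `:mnesia_controller.connect_nodes/2` is undocumented and its callback contract below (a merge function you call with the affected tables) is inferred from the libraries mentioned above, and `merge_table_data/1` is a hypothetical helper standing in for the actual reconciliation.

```elixir
defmodule SyncSketch do
  use GenServer

  @impl true
  def init(state) do
    # Subscribe to Mnesia system events; :inconsistent_database fires
    # when Mnesia detects that a partitioned cluster has reconnected.
    {:ok, _node} = :mnesia.subscribe(:system)
    {:ok, state}
  end

  @impl true
  def handle_info({:mnesia_system_event, {:inconsistent_database, _context, node}}, state) do
    # Undocumented hook: reconnect the islands and run our reconciliation
    # while Mnesia holds back its normal table loading.
    :mnesia_controller.connect_nodes([node], fn merge_fun ->
      case merge_fun.([Url]) do
        {:merged, _old, _new} = result ->
          # Hypothetical helper: copy rows missing on either side,
          # relying on the data being easily mergeable.
          merge_table_data(node)
          result

        other ->
          other
      end
    end)

    {:noreply, state}
  end

  defp merge_table_data(_node), do: :ok
end
```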

Mnesia also lets you add table replicas on demand. You can create one via `Table.create_copy(Url, Node.self(), :ram_copies)`, which results in a local replica. This isn't strictly required, but should improve performance and availability. That said, replication was not the focus of this experiment, so it is not part of the Mnesia solution.
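For reference, the same thing can be done with the raw `:mnesia` API; `Table.create_copy/3` above appears to be the Memento wrapper around it. A minimal sketch, assuming the `Url` table already exists on some node in the cluster:

```elixir
# Add a RAM copy of the Url table on the current node.
# Returns {:atomic, :ok} on success, or {:aborted, {:already_exists, ...}}
# if this node already holds a copy.
case :mnesia.add_table_copy(Url, node(), :ram_copies) do
  {:atomic, :ok} -> :ok
  {:aborted, {:already_exists, _table, _node}} -> :ok
end
```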

Problems:

- it's a little gossipy: after a split heals, all nodes in all parts of the cluster try to sync, when a single node syncing from one side of the split would suffice

Here are some results. We have several scenarios:

- Save and lookup waiting on table always
  - due to network splits, there might be a moment where the table isn't available
  - this scenario waits on the table before every storage access
  - it also degrades performance a bit
- Save and lookup waiting on table only on retries
  - this is the current solution
  - it first tries to access the table
  - and only waits for it if not available
- Lookup dirty reading, save waiting on table only on retries
  - this introduces dirty reading
  - Mnesia allows so-called "dirty" access, without a transaction
  - but there was no noticeable performance gain
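The "wait only on retries" pattern and the dirty variant can be sketched like this. The function names and the 5-second timeout are assumptions, not the exact `Abbr.Mnesia.Local` code:

```elixir
# Try the transaction first; if the table isn't loaded yet (which Mnesia
# reports as a :no_exists abort), wait for it and retry once.
def lookup(short) do
  case :mnesia.transaction(fn -> :mnesia.read({Url, short}) end) do
    {:atomic, records} ->
      records

    {:aborted, {:no_exists, _table}} ->
      :ok = :mnesia.wait_for_tables([Url], 5_000)
      {:atomic, records} = :mnesia.transaction(fn -> :mnesia.read({Url, short}) end)
      records
  end
end

# Dirty variant: bypasses the transaction manager entirely.
def dirty_lookup(short), do: :mnesia.dirty_read({Url, short})
```

The always-waiting scenario simply calls `:mnesia.wait_for_tables/2` unconditionally before each access, which is what costs it throughput in the table below.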

Check out `Abbr.Mnesia.Local` for details, including code snippets for each of the options described.

| Scenario | Splits | Create errors | Execute errors | Req/s |
| --- | --- | --- | --- | --- |
| Save and lookup waiting on table always | No | 0 | 0 | 2149 |
| | Yes | 0 | 0 | 2366 |
| Save and lookup waiting on table only on retries | No | 0 | 0 | 2419 |
| | Yes | 0 | 0 | 2687 |
| Lookup dirty reading, save waiting on table only on retries | No | 0 | 0 | 2654 |
| | Yes | 0 | 3 | 2436 |

Resources: