Update multicore-magic doc for 2.1.0
Showing 42 changed files with 2,163 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
<!DOCTYPE html> | ||
<html xmlns="http://www.w3.org/1999/xhtml"> | ||
<head> | ||
<title>index</title> | ||
<link rel="stylesheet" href="./odoc.support/odoc.css"/> | ||
<meta charset="utf-8"/> | ||
<meta name="viewport" content="width=device-width,initial-scale=1.0"/> | ||
</head> | ||
<body> | ||
<main class="content"> | ||
<div class="by-name"> | ||
<h2>OCaml package documentation</h2> | ||
<ol> | ||
<li><a href="multicore-magic/index.html">multicore-magic</a></li> | ||
</ol> | ||
</div> | ||
</main> | ||
</body> | ||
</html> |
2 changes: 2 additions & 0 deletions
2.1.0/multicore-magic/Multicore_magic/Transparent_atomic/index.html
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!DOCTYPE html> | ||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Transparent_atomic (multicore-magic.Multicore_magic.Transparent_atomic)</title><meta charset="utf-8"/><link rel="stylesheet" href="../../../odoc.support/odoc.css"/><meta name="generator" content="odoc 2.4.0"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../../../odoc.support/highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> – <a href="../../index.html">multicore-magic</a> » <a href="../index.html">Multicore_magic</a> » Transparent_atomic</nav><header class="odoc-preamble"><h1>Module <code><span>Multicore_magic.Transparent_atomic</span></code></h1><p>A replacement for <code>Stdlib.Atomic</code> with fixes and performance improvements</p><p><code>Stdlib.Atomic.get</code> is incorrectly subject to the common subexpression elimination (CSE) optimization in OCaml 5.0.0 and 5.1.0. This can cause the compiler to generate code whose results cannot be explained by the OCaml memory model. It can also defeat a manual optimization that avoids writing to memory, because the compiler eliminates a (repeated) read access. This module implements <a href="#val-get"><code>get</code></a> such that the argument to <code>Stdlib.Atomic.get</code> is passed through <code>Sys.opaque_identity</code>, which prevents the compiler from applying the CSE optimization.</p><p>OCaml 5 generates inefficient accesses of <code>'a Stdlib.Atomic.t array</code>s assuming that the array might be an array of <code>float</code>ing point numbers. That is because the <code>Stdlib.Atomic.t</code> type constructor is opaque, which means that the compiler cannot assume that <code>_ Stdlib.Atomic.t</code> is not the same as <code>float</code>. 
This module defines <a href="#type-t" title="t">the type</a> as <code>private 'a ref</code>, which allows the compiler to know that it cannot be the same as <code>float</code>, which allows the compiler to generate more efficient array accesses. This can both improve performance and reduce size of generated code when using arrays of atomics.</p></header><div class="odoc-content"><div class="odoc-spec"><div class="spec type anchored" id="type-t"><a href="#type-t" class="anchor"></a><code><span><span class="keyword">type</span> <span>!'a t</span></span><span> = <span class="keyword">private</span> <span><span class="type-var">'a</span> <span class="xref-unresolved">Stdlib</span>.ref</span></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-make"><a href="#val-make" class="anchor"></a><code><span><span class="keyword">val</span> make : <span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <a href="#type-t">t</a></span></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-make_contended"><a href="#val-make_contended" class="anchor"></a><code><span><span class="keyword">val</span> make_contended : <span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <a href="#type-t">t</a></span></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-get"><a href="#val-get" class="anchor"></a><code><span><span class="keyword">val</span> get : <span><span><span class="type-var">'a</span> <a href="#type-t">t</a></span> <span class="arrow">-></span></span> <span class="type-var">'a</span></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fenceless_get"><a href="#val-fenceless_get" class="anchor"></a><code><span><span class="keyword">val</span> fenceless_get : <span><span><span class="type-var">'a</span> <a 
href="#type-t">t</a></span> <span class="arrow">-></span></span> <span class="type-var">'a</span></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-set"><a href="#val-set" class="anchor"></a><code><span><span class="keyword">val</span> set : <span><span><span class="type-var">'a</span> <a href="#type-t">t</a></span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fenceless_set"><a href="#val-fenceless_set" class="anchor"></a><code><span><span class="keyword">val</span> fenceless_set : <span><span><span class="type-var">'a</span> <a href="#type-t">t</a></span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-exchange"><a href="#val-exchange" class="anchor"></a><code><span><span class="keyword">val</span> exchange : <span><span><span class="type-var">'a</span> <a href="#type-t">t</a></span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span class="type-var">'a</span></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-compare_and_set"><a href="#val-compare_and_set" class="anchor"></a><code><span><span class="keyword">val</span> compare_and_set : <span><span><span class="type-var">'a</span> <a href="#type-t">t</a></span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> bool</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fetch_and_add"><a href="#val-fetch_and_add" class="anchor"></a><code><span><span class="keyword">val</span> 
fetch_and_add : <span><span>int <a href="#type-t">t</a></span> <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> int</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-incr"><a href="#val-incr" class="anchor"></a><code><span><span class="keyword">val</span> incr : <span><span>int <a href="#type-t">t</a></span> <span class="arrow">-></span></span> unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-decr"><a href="#val-decr" class="anchor"></a><code><span><span class="keyword">val</span> decr : <span><span>int <a href="#type-t">t</a></span> <span class="arrow">-></span></span> unit</span></code></div></div></div></body></html> |
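The `get` workaround described in the Transparent_atomic preamble can be illustrated with a minimal sketch that uses only `Stdlib`. The name `opaque_get` is hypothetical, and this is not the library's actual source; it only shows the idea of routing the argument through `Sys.opaque_identity` so that the compiler cannot treat two reads of the same atomic as one common subexpression.

```ocaml
(* Hypothetical wrapper, assuming only Stdlib (OCaml 4.12 or later for Atomic).
   [Sys.opaque_identity] behaves like the identity function at run time, but
   the optimizer may not assume anything about its result, so each call to
   [opaque_get] performs its own read. *)
let opaque_get (atomic : 'a Atomic.t) : 'a =
  Atomic.get (Sys.opaque_identity atomic)

let () =
  let counter = Atomic.make 41 in
  Atomic.incr counter;
  (* The wrapper returns whatever the atomic currently holds. *)
  Printf.printf "%d\n" (opaque_get counter)
```

The sketch does not cover the second point of the preamble, the `private 'a ref` representation that lets the compiler rule out `float` arrays.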
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
<!DOCTYPE html> | ||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Multicore_magic (multicore-magic.Multicore_magic)</title><meta charset="utf-8"/><link rel="stylesheet" href="../../odoc.support/odoc.css"/><meta name="generator" content="odoc 2.4.0"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../../odoc.support/highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> – <a href="../index.html">multicore-magic</a> » Multicore_magic</nav><header class="odoc-preamble"><h1>Module <code><span>Multicore_magic</span></code></h1><p>This is a library of magic multicore utilities intended for experts for extracting the best possible performance from multicore OCaml.</p><p>Hopefully future releases of multicore OCaml will make this library obsolete!</p></header><nav class="odoc-toc"><ul><li><a href="#helpers-for-using-padding-to-avoid-false-sharing">Helpers for using padding to avoid false sharing</a></li><li><a href="#missing-atomic-operations">Missing <code>Atomic</code> operations</a></li><li><a href="#fixes-and-workarounds">Fixes and workarounds</a></li><li><a href="#avoiding-contention">Avoiding contention</a></li></ul></nav><div class="odoc-content"><h2 id="helpers-for-using-padding-to-avoid-false-sharing"><a href="#helpers-for-using-padding-to-avoid-false-sharing" class="anchor"></a>Helpers for using padding to avoid false sharing</h2><div class="odoc-spec"><div class="spec value anchored" id="val-copy_as_padded"><a href="#val-copy_as_padded" class="anchor"></a><code><span><span class="keyword">val</span> copy_as_padded : <span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span class="type-var">'a</span></span></code></div><div class="spec-doc"><p>Depending on the object, either creates a shallow clone of it or returns it as is. 
When cloned, the clone will have extra padding words added after the last used word.</p><p>This is designed to help avoid <a href="https://en.wikipedia.org/wiki/False_sharing">false sharing</a>. False sharing has a negative impact on multicore performance. Accesses of both atomic and non-atomic locations, whether read-only or read-write, may suffer from false sharing.</p><p>The intended use case for this is to pad all long-lived objects that are accessed very frequently (read or written).</p><p>Many kinds of objects can be padded, for example:</p><pre class="language-ocaml"><code>let padded_atomic = Multicore_magic.copy_as_padded (Atomic.make 101) | ||
|
||
let padded_ref = Multicore_magic.copy_as_padded (ref 42) | ||
|
||
let padded_record = Multicore_magic.copy_as_padded { | ||
number = 76; | ||
pointer = 1 :: 2 :: 3 :: []; | ||
} | ||
|
||
let padded_variant = Multicore_magic.copy_as_padded (Some 1)</code></pre><p>Padding changes the length of an array. If you need to pad an array, use <a href="#val-make_padded_array"><code>make_padded_array</code></a>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-make_padded_array"><a href="#val-make_padded_array" class="anchor"></a><code><span><span class="keyword">val</span> make_padded_array : <span>int <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> array</span></span></code></div><div class="spec-doc"><p>Creates a padded array. The length of the returned array includes padding. Use <a href="#val-length_of_padded_array"><code>length_of_padded_array</code></a> to get the unpadded length.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-length_of_padded_array"><a href="#val-length_of_padded_array" class="anchor"></a><code><span><span class="keyword">val</span> length_of_padded_array : <span><span><span class="type-var">'a</span> array</span> <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p>Returns the length of an array created by <a href="#val-make_padded_array"><code>make_padded_array</code></a> without the padding.</p><p><b>WARNING</b>: This is not guaranteed to work with <a href="#val-copy_as_padded"><code>copy_as_padded</code></a>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-length_of_padded_array_minus_1"><a href="#val-length_of_padded_array_minus_1" class="anchor"></a><code><span><span class="keyword">val</span> length_of_padded_array_minus_1 : <span><span><span class="type-var">'a</span> array</span> <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p>Returns the length of an array created by <a href="#val-make_padded_array"><code>make_padded_array</code></a> without the padding minus 
1.</p><p><b>WARNING</b>: This is not guaranteed to work with <a href="#val-copy_as_padded"><code>copy_as_padded</code></a>.</p></div></div><h2 id="missing-atomic-operations"><a href="#missing-atomic-operations" class="anchor"></a>Missing <code>Atomic</code> operations</h2><div class="odoc-spec"><div class="spec value anchored" id="val-fenceless_get"><a href="#val-fenceless_get" class="anchor"></a><code><span><span class="keyword">val</span> fenceless_get : <span><span><span class="type-var">'a</span> <span class="xref-unresolved">Stdlib</span>.Atomic.t</span> <span class="arrow">-></span></span> <span class="type-var">'a</span></span></code></div><div class="spec-doc"><p>Get a value from the atomic without performing an acquire fence.</p><p>Consider the following prototypical example of a lock-free algorithm:</p><pre class="language-ocaml"><code>let rec prototypical_lock_free_algorithm () = | ||
let expected = Atomic.get atomic in | ||
let desired = (* computed from expected *) in | ||
if not (Atomic.compare_and_set atomic expected desired) then | ||
(* failure, maybe retry *) | ||
else | ||
(* success *)</code></pre><p>A potential performance problem with the above example is that it performs two acquire fences. Both the <code>Atomic.get</code> and the <code>Atomic.compare_and_set</code> perform an acquire fence. This may have a negative impact on performance.</p><p>Assuming the first fence is not necessary, we can rewrite the example using <a href="#val-fenceless_get"><code>fenceless_get</code></a> as follows:</p><pre class="language-ocaml"><code>let rec prototypical_lock_free_algorithm () = | ||
let expected = Multicore_magic.fenceless_get atomic in | ||
let desired = (* computed from expected *) in | ||
if not (Atomic.compare_and_set atomic expected desired) then | ||
(* failure, maybe retry *) | ||
else | ||
(* success *)</code></pre><p>Now only a single acquire fence is performed by <code>Atomic.compare_and_set</code> and performance may be improved.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fenceless_set"><a href="#val-fenceless_set" class="anchor"></a><code><span><span class="keyword">val</span> fenceless_set : <span><span><span class="type-var">'a</span> <span class="xref-unresolved">Stdlib</span>.Atomic.t</span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> unit</span></code></div><div class="spec-doc"><p>Set the value of an atomic without performing a full fence.</p><p>Consider the following example:</p><pre class="language-ocaml"><code>let new_atomic = Atomic.make dummy_value in | ||
(* prepare data_structure referring to new_atomic *) | ||
Atomic.set new_atomic data_structure; | ||
(* publish the data_structure: *) | ||
Atomic.exchange old_atomic data_structure</code></pre><p>A potential performance problem with the above example is that it performs two full fences. Both the <code>Atomic.set</code> used to initialize the data structure and the <code>Atomic.exchange</code> used to publish the data structure perform a full fence. The same would also apply in cases where <code>Atomic.compare_and_set</code> or <code>Atomic.set</code> would be used to publish the data structure. This may have a negative impact on performance.</p><p>Using <a href="#val-fenceless_set"><code>fenceless_set</code></a> we can rewrite the example as follows:</p><pre class="language-ocaml"><code>let new_atomic = Atomic.make dummy_value in | ||
(* prepare data_structure referring to new_atomic *) | ||
Multicore_magic.fenceless_set new_atomic data_structure; | ||
(* publish the data_structure: *) | ||
Atomic.exchange old_atomic data_structure</code></pre><p>Now only a single full fence is performed by <code>Atomic.exchange</code> and performance may be improved.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fence"><a href="#val-fence" class="anchor"></a><code><span><span class="keyword">val</span> fence : <span><span>int <span class="xref-unresolved">Stdlib</span>.Atomic.t</span> <span class="arrow">-></span></span> unit</span></code></div><div class="spec-doc"><p>Perform a full acquire-release fence on the given atomic.</p><p><code>fence atomic</code> is equivalent to <code>ignore (Atomic.fetch_and_add atomic 0)</code>.</p></div></div><h2 id="fixes-and-workarounds"><a href="#fixes-and-workarounds" class="anchor"></a>Fixes and workarounds</h2><div class="odoc-spec"><div class="spec module anchored" id="module-Transparent_atomic"><a href="#module-Transparent_atomic" class="anchor"></a><code><span><span class="keyword">module</span> <a href="Transparent_atomic/index.html">Transparent_atomic</a></span><span> : <span class="keyword">sig</span> ... <span class="keyword">end</span></span></code></div><div class="spec-doc"><p>A replacement for <code>Stdlib.Atomic</code> with fixes and performance improvements</p></div></div><h2 id="avoiding-contention"><a href="#avoiding-contention" class="anchor"></a>Avoiding contention</h2><div class="odoc-spec"><div class="spec value anchored" id="val-instantaneous_domain_index"><a href="#val-instantaneous_domain_index" class="anchor"></a><code><span><span class="keyword">val</span> instantaneous_domain_index : <span>unit <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p><code>instantaneous_domain_index ()</code> potentially (re)allocates and returns a non-negative integer "index" for the current domain. The indices are guaranteed to be unique among the domains that exist at a point in time. 
Each call of <code>instantaneous_domain_index ()</code> may return a different index.</p><p>The intention is that the returned value can be used as an index into a contention avoiding parallelism safe data structure. For example, a naïve scalable increment of one counter from an array of counters could be done as follows:</p><pre class="language-ocaml"><code>let incr counters = | ||
(* Assuming length of [counters] is a power of two and larger than | ||
the number of domains. *) | ||
let mask = Array.length counters - 1 in | ||
let index = instantaneous_domain_index () in | ||
Atomic.incr counters.(index land mask)</code></pre><p>The implementation ensures that the indices are allocated as densely as possible at any given moment. This should allow allocating as many counters as needed and essentially eliminate contention.</p><p>On OCaml 4 <code>instantaneous_domain_index ()</code> will always return <code>0</code>.</p></div></div></div></body></html> |
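The documented snippets for the lock-free retry loop and the counter array are schematic, so a runnable variant may help when reviewing them. The sketch below uses only `Stdlib.Atomic`. All helper names (`update`, `make_counters`, `incr_at`, `total`) are hypothetical and not part of `Multicore_magic`, and the `index` parameter stands in for the result of `instantaneous_domain_index ()`. Swapping `Atomic.get` for `Multicore_magic.fenceless_get` in `update` would give the fenceless variant that the documentation describes.

```ocaml
(* Stdlib-only sketch; helper names are hypothetical, not library API.
   Requires OCaml 4.12 or later for [Atomic]. *)

(* Retry loop: read the current value, compute the desired value, and try to
   publish it. [Atomic.compare_and_set] compares with physical equality, so
   this is shown with immediate values such as [int]. *)
let rec update atomic f =
  let expected = Atomic.get atomic in
  let desired = f expected in
  if Atomic.compare_and_set atomic expected desired then desired
  else update atomic f

(* Array of counters; [n] must be a power of two so that masking an index
   with [n - 1] always yields a valid slot. *)
let make_counters n =
  assert (n > 0 && n land (n - 1) = 0);
  Array.init n (fun _ -> Atomic.make 0)

(* [index] stands in for [Multicore_magic.instantaneous_domain_index ()]. *)
let incr_at counters index =
  let mask = Array.length counters - 1 in
  Atomic.incr counters.(index land mask)

(* Sum all slots to read the logical counter value. *)
let total counters =
  Array.fold_left (fun sum c -> sum + Atomic.get c) 0 counters
```

The counters in this sketch are not padded. Following the padding section of the documentation, each `Atomic.make 0` could be wrapped in `Multicore_magic.copy_as_padded` to reduce false sharing between neighbouring slots.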
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!DOCTYPE html> | ||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>index (multicore-magic.index)</title><meta charset="utf-8"/><link rel="stylesheet" href="../odoc.support/odoc.css"/><meta name="generator" content="odoc 2.4.0"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../odoc.support/highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> – multicore-magic</nav><header class="odoc-preamble"><h1 id="multicore-magic-index"><a href="#multicore-magic-index" class="anchor"></a>multicore-magic index</h1></header><nav class="odoc-toc"><ul><li><a href="#library-multicore-magic">Library multicore-magic</a></li></ul></nav><div class="odoc-content"><h2 id="library-multicore-magic"><a href="#library-multicore-magic" class="anchor"></a>Library multicore-magic</h2><p>The entry point of this library is the module: <a href="Multicore_magic/index.html"><code>Multicore_magic</code></a>.</p></div></body></html> |
Binary file not shown.