Cleaned up indirection section. (#222)
* Cleaned up indirection section.

* Cleaned up aggregation section.

* Final clean up of advanced queries.
janderland authored Nov 9, 2024
1 parent 818bce9 commit 6cda2b9
Showing 3 changed files with 146 additions and 127 deletions.
132 changes: 78 additions & 54 deletions docs/index.html
@@ -537,79 +537,103 @@ <h1 id="advanced-queries">Advanced Queries</h1>
operations, FQL is capable of performing indirection and aggregation
queries.</p>
<h2 id="indirection">Indirection</h2>
<p>Indirection queries are similar to SQL joins. They associate
different groups of key-values via some shared data element.</p>
<p>In Foundation DB, indexes are implemented by having one key-value
(the index) point at another key-value. This is also called
“indirection”.</p>
<blockquote>
<p>Indirection is not yet included in the grammar, nor is it
implemented. The design of this feature is somewhat finalized.</p>
<p>🚧 Indirection is still being implemented.</p>
</blockquote>
<p>Suppose we have a large list of people, one key-value for each
<p>Indirection queries are similar to SQL joins. They associate
different groups of key-values via some shared data element.</p>
<p>In Foundation DB, indexes are implemented using indirection.
Suppose we have a large list of people, one key-value for each
person.</p>
<pre class="language-fql query"><code>/people(&lt;id:uint&gt;,&lt;firstName:str&gt;,&lt;lastName:str&gt;,&lt;age:int&gt;)=nil</code></pre>
<p>If we wanted to read all records with the last name of “Johnson”,
we’d have to perform a linear search across the entire “people”
directory. To make this kind of search more efficient, we can store an
index of last names in a separate directory.</p>
<pre class="language-fql query"><code>/index/last_name(&lt;lastName:str&gt;,&lt;id:uint&gt;)=nil</code></pre>
<pre class="language-fql query"><code>/people(
&lt;int&gt;, % ID
&lt;str&gt;, % First Name
&lt;str&gt;, % Last Name
&lt;int&gt;, % Age
)=nil</code></pre>
<p>If we wanted to read all records containing the last name
“Johnson”, we’d have to perform a linear search across the entire
“people” directory. To make this kind of search more efficient, we can
store an index for last names in a separate directory.</p>
<pre class="language-fql query"><code>/index/last_name(
&lt;str&gt;, % Last Name
&lt;int&gt;, % ID
)=nil</code></pre>
<p>If we query the index, we can get the IDs of the records containing
the last name “Johnson”.</p>
<pre class="language-fql query"><code>/index/last_name(&quot;Johnson&quot;,&lt;int&gt;)</code></pre>
<pre class="language-fql result"><code>/index/last_name(&quot;Johnson&quot;,23)=nil
/index/last_name(&quot;Johnson&quot;,348)=nil
/index/last_name(&quot;Johnson&quot;,2003)=nil</code></pre>
<p>FQL can forward the observed values of named variables from one
query to the next, allowing us to efficiently query for all people
with the last name of “Johnson”.</p>
<pre class="language-fql query"><code>/index/last_name(&quot;Johnson&quot;,&lt;id:uint&gt;)
query to the next. We can use this to obtain our desired subset from
the “people” directory.</p>
<pre class="language-fql query"><code>/index/last_name(&quot;Johnson&quot;,&lt;id:int&gt;)
/people(:id,...)</code></pre>
<pre class="language-fql result"><code>/people(23,&quot;Lenny&quot;,&quot;Johnson&quot;,22,&quot;Mechanic&quot;)=nil
/people(348,&quot;Roger&quot;,&quot;Johnson&quot;,54,&quot;Engineer&quot;)=nil
/people(2003,&quot;Larry&quot;,&quot;Johnson&quot;,8,&quot;N/A&quot;)=nil</code></pre>
<p>The first query returned 3 key-values containing the IDs of 23,
348, &amp; 2003 which were then fed into the second query resulting in
3 individual <a href="#single-reads">single reads</a>.</p>
<pre class="language-fql query"><code>/index/last_name(&quot;Johnson&quot;,&lt;id:uint&gt;)</code></pre>
<pre class="language-fql result"><code>/index/last_name(&quot;Johnson&quot;,23)=nil
/index/last_name(&quot;Johnson&quot;,348)=nil
/index/last_name(&quot;Johnson&quot;,2003)=nil</code></pre>
<h2 id="aggregation">Aggregation</h2>
<blockquote>
<p>The design of aggregation queries is not complete. This section
describes the general idea. Exact syntax may change. This feature is
not currently included in the grammar nor has it been implemented.</p>
<p>🚧 Aggregation is still being implemented.</p>
</blockquote>
<p>Aggregation queries read multiple key-values and combine them into
a single output key-value.</p>
<p>Foundation DB performs best when key-values are kept small. When <a
href="https://apple.github.io/foundationdb/blob.html">storing large
blobs</a>, the data is usually split into 10 kB chunks stored in the
value. The respective key contain the byte offset of the chunk.</p>
blobs</a>, the blobs are usually split into 10 kB chunks and stored as
values. The respective keys contain the byte offset of the chunks.</p>
<pre class="language-fql query"><code>/blob(
&quot;my file&quot;, % The identifier of the blob.
&quot;audio.wav&quot;, % The identifier of the blob.
&lt;offset:int&gt;, % The byte offset within the blob.
)=&lt;chunk:bytes&gt; % A chunk of the blob.</code></pre>
<pre class="language-fql result"><code>/blob(&quot;my file&quot;,0)=10e3_bytes
/blob(&quot;my file&quot;,10000)=10e3_bytes
/blob(&quot;my file&quot;,20000)=2.7e3_bytes</code></pre>
<pre class="language-fql result"><code>/blob(&quot;audio.wav&quot;,0)=10000_bytes
/blob(&quot;audio.wav&quot;,10000)=10000_bytes
/blob(&quot;audio.wav&quot;,20000)=2730_bytes</code></pre>
<blockquote>
<p>Instead of printing the actual byte strings in these results, only
the byte lengths are printed. This is an option provided by the CLI to
lower result verbosity.</p>
<p>❓ In the above results, instead of printing the actual byte
strings, only the byte lengths are printed. This is an option provided
by the CLI to lower result verbosity.</p>
</blockquote>
<p>This gets the job done, but it would be nice if the client could
obtain the entire blob instead of having to append the chunks
themselves. This can be done using aggregation queries.</p>
<p>FQL provides a pseudo data type named <code>agg</code> which
performs the aggregation.</p>
<pre class="language-fql query"><code>/blob(&quot;my file&quot;,...)=&lt;blob:agg&gt;</code></pre>
<pre class="language-fql result"><code>/blob(&quot;my file&quot;,...)=22.7e3_bytes</code></pre>
<p>Aggregation queries always result in a single key-value. With
non-aggregation queries, variables &amp; the <code>...</code> token
are resolved as actual data elements in the query results. For
aggregation queries, only aggregation variables are resolved.</p>
<p>A similar pseudo data type for summing integers could be provided
as well.</p>
<pre class="language-fql query"><code>/deltas(&quot;group A&quot;,&lt;int&gt;)</code></pre>
<pre class="language-fql result"><code>/deltas(&quot;group A&quot;,20)=nil
/deltas(&quot;group A&quot;,-18)=nil
/deltas(&quot;group A&quot;,3)=nil</code></pre>
<pre class="language-fql query"><code>/deltas(&quot;group A&quot;,&lt;sum&gt;)</code></pre>
<pre class="language-fql result"><code>/deltas(&quot;group A&quot;,5)=&lt;&gt;</code></pre>
obtain the entire blob as a single byte string. This can be done using
aggregation queries.</p>
<p>FQL provides a pseudo type named <code>append</code> which
instructs the query to append all byte strings found at the variable’s
location.</p>
<pre class="language-fql query"><code>/blob(&quot;audio.wav&quot;,...)=&lt;append&gt;</code></pre>
<pre class="language-fql result"><code>/blob(&quot;my file&quot;,...)=22730_bytes</code></pre>
<p>Aggregation queries always result in a single key-value.
Non-aggregation queries resolve variables &amp; the <code>...</code>
token into actual data elements in the query results. Aggregation
queries only resolve aggregation variables.</p>
<p>You can see all the supported aggregation types below.</p>
<table>
<thead>
<tr>
<th style="text-align: left;">Pseudo Type</th>
<th style="text-align: left;">Accepted Inputs</th>
<th style="text-align: left;">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left;"><code>append</code></td>
<td style="text-align: left;"><code>bytes</code> <code>str</code></td>
<td style="text-align: left;">Append arrays</td>
</tr>
<tr>
<td style="text-align: left;"><code>sum</code></td>
<td style="text-align: left;"><code>int</code> <code>num</code></td>
<td style="text-align: left;">Add numbers</td>
</tr>
<tr>
<td style="text-align: left;"><code>count</code></td>
<td style="text-align: left;"><code>any</code></td>
<td style="text-align: left;">Count key-values</td>
</tr>
</tbody>
</table>
<h1 id="using-fql">Using FQL</h1>
<p>The FQL project provides an application for executing queries and
exploring the data, similar to <code>psql</code> for Postgres. This
133 changes: 62 additions & 71 deletions docs/index.md
@@ -608,133 +608,124 @@ aggregation queries.

## Indirection

> 🚧 Indirection is still being implemented.

Indirection queries are similar to SQL joins. They associate
different groups of key-values via some shared data element.

In Foundation DB, indexes are implemented by having one
key-value (the index) point at another key-value. This is
also called "indirection".

> Indirection is not yet included in the grammar, nor is it
> implemented. The design of this feature is somewhat
> finalized.

In Foundation DB, indexes are implemented using indirection.
Suppose we have a large list of people, one key-value for
each person.

```language-fql {.query}
/people(<id:uint>,<firstName:str>,<lastName:str>,<age:int>)=nil
/people(
<int>, % ID
<str>, % First Name
<str>, % Last Name
<int>, % Age
)=nil
```

If we wanted to read all records with the last name of
If we wanted to read all records containing the last name
"Johnson", we'd have to perform a linear search across the
entire "people" directory. To make this kind of search more
efficient, we can store an index of last names in a separate
directory.
efficient, we can store an index for last names in
a separate directory.

```language-fql {.query}
/index/last_name(<lastName:str>,<id:uint>)=nil
/index/last_name(
<str>, % Last Name
<int>, % ID
)=nil
```

FQL can forward the observed values of named variables from
one query to the next, allowing us to efficiently query for
all people with the last name of "Johnson".
If we query the index, we can get the IDs of the records
containing the last name "Johnson".

```language-fql {.query}
/index/last_name("Johnson",<id:uint>)
/people(:id,...)
/index/last_name("Johnson",<int>)
```

```language-fql {.result}
/people(23,"Lenny","Johnson",22,"Mechanic")=nil
/people(348,"Roger","Johnson",54,"Engineer")=nil
/people(2003,"Larry","Johnson",8,"N/A")=nil
/index/last_name("Johnson",23)=nil
/index/last_name("Johnson",348)=nil
/index/last_name("Johnson",2003)=nil
```

The first query returned 3 key-values containing the IDs of
23, 348, & 2003 which were then fed into the second query
resulting in 3 individual [single reads](#single-reads).
FQL can forward the observed values of named variables from
one query to the next. We can use this to obtain our desired
subset from the "people" directory.

```language-fql {.query}
/index/last_name("Johnson",<id:uint>)
/index/last_name("Johnson",<id:int>)
/people(:id,...)
```

```language-fql {.result}
/index/last_name("Johnson",23)=nil
/index/last_name("Johnson",348)=nil
/index/last_name("Johnson",2003)=nil
/people(23,"Lenny","Johnson",22,"Mechanic")=nil
/people(348,"Roger","Johnson",54,"Engineer")=nil
/people(2003,"Larry","Johnson",8,"N/A")=nil
```
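
The two-step lookup above can be sketched in plain Python. This only
illustrates the access pattern and is not how FQL implements
indirection, which is still in progress. The `store` dict and the
`range_read` and `people_by_last_name` helpers are hypothetical names
standing in for an ordered key-value store with tuple keys.

```python
# Hypothetical in-memory stand-in for an ordered key-value store.
# Keys are (directory, tuple) pairs; every value is None, mirroring `=nil`.
store = {
    ("/people", (23, "Lenny", "Johnson", 22, "Mechanic")): None,
    ("/people", (77, "Ada", "Smith", 41, "Pilot")): None,
    ("/people", (348, "Roger", "Johnson", 54, "Engineer")): None,
    ("/people", (2003, "Larry", "Johnson", 8, "N/A")): None,
    ("/index/last_name", ("Johnson", 23)): None,
    ("/index/last_name", ("Smith", 77)): None,
    ("/index/last_name", ("Johnson", 348)): None,
    ("/index/last_name", ("Johnson", 2003)): None,
}


def range_read(directory, prefix):
    """Return the tuples in `directory` that start with `prefix`, in key order.

    A real store would seek directly to the prefix; this sketch scans the dict.
    """
    return sorted(
        tup
        for (dir_name, tup) in store
        if dir_name == directory and tup[: len(prefix)] == prefix
    )


def people_by_last_name(last_name):
    """Read the index first, then forward each observed ID to /people."""
    results = []
    for (_, person_id) in range_read("/index/last_name", (last_name,)):
        # Similar in spirit to `/people(:id,...)`: a read keyed by the ID.
        results.extend(range_read("/people", (person_id,)))
    return results


johnsons = people_by_last_name("Johnson")
```

Here `johnsons` holds the three "Johnson" tuples, and the record for
"Smith" is never touched by the second step.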

## Aggregation

> The design of aggregation queries is not complete. This
> section describes the general idea. Exact syntax may
> change. This feature is not currently included in the
> grammar nor has it been implemented.
> 🚧 Aggregation is still being implemented.

Aggregation queries read multiple key-values and combine
them into a single output key-value.

Foundation DB performs best when key-values are kept small.
When [storing large
blobs](https://apple.github.io/foundationdb/blob.html), the
data is usually split into 10 kB chunks stored in the value.
The respective key contain the byte offset of the chunk.
blobs are usually split into 10 kB chunks and stored as
values. The respective keys contain the byte offset of the
chunks.

```language-fql {.query}
/blob(
"my file", % The identifier of the blob.
"audio.wav", % The identifier of the blob.
<offset:int>, % The byte offset within the blob.
)=<chunk:bytes> % A chunk of the blob.
```

```language-fql {.result}
/blob("my file",0)=10e3_bytes
/blob("my file",10000)=10e3_bytes
/blob("my file",20000)=2.7e3_bytes
/blob("audio.wav",0)=10000_bytes
/blob("audio.wav",10000)=10000_bytes
/blob("audio.wav",20000)=2730_bytes
```
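
As a rough illustration of the layout described above, the following
Python sketch splits a byte string into 10 kB chunks keyed by byte
offset. The `split_blob` function and the `CHUNK_SIZE` constant are
hypothetical names, and the 22,730-byte input is chosen only to match
the example results.

```python
CHUNK_SIZE = 10_000  # 10 kB, the chunk size mentioned above


def split_blob(blob_id, data, chunk_size=CHUNK_SIZE):
    """Map (blob_id, byte_offset) keys to chunks of at most `chunk_size` bytes."""
    return {
        (blob_id, offset): data[offset : offset + chunk_size]
        for offset in range(0, len(data), chunk_size)
    }


# A zero-filled placeholder blob of 22,730 bytes.
chunks = split_blob("audio.wav", bytes(22_730))
chunk_lengths = {key: len(chunk) for key, chunk in chunks.items()}
```

The keys of `chunks` correspond to the `/blob("audio.wav",<offset:int>)`
tuples, and `chunk_lengths` mirrors the byte lengths printed by the CLI.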

> Instead of printing the actual byte strings in these
> results, only the byte lengths are printed. This is an
> option provided by the CLI to lower result verbosity.
> ❓ In the above results, instead of printing the actual
> byte strings, only the byte lengths are printed. This is
> an option provided by the CLI to lower result verbosity.

This gets the job done, but it would be nice if the client
could obtain the entire blob instead of having to append the
chunks themselves. This can be done using aggregation
queries.
could obtain the entire blob as a single byte string. This
can be done using aggregation queries.

FQL provides a pseudo data type named `agg` which performs
the aggregation.
FQL provides a pseudo type named `append` which instructs
the query to append all byte strings found at the variable's
location.

```language-fql {.query}
/blob("my file",...)=<blob:agg>
/blob("audio.wav",...)=<append>
```

```language-fql {.result}
/blob("my file",...)=22.7e3_bytes
/blob("my file",...)=22730_bytes
```

Aggregation queries always result in a single key-value.
With non-aggregation queries, variables & the `...` token
are resolved as actual data elements in the query results.
For aggregation queries, only aggregation variables are
resolved.

A similar pseudo data type for summing integers could be
provided as well.

```language-fql {.query}
/deltas("group A",<int>)
```

```language-fql {.result}
/deltas("group A",20)=nil
/deltas("group A",-18)=nil
/deltas("group A",3)=nil
```
Non-aggregation queries resolve variables & the `...` token
into actual data elements in the query results. Aggregation
queries only resolve aggregation variables.

```language-fql {.query}
/deltas("group A",<sum>)
```
You can see all the supported aggregation types below.

```language-fql {.result}
/deltas("group A",5)=<>
```
| Pseudo Type | Accepted Inputs | Description |
|:------------|:----------------|:-----------------|
| `append` | `bytes` `str` | Append arrays |
| `sum` | `int` `num` | Add numbers |
| `count` | `any` | Count key-values |
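
The three pseudo types in the table can be approximated in Python to
show what each one reduces over. This is a sketch only: aggregation is
still being implemented, and the `aggregate` function and its
arguments are hypothetical, not part of FQL or its CLI.

```python
def aggregate(pseudo_type, values):
    """Combine the values matched at a variable's location into a single value."""
    values = list(values)
    if pseudo_type == "append":
        # Accepts `bytes` or `str`; joins the pieces in the order they were read.
        if all(isinstance(v, bytes) for v in values):
            return b"".join(values)
        if all(isinstance(v, str) for v in values):
            return "".join(values)
        raise TypeError("append expects all bytes or all str")
    if pseudo_type == "sum":
        # Accepts `int` or `num`; adds the numbers.
        return sum(values)
    if pseudo_type == "count":
        # Accepts any type; counts the matched key-values.
        return len(values)
    raise ValueError(f"unknown pseudo type: {pseudo_type}")


# Chunk values in key order (offsets 0, 10000, 20000).
blob = aggregate("append", [bytes(10_000), bytes(10_000), bytes(2_730)])
total = aggregate("sum", [20, -18, 3])
count = aggregate("count", [None, None, None])
```

In each case many inputs collapse into one output, which is why an
aggregation query always results in a single key-value.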

# Using FQL

8 changes: 6 additions & 2 deletions docs/js/fql.js
@@ -81,6 +81,7 @@
'false',
'clear',
'nil',
'any',
'int',
'uint',
'bool',
@@ -90,8 +91,9 @@
'bytes',
'uuid',
'tup',
'agg',
'append',
'sum',
'count',
].join(' '),
};

@@ -102,6 +104,7 @@
keywords: {
$$pattern: /[^:|]+/,
keyword: [
'any',
'int',
'uint',
'bool',
@@ -111,8 +114,9 @@
'bytes',
'uuid',
'tup',
'agg',
'append',
'sum',
'count',
],
},
};