Fixes typos, formatting, and other doc related issues #106

Open
wants to merge 1 commit into
base: main
2 changes: 1 addition & 1 deletion README.md
@@ -12,7 +12,7 @@ Want to contribute to the docs? See [CONTRIBUTING](CONTRIBUTING.md) for details

## Join the Community

For questions or support, join us on the [ReadySet Community Discord](https://discord.gg/readyset), post questions on our [Github forum](https://github.com/readysettech/readyset/discussions), or schedule an [office hours chat](https://calendly.com/d/d5n-y44-mbg/office-hours-with-ready-set) with our team.
For questions or support, join us on the [ReadySet Community Discord](https://discord.gg/readyset), post questions on our [GitHub forum](https://github.com/readysettech/readyset/discussions), or schedule an [office hours chat](https://calendly.com/d/d5n-y44-mbg/office-hours-with-ready-set) with our team.

Everyone is welcome!

3 changes: 3 additions & 0 deletions docs/concepts/dataflow.md
@@ -1,10 +1,12 @@
# ReadySet Concepts

The heart of ReadySet is a query engine based on **partially-stateful, streaming dataflow**.

What's that? Let's break it down. First, we'll take a look at the basics of **stateful, streaming dataflow**, then
in a later section we'll consider how to improve memory overhead using **partial state**.

## Streaming dataflow

The basic premise of [streaming dataflow](https://en.wikipedia.org/wiki/Stream_processing) is that a **series
of operations** is applied to each element of a **stream** (a given sequence of data).
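
As a rough illustration of that premise (a minimal sketch, not ReadySet's implementation), a chain of Python generators applies a series of operations to each element of a stream. The operator names `only_even` and `square` are hypothetical and chosen only for this example.

```python
from typing import Iterable, Iterator


def only_even(stream: Iterable[int]) -> Iterator[int]:
    """Filter operator: forward only the even elements of the stream."""
    for x in stream:
        if x % 2 == 0:
            yield x


def square(stream: Iterable[int]) -> Iterator[int]:
    """Map operator: square each element as it flows through."""
    for x in stream:
        yield x * x


def pipeline(stream: Iterable[int]) -> Iterator[int]:
    """Compose the operators; each element passes through both in turn."""
    return square(only_even(stream))


result = list(pipeline(range(6)))
print(result)
```

Because generators are lazy, each element is pulled through the whole chain one at a time rather than being processed in batches.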

@@ -33,6 +35,7 @@ cache the final query results, and all non-leaf nodes effectively cache intermed
![High Level](../assets/high-level-graph.png)

## Putting it all together

As writes are applied to your database, the resulting data changes are immediately replicated to ReadySet. ReadySet incrementally
updates its cached query results to reflect these changes, thus replacing any hand-written cache eviction logic. When using ReadySet,
you just write traditional SQL queries, and ReadySet will keep the results up-to-date for you.
5 changes: 2 additions & 3 deletions docs/concepts/efficiency.md
@@ -1,5 +1,5 @@

# Memory Efficiency

In the first section, we discussed the **stateful, streaming dataflow** model and how to use it to
maintain cached state in real time. In this model, both reader and internal nodes of the graph store result sets.
Without care, this could lead to an impractical memory footprint.
@@ -14,7 +14,7 @@ What does this look like in practice? Let's come back to the query in the prior
SELECT id, author, title, url, vcount
FROM stories
JOIN (SELECT story_id, COUNT(*) AS vcount
FROM votes GROUP BY story_id)
FROM votes GROUP BY story_id)
AS VoteCount
ON VoteCount.story_id = stories.id WHERE stories.id = ?;
```
@@ -30,7 +30,6 @@ After this initial computation, ReadySet will keep those results up-to-date base
For example, after the info for story `42` has been cached, if any users upvote that story, then ReadySet will
increment the cached vote count for story `42` by `1` to reflect this data change.
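
The behavior described here can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not ReadySet's actual code: the `VoteCountCache` class, its methods, and the in-memory `votes` list standing in for the upstream table are all hypothetical.

```python
class VoteCountCache:
    """Hypothetical on-demand cache of per-story vote counts."""

    def __init__(self, votes):
        # Stand-in for the upstream `votes` table: (user, story_id) rows.
        self.votes = list(votes)
        # story_id -> vcount; starts empty and is filled on first read.
        self.cached = {}

    def read(self, story_id):
        if story_id not in self.cached:
            # Cache miss: compute the count from scratch once.
            self.cached[story_id] = sum(
                1 for _, s in self.votes if s == story_id
            )
        return self.cached[story_id]

    def on_vote(self, user, story_id):
        # A write reaches the base table first...
        self.votes.append((user, story_id))
        # ...then only already-cached results are updated incrementally.
        if story_id in self.cached:
            self.cached[story_id] += 1


cache = VoteCountCache([(1, 42), (2, 42), (3, 7)])
first = cache.read(42)    # miss: computed from the base rows
cache.on_vote(4, 42)      # cached entry for story 42 is incremented
second = cache.read(42)   # hit: no recomputation needed
cache.on_vote(5, 7)       # story 7 was never read, so nothing is cached
```

Note that the vote for story `7` touches no cached state; its count is only computed if and when someone reads it.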


When ReadySet is initially deployed, the cache starts off cold and the dataflow graph is entirely empty. During the initial cache warming phase,
most queries will be ones that ReadySet has never seen before (i.e., cache misses) so ReadySet will have to compute their
results from scratch.
6 changes: 4 additions & 2 deletions docs/concepts/example.md
@@ -1,7 +1,9 @@
# Example: News Forum

To illustrate these concepts, we will walk through an example of using ReadySet for a news forum application inspired by HackerNews.

## Schema

First we define two tables to keep track of HackerNews stories and votes.

```sql
@@ -15,13 +17,13 @@ CREATE TABLE votes (user int, story_id int);
## Query

Next, we'll write a query that computes the vote count for each story and joins the
vote counts with other story metadata such as the author, title, and ID.
vote counts with other story metadata such as the author, title, and ID.

```sql
SELECT id, author, title, url, vcount
FROM stories
JOIN (SELECT story_id, COUNT(*) AS vcount
FROM votes GROUP BY story_id)
FROM votes GROUP BY story_id)
AS VoteCount
ON VoteCount.story_id = stories.id WHERE stories.id = ?;
```
14 changes: 10 additions & 4 deletions docs/concepts/overview.md
@@ -1,4 +1,5 @@
# ReadySet Key Concepts

ReadySet is a lightweight caching solution that turns even the most complex SQL reads into **lightning fast lookups** with **no extra code**.

ReadySet slots between your application and database. It is wire-compatible with both MySQL and Postgres, so all you have to
@@ -8,11 +9,12 @@ queries are cached. Queries that aren't cached are proxied through ReadySet.
![Basic ReadySet Stack Diagram](../assets/rs_stack_diagram.png)

## How does ReadySet work under the hood?

Imagine a basic online forum application with `posts`, `users`, and `upvotes`. A simple database schema for this application might look like:

![Example DB Schema](../assets/reddit_sql_schema.png)

You can imagine a query like the one below, which returns all of the posts authored by a particular user:
You can imagine a query like the one below, which returns all the posts authored by a particular user:

```sql
SELECT
@@ -31,11 +33,12 @@ The graph for the query would look something like this:

![Example ReadySet Dataflow Graph](../assets/rs_example_dataflow.png)

Once the graph is constructed, if a user queries all of the posts authored by user id 4, ReadySet has the results ready so reads can be performed with **no additional compute**.
Once the graph is constructed, if a user queries all the posts authored by user id 4, ReadySet has the results ready, so reads can be performed with **no additional compute**.
Results are therefore returned instantaneously, regardless of the size of your database.


## How does ReadySet handle more complex queries?

One of the biggest advantages of this model is that latencies are not affected by query complexity. Let's take a look at a few more queries in this application:

Here's a point query for an article:
@@ -68,8 +71,8 @@ With ReadySet, read performance is not impacted by the size of the base tables o


### Memory Overhead
There's no free lunch– ReadySet trades off the cost of maintaining the dataflow graph in memory for excellent read performance. However, there are a few key ways we can mitigate this cost, such as
through **partial materialization**. You can think of partial materialization as a demand-driven cache-filling mechanism. With it, only a subset of the query results are stored in memory

There's no free lunch: ReadySet trades off the cost of maintaining the dataflow graph in memory for excellent read performance. However, there are a few key ways we can mitigate this cost, such as through **partial materialization**. You can think of partial materialization as a demand-driven cache-filling mechanism. With it, only a subset of the query results are stored in memory
based on common input parameters to the query. For example, if a query is parameterized on user IDs, then ReadySet would only cache the results of
that query for the active subset of users, since they are the ones issuing requests. Once ReadySet surpasses a developer-specified memory limit,
cache entries are evicted from memory based on a specified eviction strategy (e.g., LRU).
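
A minimal Python sketch of this demand-driven filling with LRU eviction follows. It makes two simplifying assumptions: the memory limit is modeled as a maximum number of entries, and the `PartialCache` class and its `compute` callback are hypothetical names, not part of ReadySet.

```python
from collections import OrderedDict


class PartialCache:
    """Hypothetical cache that stores results only for requested keys."""

    def __init__(self, compute, max_entries):
        self.compute = compute          # computes a result on a cache miss
        self.max_entries = max_entries  # stand-in for a memory limit
        self.entries = OrderedDict()    # least recently used key comes first

    def get(self, key):
        if key in self.entries:
            # Hit: mark the key as most recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Miss: compute on demand and store the result.
        value = self.compute(key)
        self.entries[key] = value
        if len(self.entries) > self.max_entries:
            # Over the limit: evict the least recently used entry.
            self.entries.popitem(last=False)
        return value


posts_cache = PartialCache(lambda user_id: f"posts-for-{user_id}", max_entries=2)
posts_cache.get(1)
posts_cache.get(2)
posts_cache.get(1)   # user 1 becomes the most recently used key
posts_cache.get(3)   # exceeds the limit, so user 2 is evicted
remaining = list(posts_cache.entries)
```

Only users who actually issue requests ever occupy space, and inactive users fall out of the cache first.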
@@ -80,16 +83,19 @@ won’t take up any memory real estate in your ReadySet cluster.


### No Strong Consistency

ReadySet supports **eventual consistency**. There will be a small delay between when the write is issued and the cached result is updated in ReadySet to reflect that write.


![alt_text](../assets/rs_write_diagram.png)


## Is ReadySet a good fit for my application?

Like most caching solutions, ReadySet will have the greatest impact on read-heavy applications with non-uniform access patterns.
Most web applications have a high read-to-write traffic ratio, and therefore fit the bill. ReadySet generally provides immediate
performance improvements in these contexts.

## Can I try it?

Yes! ReadySet is source-available under the BSL 1.1 license. Check out our [GitHub](https://github.com/readysettech/readyset) for more info.
2 changes: 1 addition & 1 deletion docs/guides/cache/cache-queries.md
@@ -43,7 +43,7 @@ CREATE CACHE [ALWAYS] [<name>] FROM <query>;

- `<name>` is optional. If a cache is not named, ReadySet automatically assigns an identifier.
- `<query>` is the full text of the query or the unique identifier assigned to the query by ReadySet, as seen in output of `SHOW PROXIED QUERIES`.
- `ALWAYS` is optional. If the `CREATE CACHE` command is executed inside a transaction (e.g., due to an ORM), use `ALWAYS` to run the command against ReadySet; otherwise, the command will be proxied to the upstream database with the rest of the transaction.
- `ALWAYS` is optional. If the `CREATE CACHE` command is executed inside a transaction (e.g., due to an ORM), use `ALWAYS` to run the command against ReadySet; otherwise, the command will be proxied to the upstream database with the rest of the transaction.

## View cached queries

4 changes: 2 additions & 2 deletions docs/guides/cache/profile-queries.md
@@ -16,14 +16,14 @@ If you already have performance monitoring in place, use that tooling to identif
SELECT calls, query FROM pg_stat_statements LIMIT 1;
```

If an error is returned, enable pg_stat_statments with the following command
If an error is returned, enable `pg_stat_statements` with the following command:

```sh
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```
!!! warning

In some environments, the pg_stat_statements extension may not be available. In that case, run `ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';` and restart your Postgres instance before re-running the `CREATE EXTENSION` command.
In some environments, the `pg_stat_statements` extension may not be available. In that case, run `ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';` and restart your Postgres instance before re-running the `CREATE EXTENSION` command.

=== "Using ReadySet metrics"

3 changes: 1 addition & 2 deletions docs/guides/connect/new-app/python.md
@@ -337,5 +337,4 @@ This page gives you examples for a few common Postgres drivers and ORMS for Pyth

- [Learn how ReadySet works under the hood](/concepts/overview.md)

- [Deploy with the ReadySet binary](/deploy/deploy-readyset-binary.md)

- [Deploy with the ReadySet binary](/deploy/deploy-readyset-binary.md)
3 changes: 1 addition & 2 deletions docs/guides/connect/new-app/ruby.md
@@ -299,5 +299,4 @@ This page gives you examples for a few common Postgres drivers and ORMS for Ruby

- [Learn how ReadySet works under the hood](/concepts/overview.md)

- [Deploy with the ReadySet binary](/deploy/deploy-readyset-binary.md)

- [Deploy with the ReadySet binary](/deploy/deploy-readyset-binary.md)
6 changes: 3 additions & 3 deletions docs/guides/deploy/production-notes.md
@@ -79,7 +79,7 @@ The upstream database must be configured to allow ReadySet to connect to the dat

ReadySet uses Postgres' logical replication feature to keep the cache up-to-date as the underlying database changes.

- ReadySet must be connected to the primary database instance. ReadySet cannot work off an RDS [read replica](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html).
- ReadySet must be connected to the primary database instance. ReadySet cannot work off an RDS [read replica](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html).

- ReadySet does not support [row-level security](https://www.postgresql.org/docs/current/ddl-rowsecurity.html). Make sure any RLS policies are disabled.

@@ -93,11 +93,11 @@ The upstream database must be configured to allow ReadySet to connect to the dat

- The [binary logging format](https://dev.mysql.com/doc/refman/5.7/en/binary-log-setting.html) must be set to `ROW`.

- ReadySet must be connected to the primary database instance. ReadySet cannot work off a [read replica](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html).
- ReadySet must be connected to the primary database instance. ReadySet cannot work off a [read replica](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html).

### Supabase

- In Supabase, [replication](https://www.postgresql.org/docs/current/logical-replication.html) is already enabled. However, you must change the `postgres` user's permissions to `SUPERUSER` so that ReadySet can create a replication slot.
- In Supabase, [replication](https://www.postgresql.org/docs/current/logical-replication.html) is already enabled. However, you must change the `postgres` user's permissions to `SUPERUSER` so that ReadySet can create a replication slot.

- ReadySet does not support [row-level security](https://www.postgresql.org/docs/current/ddl-rowsecurity.html). Make sure any RLS policies are disabled.

4 changes: 2 additions & 2 deletions docs/guides/intro/intro.md
@@ -47,7 +47,7 @@ To run through this process on a server, see [Deploy with binary](../deploy/depl

## How do you connect to ReadySet?

Once you have a ReadySet instance up and running, the next step is to connect your application by swapping out your database connection string to point to ReadySet instead. The specifics of how to do this vary by database client library, ORM, and programming language. See [Connect an App](../connect/existing-app.md) for examples.
Once you have a ReadySet instance up and running, the next step is to connect your application by swapping out your database connection string to point to ReadySet instead. The specifics of how to do this vary by database client library, ORM, and programming language. See [Connect an App](../connect/existing-app.md) for examples.

## When can you start caching queries?

@@ -69,4 +69,4 @@ To view a list of queries that are cached in ReadySet, connect a database SQL sh

## How do you stop caching a query?

To stop caching a query in ReadySet, connect a database SQL shell and run the the custom [`DROP CACHE`](../cache/cache-queries.md#remove-cached-queries) SQL command.
To stop caching a query in ReadySet, connect a database SQL shell and run the custom [`DROP CACHE`](../cache/cache-queries.md#remove-cached-queries) SQL command.
10 changes: 5 additions & 5 deletions docs/guides/intro/quickstart.md
@@ -34,7 +34,7 @@ In this step, you'll use Docker Compose to start Postgres, load some sample data
Compose then does the following:

- Starts Postgres in a container called `db` and imports two tables from the [IMDb dataset](https://www.imdb.com/interfaces/).
- Starts ReadySet in a container called `cache`. For details about the CLI options used to start ReadySet, see the [CLI reference docs](../../reference/cli/readyset.md).
- Starts ReadySet in a container called `cache`. For details about the CLI options used to start ReadySet, see the [CLI reference docs](../../reference/cli/readyset.md).
- Creates a container called `app` for running a sample Python app against ReadySet.

## Step 2. Check snapshotting
@@ -101,7 +101,7 @@ Snapshotting can take between a few minutes to several hours, depending on the s

## Step 3. Cache queries

With snapshotting finished, ReadySet is ready for caching, so in this step, you'll get to know the dataset, run some queries, check if ReadySet supports them, and then cache them.
With snapshotting finished, ReadySet is ready for caching, so in this step, you'll get to know the dataset, run some queries, check if ReadySet supports them, and then cache them.

1. If necessary, reconnect the `psql` shell to ReadySet:

@@ -146,7 +146,7 @@ With snapshotting finished, ReadySet is ready for caching, so in this step, you'
tconst | averagerating | numvotes
-----------+---------------+----------
tt0093779 | 8.0 | 427192
(1 row)
(1 row)
```

1. Run a query that joins results from `title_ratings` and `title_basics` to count how many titles released in 2000 have an average rating higher than 5:
@@ -196,8 +196,8 @@ With snapshotting finished, ReadySet is ready for caching, so in this step, you'
WHERE title_basics.startyear = 2000 AND title_ratings.averagerating > 5;
```

!!! tip
!!! tip

To cache a query, you can provide either the full `SELECT` (as shown here) or the query ID listed in the `SHOW PROXIED QUERIES` output.

!!! note
8 changes: 4 additions & 4 deletions docs/reference/sql-support.md
@@ -199,7 +199,7 @@ ReadySet supports the UTF-8 character set for strings and compares strings case-

### Writes

All `INSERT`, `UPDATE`, and `DELETE` statements sent to ReadySet are proxied to the upstream database. ReadySet receives new/changed data via the database's replication stream and updates its snapshot and cache automatically.
All `INSERT`, `UPDATE`, and `DELETE` statements sent to ReadySet are proxied to the upstream database. ReadySet receives new/changed data via the database's replication stream and updates its snapshot and cache automatically.

### Schema changes

@@ -304,7 +304,7 @@ But the following queries are not supported:
SELECT * FROM t1
JOIN t2 ON t1.x = t1.y;
```
``` sql
``` sql
-- This query doesn't compare using equality
SELECT * FROM t1
JOIN t2 ON t1.x > t2.x;
@@ -403,13 +403,13 @@ ReadySet supports the following components of the SQL expression language:
- `JSONB_SET_LAX()`
- `JSONB_STRIP_NULLS()`
- `JSONB_TYPEOF()`
- `LEAST()`
- `LEAST()`
- `MONTH()`
- `ROUND()`
- `SPLIT_PART()`
- `SUBSTR()` and `SUBSTRING()`
- `TIMEDIFF()`
- Aggregate functions (see [Aggregations](#aggregations))
- Aggregate functions (see [Aggregations](#aggregations))

ReadySet does not support the following components of the SQL expression language (this is not an exhaustive list):
