move root chains into sources #1342

Merged 3 commits on Sep 21, 2023
40 changes: 40 additions & 0 deletions changelog.md
@@ -3,6 +3,46 @@
Any breaking changes to the `topology.yaml` or `shotover` rust API should be documented here.
This assists us in knowing when to make the next release a breaking release and assists users with making upgrades to new breaking releases.

## 0.2.0

### topology.yaml

The root level of the `topology.yaml` has been completely overhauled.
We have not observed `source_to_chain_mapping` ever being used, so to simplify the `topology.yaml` format, root-level chains have been inlined into their sources.

For example, a topology.yaml that looked like this:

```yaml
---
sources:
  redis_prod:
    Redis:
      listen_addr: "127.0.0.1:6379"
chain_config:
  redis_chain:
    - RedisSinkSingle:
        remote_address: "127.0.0.1:1111"
        connect_timeout_ms: 3000
source_to_chain_mapping:
  redis_prod: redis_chain
```

Should now be rewritten like this:

```yaml
---
sources:
  - Redis:
      name: "redis_prod"
      listen_addr: "127.0.0.1:6379"
      chain:
        - RedisSinkSingle:
            remote_address: "127.0.0.1:1111"
            connect_timeout_ms: 3000
```

### shotover rust api

## 0.1.11

### topology.yaml
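Note on shared chains: the changelog example covers one source mapped to one chain. As a sketch only, and assuming a topology previously mapped two sources to the same chain, the inlined format would presumably need that chain repeated under each source. The source names `redis_a` and `redis_b` and the second port are hypothetical; the field names are the ones shown in the changelog entry above.

```yaml
---
sources:
  # Hypothetical first source; previously mapped to a shared chain.
  - Redis:
      name: "redis_a"
      listen_addr: "127.0.0.1:6379"
      chain:
        - RedisSinkSingle:
            remote_address: "127.0.0.1:1111"
            connect_timeout_ms: 3000
  # Hypothetical second source; the same chain is written out again here.
  - Redis:
      name: "redis_b"
      listen_addr: "127.0.0.1:6380"
      chain:
        - RedisSinkSingle:
            remote_address: "127.0.0.1:1111"
            connect_timeout_ms: 3000
```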
19 changes: 8 additions & 11 deletions custom-transforms-example/config/topology.yaml
@@ -1,14 +1,11 @@
 ---
 sources:
-  redis_prod:
-    Redis:
+  - Redis:
+      name: "redis"
       listen_addr: "127.0.0.1:6379"
-chain_config:
-  redis_chain:
-    - RedisGetRewrite:
-        result: "Rewritten value"
-    - RedisSinkSingle:
-        remote_address: "127.0.0.1:1111"
-        connect_timeout_ms: 3000
-source_to_chain_mapping:
-  redis_prod: redis_chain
+      chain:
+        - RedisGetRewrite:
+            result: "Rewritten value"
+        - RedisSinkSingle:
+            remote_address: "127.0.0.1:1111"
+            connect_timeout_ms: 3000
42 changes: 16 additions & 26 deletions docs/src/examples/cassandra-cluster-shotover-sidecar.md
@@ -55,18 +55,14 @@ First we will create our `topology.yaml` file to have a single Cassandra source.
 ```yaml
 ---
 sources:
-  cassandra_prod:
-    Cassandra:
+  - Cassandra:
       listen_addr: "0.0.0.0:9043"
-chain_config:
-  main_chain:
-    - CassandraPeersRewrite:
-        port: 9043
-    - CassandraSinkSingle:
-        remote_address: "127.0.0.1:9042"
-        connect_timeout_ms: 3000
-source_to_chain_mapping:
-  cassandra_prod: main_chain
+      chain:
+        - CassandraPeersRewrite:
+            port: 9043
+        - CassandraSinkSingle:
+            remote_address: "127.0.0.1:9042"
+            connect_timeout_ms: 3000
 ```

Modify an existing `topology.yaml` or create a new one and place the above example as the file's contents.
@@ -134,22 +130,16 @@ The next section of this tutorial will cover adding rate limiting to your Cassan
 ```YAML
 ---
 sources:
-  cassandra_prod:
-    Cassandra:
+  - Cassandra:
       listen_addr: "0.0.0.0:9043"
-chain_config:
-  main_chain:
-    - RequestThrottling:
-        max_requests_per_second: 40000
-    - CassandraPeersRewrite:
-        port: 9043
-    - CassandraSinkSingle:
-        remote_address: "127.0.0.1:9042"
-        connect_timeout_ms: 3000
-named_topics:
-  testtopic: 5
-source_to_chain_mapping:
-  cassandra_prod: main_chain
+      chain:
+        - RequestThrottling:
+            max_requests_per_second: 40000
+        - CassandraPeersRewrite:
+            port: 9043
+        - CassandraSinkSingle:
+            remote_address: "127.0.0.1:9042"
+            connect_timeout_ms: 3000
 ```

In this example we will set your `max_requests_per_second` to 40,000. This will allow a max of 40,000 queries per second to go through this Shotover instance, across all connections.
19 changes: 8 additions & 11 deletions docs/src/examples/redis-clustering-aware.md
@@ -59,18 +59,15 @@ First we will modify our `topology.yaml` file to have a single Redis source. Thi
 ```yaml
 ---
 sources:
-  redis_prod:
-    Redis:
+  - Redis:
+      name: "redis"
       listen_addr: "0.0.0.0:6380"
-chain_config:
-  redis_chain:
-    - RedisClusterPortsRewrite:
-        new_port: 6380
-    - RedisSinkSingle:
-        remote_address: "0.0.0.0:6379"
-        connect_timeout_ms: 3000
-source_to_chain_mapping:
-  redis_prod: redis_chain
+      chain:
+        - RedisClusterPortsRewrite:
+            new_port: 6380
+        - RedisSinkSingle:
+            remote_address: "0.0.0.0:6379"
+            connect_timeout_ms: 3000
 ```

Modify an existing `topology.yaml` or create a new one and place the above example as the file's contents.
32 changes: 14 additions & 18 deletions docs/src/examples/redis-clustering-unaware.md
@@ -45,25 +45,21 @@ networks:
 ```yaml
 ---
 sources:
-  redis_prod:
-    # define how shotover listens for incoming connections from our client application (`redis-benchmark`).
-    Redis:
+  - Redis:
+      name: "redis"
+      # define where shotover listens for incoming connections from our client application (`redis-benchmark`).
       listen_addr: "0.0.0.0:6379"
-chain_config:
-  redis_chain:
-    # configure Shotover to connect to the Redis cluster via our defined contact points
-    - RedisSinkCluster:
-        first_contact_points:
-          - "172.16.1.2:6379"
-          - "172.16.1.3:6379"
-          - "172.16.1.4:6379"
-          - "172.16.1.5:6379"
-          - "172.16.1.6:6379"
-          - "172.16.1.7:6379"
-        connect_timeout_ms: 3000
-source_to_chain_mapping:
-  # connect our Redis source to our Redis cluster sink (transform).
-  redis_prod: redis_chain
+      chain:
+        # configure Shotover to connect to the Redis cluster via our defined contact points
+        - RedisSinkCluster:
+            first_contact_points:
+              - "172.16.1.2:6379"
+              - "172.16.1.3:6379"
+              - "172.16.1.4:6379"
+              - "172.16.1.5:6379"
+              - "172.16.1.6:6379"
+              - "172.16.1.7:6379"
+            connect_timeout_ms: 3000
 ```

Modify an existing `topology.yaml` or create a new one and place the above example as the file's contents.
137 changes: 33 additions & 104 deletions docs/src/user-guide/configuration.md
@@ -20,120 +20,49 @@ Shotover has an observability interface for you to collect Prometheus data from.

## topology.yaml

The topology file is currently the primary method for defining how Shotover behaves. Within the topology file you can configure sources, transforms and transform chains.
The topology file is the primary method for defining how Shotover behaves.

The below documentation shows you what each section does and runs through an entire example of a Shotover configuration file.

### `sources`

The sources top level resource is a map of named sources, to their definitions.

The sources section of the configuration file allows you to specify a source or origin for requests. You can have multiple sources and even multiple sources of the same type. Each is named to allow you to easily reference it.

A source will generally represent a database protocol and will accept connections and queries from a compatible driver. For example the Redis source will accept connections from any Redis (RESP2) driver such as [redis-py](https://github.com/andymccurdy/redis-py).
Consider this example `topology.yaml`:

```yaml
+# This example listens on two different localhost ports and routes messages to a single redis instance on localhost.
+# Requests received on port 1000 will have metrics recorded on the types of messages sent, while port 1001 will not have those metrics.
 ---
-# The source section
+# The list of sources
 sources:

-  # The configured name of the source
-  my_named_redis_source:
-    # The source and any configuration needed for it
-    # This will generally include a listen address and port
-    Redis:
-      listen_addr: "127.0.0.1:6379"

-  # The configured name of the source
-  my_cassandra_prod:

-    # The sources and any configuration needed for it
-    # This will generally include a listen address and port
-    Cassandra:
-      listen_addr: "127.0.0.1:9042"

+  # First we define the source that will listen for connections from the client and then communicate to the client once a connection is opened.
+  - Redis:
+      name: "redis"
+      listen_addr: "127.0.0.1:1000"
+      # Next we define the transform chain that will process messages received by this source
+      chain:
+        # The QueryCounter transform intercepts messages and records metrics on the types of messages that pass through.
+        - QueryCounter:
+            name: "Main chain"
+        # The final transform is a sink, it receives requests from the previous transform and sends them to an actual DB instance.
+        # When it receives a response back it routes the response back through every transform in the chain and finally back to the client.
+        - RedisSinkSingle:
+            remote_address: "127.0.0.1:6379"
+            connect_timeout_ms: 3000
+
+  # A second source definition, this time we lack the QueryCounter transform.
+  - Redis:
+      name: "redis"
+      listen_addr: "127.0.0.1:1001"
+      chain:
+        - RedisSinkSingle:
+            remote_address: "127.0.0.1:6379"
+            connect_timeout_ms: 3000
```

### `chain_config` (Chain Configuration)

The `chain_config` top level resource is a map of named chains, to their definitions.

The `chain_config` section of the configuration file allows you to name and define a transform chain. A transform chain is represented as an array of transforms and their respective configuration. The order in which a transform chain is defined is the order in which a query will traverse it. So the first transform in the chain will get the request from the source first, and pass it to the second transform in the chain.
The `topology.yaml` defines multiple sources.
Each source defines an end-to-end connection to the database and any transformations that occur along the way.
The `chain` section is an array of transforms and their respective configuration.
The order in which a transform chain is defined is the order in which a query will traverse it.
So the first transform in the chain is the source and will get the request from the client first, then it will pass it to the second transform in the chain and so on.

As each transform chain is synchronous, with each transform being able to call the next transform in its chain, the response from the upstream database or generated by a transform down the chain will be passed back up the chain, allowing each transform to handle the response.

The last transform in a chain should be a "terminating" transform. That is, one that passes the query on to the upstream database (e.g. `CassandraSinkSingle`) or one that returns a response on its own (e.g. `DebugReturner`).

For example

```yaml
chain_config:
  example_chain:
    - One
    - Two
    - Three
    - TerminatingTransform
```

A query from a client will go:

* `Source` -> `One` -> `Two` -> `Three` -> `TerminatingTransform`

The response (returned to the chain by the `TerminatingTransform`) will follow the reverse path:

* `TerminatingTransform` -> `Three` -> `Two` -> `One` -> `Source`

Under the hood, each transform is able to call its down-chain transform and wait on its response. Each transform has its own set of configuration values, options and behavior. See [Transforms](../transforms.md) for details.

The following example `chain_config` has three chains:

* `redis_chain` - Consists of a Tee, a transform that will copy the query to the named topic and *also* pass the query down-chain to a terminating transform `RedisSinkSingle`, which sends the query to a Redis server. Very similar to the `tee` Linux program.
* `main_chain` - Also consists of a Tee that will copy queries to the same topic as the `redis_chain` before sending the query on to a caching layer that will try to resolve the query from a Redis cache before finally sending the query to the destination Cassandra cluster via a `CassandraSinkSingle`.

```yaml
# This example will replicate all commands to the DR datacenter on a best effort basis
---
chain_config:
# The name of the first chain
redis_chain:
# The first transform in the chain, in this case it's the Tee transform
- Tee:
behavior: Ignore
# The number of message batches that the tee can hold onto in it's buffer of messages to send.
# If they arent sent quickly enough and the buffer is full then tee will drop new incoming messages.
buffer_size: 10000
#The child chain, that Tee will asynchronously pass requests to
chain:
- QueryTypeFilter:
filter: Read
- Coalesce:
flush_when_buffered_message_count: 2000
- QueryCounter:
name: "DR chain"
- RedisSinkCluster:
first_contact_points: [ "127.0.0.1:2120", "127.0.0.1:2121", "127.0.0.1:2122", "127.0.0.1:2123", "127.0.0.1:2124", "127.0.0.1:2125" ]
connect_timeout_ms: 3000
#The rest of the chain, these transforms are blocking
- QueryCounter:
name: "Main chain"
- RedisSinkCluster:
first_contact_points: [ "127.0.0.1:2220", "127.0.0.1:2221", "127.0.0.1:2222", "127.0.0.1:2223", "127.0.0.1:2224", "127.0.0.1:2225" ]
connect_timeout_ms: 3000
```

### `source_to_chain_mapping` Chain Mapping

The `source_to_chain_mapping` top level resource is a map of source names to chain names. This is the binding that links a defined source to a chain and allows messages/queries generated by a source to traverse a given chain.

The below snippet would complete our entire example:

```yaml
source_to_chain_mapping:
  redis_prod: redis_chain
```

This mapping would effectively create a solution that:

* All Redis requests are first batched and then sent to a remote Redis cluster in another region. This happens asynchronously and if the remote Redis cluster is unavailable it will not block operations to the current cluster.
* Subsequently, all Redis actions get identified based on command type, counted and provided as a set of metrics.
* The Redis request is then transformed into a cluster-aware request and routed to the correct node.
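For comparison, here is a sketch of how the Tee-based `redis_chain` example above might look once the chain is inlined into its source, as this PR does for the other examples. It assumes the transform configuration itself (`Tee`, `QueryTypeFilter`, `Coalesce`, `QueryCounter`, `RedisSinkCluster`) is unchanged by this PR; the source name `redis_prod` and the listen address are hypothetical values chosen for illustration.

```yaml
---
sources:
  - Redis:
      # Hypothetical name and listen address, not taken from the removed example.
      name: "redis_prod"
      listen_addr: "127.0.0.1:6379"
      chain:
        # Tee copies each request to its child chain without blocking the main chain.
        - Tee:
            behavior: Ignore
            buffer_size: 10000
            chain:
              - QueryTypeFilter:
                  filter: Read
              - Coalesce:
                  flush_when_buffered_message_count: 2000
              - QueryCounter:
                  name: "DR chain"
              - RedisSinkCluster:
                  first_contact_points: [ "127.0.0.1:2120", "127.0.0.1:2121", "127.0.0.1:2122", "127.0.0.1:2123", "127.0.0.1:2124", "127.0.0.1:2125" ]
                  connect_timeout_ms: 3000
        # The rest of the chain, these transforms are blocking.
        - QueryCounter:
            name: "Main chain"
        - RedisSinkCluster:
            first_contact_points: [ "127.0.0.1:2220", "127.0.0.1:2221", "127.0.0.1:2222", "127.0.0.1:2223", "127.0.0.1:2224", "127.0.0.1:2225" ]
            connect_timeout_ms: 3000
```

With this layout no separate `source_to_chain_mapping` entry is needed, since the chain belongs to the source that contains it.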
37 changes: 14 additions & 23 deletions shotover-proxy/config/topology.yaml
@@ -1,28 +1,19 @@
 # For an overview of topology configuration: https://docs.shotover.io/user-guide/configuration/#topologyyaml
 ---
-# The list of sources that Shotover will receive messages from.
+# The list of sources.
 sources:
-  # The configured name of the source.
-  example_source:
-    # The source, change from Redis to the source type of the database protocol you are receiving messages in.
-    # For a list of possible sources: https://docs.shotover.io/sources
-    Redis:
+  # The source, change from Redis to the source type of the database protocol you are receiving messages in.
+  # For a list of possible sources: https://docs.shotover.io/sources
+  - Redis:
+      name: "redis"
       listen_addr: "127.0.0.1:6379"
+      chain:
+        # A DebugPrinter transform, reports an INFO log for every message that passes through this transform.
+        # You should delete this transform and add as many other transforms in this chain as you need.
+        # For a list of possible transforms: https://docs.shotover.io/transforms/#transforms_1
+        - DebugPrinter

-# The list of transform chains.
-chain_config:
-  # The configured name of the chain.
-  example_chain:
-    # A DebugPrinter transform, reports an INFO log for every message that passes through this transform.
-    # You should delete this transform and add as many other transforms in this chain as you need.
-    # For a list of possible transforms: https://docs.shotover.io/transforms/#transforms_1
-    - DebugPrinter
-
-    # A NullSink transform, drops all messages it receives.
-    # You will want to replace this with a sink transform to send the message to a database.
-    # For a list of possible transforms: https://docs.shotover.io/transforms/#transforms_1
-    - NullSink
-
-# A list of mappings from source name -> transform chain name.
-source_to_chain_mapping:
-  example_source: example_chain
+        # A NullSink transform, drops all messages it receives.
+        # You will want to replace this with a sink transform to send the message to a database.
+        # For a list of possible transforms: https://docs.shotover.io/transforms/#transforms_1
+        - NullSink
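The comments in this template suggest swapping the source type and replacing `NullSink` with a real sink. A minimal sketch of that substitution for Cassandra is shown below, reusing the `Cassandra` source and `CassandraSinkSingle` fields that appear in the sidecar docs changed by this PR. The `name` value and both addresses are hypothetical, and setting `name` on a Cassandra source is an assumption carried over from the Redis examples.

```yaml
---
sources:
  - Cassandra:
      # Hypothetical name; the Redis examples in this PR set `name` the same way.
      name: "cassandra"
      # Hypothetical address that client drivers connect to.
      listen_addr: "127.0.0.1:9043"
      chain:
        # Logs each message at INFO level; remove once the chain works as expected.
        - DebugPrinter
        # Replaces NullSink: forwards requests to a single Cassandra node.
        - CassandraSinkSingle:
            # Hypothetical address of the upstream Cassandra node.
            remote_address: "127.0.0.1:9042"
            connect_timeout_ms: 3000
```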