feat(torch): v1 resources added #2

Merged 45 commits on Oct 30, 2023
Changes from 16 commits
Commits
2fad7fc
feat(torch): v1 resources added
tty47 Oct 23, 2023
e346102
feat(torch): add EOF
tty47 Oct 23, 2023
76749de
feat(torch): readme updaed
tty47 Oct 24, 2023
5e0ce53
feat(torch): metrics updates
tty47 Oct 24, 2023
781771b
feat(torch): new field added
tty47 Oct 24, 2023
6ae691b
feat(torch): refactor - comment endpoint
tty47 Oct 25, 2023
8e61ef6
feat(torch): ref error msg
tty47 Oct 25, 2023
1b5054a
feat(torch): fix error
tty47 Oct 25, 2023
55d3657
feat(torch): comment code - rename var
tty47 Oct 25, 2023
6b13c18
feat(torch): comment code - rename var
tty47 Oct 25, 2023
917e3bb
feat(torch): comment code - rename var
tty47 Oct 25, 2023
5be37ba
feat(torch): fix typos
tty47 Oct 25, 2023
9226395
feat(torch): fix typos
tty47 Oct 25, 2023
252b8f1
feat(torch): docs updated
tty47 Oct 25, 2023
493e1c5
feat(torch): add func to generate the metrics if we already have them…
tty47 Oct 25, 2023
f5d298e
feat(torch): split some files and rename
tty47 Oct 25, 2023
c853f95
Update Dockerfile
tty47 Oct 26, 2023
e831624
feat(torch): update typos in Dockerfiles
tty47 Oct 26, 2023
01f62a2
feat(torch): update typos in Dockerfiles
tty47 Oct 26, 2023
9011bdb
feat(torch): add namespace field to the config
tty47 Oct 26, 2023
47f6d0c
feat(torch): add queue to process the nodes when torch detects a even…
tty47 Oct 26, 2023
5b5cdb3
feat(torch): node consumer, it will check if the node is in the db, o…
tty47 Oct 26, 2023
b88eaf8
feat(torch): flag parse
tty47 Oct 26, 2023
d5d0d80
feat(torch): add libraries
tty47 Oct 26, 2023
163b985
feat(torch): change context to context.WithTimeout
tty47 Oct 26, 2023
236626c
feat(torch): reorder imports
tty47 Oct 26, 2023
7153aeb
feat(torch): reorder imports
tty47 Oct 26, 2023
3e32287
feat(torch): reorder imports
tty47 Oct 26, 2023
59df79f
feat(torch): add sts events to the queue
tty47 Oct 26, 2023
5a05043
feat(torch): reorder imports - add defaults
tty47 Oct 26, 2023
b6ab3df
feat(torch): reorder imports - add defaults
tty47 Oct 26, 2023
bfd39da
feat(torch): dockerfiles fixed comments
tty47 Oct 26, 2023
072bd1c
feat(torch): add a comment
tty47 Oct 26, 2023
ed82092
feat(torch): update readme
tty47 Oct 26, 2023
1cadb74
feat(torch): update image arch
tty47 Oct 26, 2023
58ca283
feat(torch): remove commented func
tty47 Oct 27, 2023
34bd11c
feat(torch): fix context usage
tty47 Oct 27, 2023
5d90aa4
feat(torch): update comment
tty47 Oct 27, 2023
64b4f24
feat(torch): use errGroup to handle the errors
tty47 Oct 27, 2023
5e66f19
feat(torch): update function comment
tty47 Oct 27, 2023
0be1462
feat(torch): fix goroutines execution
tty47 Oct 28, 2023
8606336
feat(torch): returns sts
tty47 Oct 28, 2023
d87a637
feat(torch): add validations - fix issues
tty47 Oct 28, 2023
d4b7a25
feat(torch): fix goroutines issue
tty47 Oct 28, 2023
76b8f07
feat(torch): increase timeouts
tty47 Oct 28, 2023
2 changes: 1 addition & 1 deletion .gitignore
@@ -3,4 +3,4 @@ torch
.DS_Store
.idea

-*otel-agent-celestia.yaml
+*otel-agent-*.yaml
32 changes: 29 additions & 3 deletions Dockerfile
@@ -1,12 +1,38 @@
FROM golang:1.21.0-bullseye AS builder
# stage 1: build the torch binary
FROM --platform=$BUILDPLATFORM docker.io/golang:1.21.3-alpine3.18 as builder

ARG TARGETOS
ARG TARGETARCH
ENV CGO_ENABLED=0
ENV GO111MODULE=on

WORKDIR /

COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o /go/bin/torch ./cmd/main.go
RUN CGO_ENABLED=${CGO_ENABLED} GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build -o /go/bin/torch ./cmd/main.go

FROM alpine:latest
# stage 2
FROM docker.io/alpine:3.18.4
WORKDIR /
# Read here why UID 10001: https://github.com/hexops/dockerfile/blob/main/README.md#do-not-use-a-uid-below-10000
ARG UID=10001
ARG USER_NAME=torch

ENV USR_HOME=/home/${USER_NAME}

# hadolint ignore=DL3018
RUN adduser ${USER_NAME} \
    -D \
    -g ${USER_NAME} \
    -h ${USR_HOME} \
    -s /sbin/nologin \
    -u ${UID}

COPY --from=builder /go/bin/torch .

EXPOSE 8080

ENTRYPOINT ["./torch"]
27 changes: 25 additions & 2 deletions Dockerfile_local
@@ -1,11 +1,34 @@
FROM golang:1.21.0-bullseye AS builder
# stage 1: copy the prebuilt torch binary
FROM --platform=$BUILDPLATFORM docker.io/golang:1.21.3-alpine3.18 as builder

ARG TARGETOS
ARG TARGETARCH

WORKDIR /
COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
COPY torch /go/bin/torch

FROM alpine:latest
# stage 2
FROM docker.io/alpine:3.18.4
WORKDIR /
# Read here why UID 10001: https://github.com/hexops/dockerfile/blob/main/README.md#do-not-use-a-uid-below-10000
ARG UID=10001
ARG USER_NAME=torch

ENV USR_HOME=/home/${USER_NAME}

# hadolint ignore=DL3018
RUN adduser ${USER_NAME} \
    -D \
    -g ${USER_NAME} \
    -h ${USR_HOME} \
    -s /sbin/nologin \
    -u ${UID}

COPY --from=builder /go/bin/torch .

EXPOSE 8080

ENTRYPOINT ["./torch"]
4 changes: 2 additions & 2 deletions Makefile
@@ -1,6 +1,6 @@
PROJECT_NAME := $(shell basename `pwd`)
REPOSITORY_NAME := $(shell basename `pwd`)
-REGISTRY_NAME=ghcr.io/jrmanes
+REGISTRY_NAME=ghcr.io/celestiaorg
LOCAL_DEV=local

# Go
@@ -69,4 +69,4 @@ kubectl_deploy: docker_build_local_push kubectl_apply
.PHYONY: kubectl_deploy

kubectl_remote_kustomize_deploy: docker_build_local_push_gh kubectl_kustomize
-.PHYONY: kubectl_remote_kustomize_deploys
+.PHYONY: kubectl_remote_kustomize_deploy
252 changes: 190 additions & 62 deletions README.md
@@ -2,56 +2,91 @@

## Description

Torch is the **Trusted Peers Orchestrator**.
**Torch** is the ***Trusted Peers Orchestrator***.

This service was created to manage [Celestia Nodes](https://github.com/celestiaorg/celestia-node/) automatically.

By default, when you run some Bridge Nodes and Full Nodes, you have to specify the Bridge's multiaddress in the Full Node. This service does it automatically for you.
You can use Torch to manage the nodes' connections from a config file, and Torch will manage those nodes for you.

Torch accesses the nodes defined in the config file and gets their multiaddress; it then writes it to the specified path and shares the info with all the other defined peers.
Torch uses the Kubernetes API to manage the nodes: it gets their multi address information and stores it in a Redis instance. It also exposes the nodes' IDs as metrics through the `/metrics` endpoint.

---

## Flow
## Workflow

Nodes side:
- Nodes check their `ENV` var during the start-up process
- If they don't have the value yet, they ask Torch for it.
- They send a request to the service asking for the value -> phase-2
- If the service already has the addresses, return them, otherwise, check the nodes.
- We store the value in the config PVC in a file, to keep it there even if we restart the pod or update it, and we
will source the value with the `start.sh`


1) Torch checks the peers based on the config file, the scope is in its namespace.
- How does it work?
- Torch receives a request with the nodeName in the body, then, checks the config (to validate it) and
opens a connection to them.
- checks the multiaddr, and stores it in memory
- once it has the addresses, it creates a file in the config PVC with the TRUSTED_PEERS value (the path can be defined in the config)
2) Then, it restarts the nodes until all of the peers have the env var available.
![Torch Flow](./docs/assets/torch.png)

When Torch receives a request at the path `/api/v1/gen` with the node name in the body, it verifies that the node is listed in the config file. If it is, Torch starts the process; otherwise, it rejects the request.
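
The check described above can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not Torch's actual handler: `genRequest`, `handleGen`, `postGen`, and the `knownNodes` set are hypothetical names, the `podName` field is taken from the body example in the API section, and the `400`/`404` status codes are placeholders for whatever "reject" means in the real service.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// genRequest mirrors the documented request body (hypothetical type name).
type genRequest struct {
	PodName string `json:"podName"`
}

// handleGen returns a handler that rejects nodes missing from knownNodes.
// knownNodes stands in for the node names loaded from the config file.
func handleGen(knownNodes map[string]bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req genRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "invalid JSON body", http.StatusBadRequest)
			return
		}
		if !knownNodes[req.PodName] {
			// Assumed status code; the source only says the request is rejected.
			http.Error(w, "node not found in config", http.StatusNotFound)
			return
		}
		// A real service would start the trusted-peers generation here.
		w.WriteHeader(http.StatusOK)
	}
}

// postGen sends a body to the handler in-process and returns the status code.
func postGen(h http.Handler, body string) int {
	req := httptest.NewRequest(http.MethodPost, "/api/v1/gen", strings.NewReader(body))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := handleGen(map[string]bool{"da-bridge-1-0": true})
	fmt.Println(postGen(h, `{"podName": "da-bridge-1-0"}`))
	fmt.Println(postGen(h, `{"podName": "unknown-node"}`))
}
```

Keeping the lookup in a set makes the accept/reject decision independent of how the config file is parsed.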

There are two types of connections:

- Using `ENV Vars`: Torch gets the data from the config file and writes the connection to the node, using the `containerSetupName` to access the node and write to a file.
- If the value of the key `nodeType` is `da`, Torch will try to generate the node ID once the node is ready to accept connections (*`containerName` is up & running*).
- Connection via `Multi Address`: The user can specify the `connectsTo` list of a node, which means the node will have one or more connections.
- You can either use the node name like:

```yaml
connectsTo:
  - "da-bridge-1-0"
  - "da-bridge-2-0"
```

- or you can specify the full multi address:

```yaml
connectsTo:
  - "/dns/da-bridge-1/tcp/2121/p2p/12D3KooWNFpkX9fuo3GQ38FaVKdAZcTQsLr1BNE5DTHGjv2fjEHG"
  - "/dns/da-bridge-1/tcp/2121/p2p/12D3KooWL8cqu7dFyodQNLWgJLuCzsQiv617SN9WDVX2GiZnjmeE"
```

- To generate the multi address, you can use either DNS or IP. To use DNS, add the key `dnsConnections` and Torch will try to connect to this node; on the other hand, if you want to use IPs, just remove this key.
- Example:

```yaml
# This will use IP to connect to da-bridge-1-0
- peers:
    - nodeName: "da-full-1-0"
      nodeType: "da"
      connectsTo:
        - "da-bridge-1-0"
# This will use DNS to connect to da-bridge-1-0 & da-bridge-2-0
- peers:
    - nodeName: "da-full-2-0"
      nodeType: "da"
      dnsConnections:
        - "da-bridge-1"
        - "da-bridge-2"
      connectsTo:
        - "da-bridge-1-0"
        - "da-bridge-2-0"
```
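
The two address shapes used in the examples (`/dns/<host>/tcp/<port>/p2p/<peerID>` and `/ip4/<ip>/tcp/<port>/p2p/<peerID>`) can be assembled with a small helper. The sketch below is illustrative only: `buildMultiaddr` is a hypothetical function, not part of Torch, and it assumes the caller already knows the peer ID and whether the node was listed under `dnsConnections`.

```go
package main

import "fmt"

// buildMultiaddr formats a multi address in the two shapes shown in the
// examples. useDNS selects the /dns form; otherwise host is treated as an
// IPv4 address. The function name and signature are hypothetical.
func buildMultiaddr(host string, useDNS bool, port int, peerID string) string {
	proto := "ip4"
	if useDNS {
		proto = "dns"
	}
	return fmt.Sprintf("/%s/%s/tcp/%d/p2p/%s", proto, host, port, peerID)
}

func main() {
	peerID := "12D3KooWNFpkX9fuo3GQ38FaVKdAZcTQsLr1BNE5DTHGjv2fjEHG"
	// DNS form, as produced when the node appears under dnsConnections.
	fmt.Println(buildMultiaddr("da-bridge-1", true, 2121, peerID))
	// IP form, used when the dnsConnections key is absent.
	fmt.Println(buildMultiaddr("100.64.5.103", false, 2121, peerID))
}
```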

---

## API Paths

- `/config`
- **Method**: `GET`
- **Description**: returns the config added by the user, can be used to debug
- `/list`
- `/api/v1/config`
- **Method**: `GET`
- **Description**: returns the list of the pods available in its namespace based on the config file
- `/gen`
- **Description**: Returns the config added by the user; can be used for debugging.
- `/api/v1/list`
- **Method**: `GET`
- **Description**: Returns the list of the pods available in its namespace based on the config file.
- `/api/v1/noId/<nodeName>`
- **Method**: `GET`
- **Description**: Returns the multi address of the node requested.
- `/api/v1/gen`
- **Method**: `POST`
- **Description**: starts the process to generate the trusted peers on the nodes based on the config
- **Body Example**:
- **Description**: Starts the process to generate the trusted peers on the nodes based on the config
- **Body Example**:

```json
{
  "podName": "da-bridge-1"
}
```

- **Response Example**:

```json
{
"status": 200,
@@ -60,51 +95,144 @@ will source the value with the `start.sh`
}
}
```
- `/genAll`
- **Method**: `POST`
- **Description**: generate the config for all the peers in the config file
- **Body Example**:
```json
{
"podName":
[
"da-bridge-1",
"da-full-1"
]
}
```
- **Response Example**:
```json
{
"status": 200,
"body": {
"da-bridge-0": "/dns/da-bridge-0/tcp/2121/p2p/12D3KooWDMuPiHgnB6xwnpaR4cgyAdbB5aN9zwoZCATgGxnrpk1M",
"da-full-0": "/dns/da-full-0/tcp/2121/p2p/12D3KooWDCUaPA5ZQveFfsuAHHBNiAhEERo5J1YfbqwSZKtn9RrD"
}
}
```

- `/metrics`
- **Method**: `GET`
- **Description**: Prometheus metrics endpoint.
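
As a rough picture of how the documented paths and methods fit together, the sketch below registers them on a standard-library `http.ServeMux`. It is an assumption-laden illustration: the router Torch actually uses is not shown in this diff, the handlers are placeholders that only echo a value, and `allow`, `newRouter`, and `call` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// allow wraps a handler so it only answers the given HTTP method.
func allow(method string, h http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != method {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		h(w, r)
	}
}

// newRouter registers the documented paths with placeholder handlers.
func newRouter() *http.ServeMux {
	ok := func(w http.ResponseWriter, _ *http.Request) { fmt.Fprint(w, "ok") }
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/config", allow(http.MethodGet, ok))
	mux.HandleFunc("/api/v1/list", allow(http.MethodGet, ok))
	mux.HandleFunc("/api/v1/gen", allow(http.MethodPost, ok))
	mux.HandleFunc("/metrics", allow(http.MethodGet, ok))
	// The trailing slash makes this a prefix match; the node name is the
	// remainder of the path.
	mux.HandleFunc("/api/v1/noId/", allow(http.MethodGet, func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, strings.TrimPrefix(r.URL.Path, "/api/v1/noId/"))
	}))
	return mux
}

// call issues an in-process request and returns the status code and body.
func call(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newRouter()
	fmt.Println(call(mux, http.MethodGet, "/api/v1/noId/da-bridge-1-0"))
	fmt.Println(call(mux, http.MethodGet, "/api/v1/gen"))
}
```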

---

## How does it work?
## Config Example

Here is an example of the flow, using the config:

```yaml
---
mutualPeers:
- consensusNode: "consensus-validator-1"
- peers:
- nodeName: "da-bridge-1"
containerName: "da"
- nodeName: "da-full-1"
containerName: "da"
trustedPeersPath: "/tmp"
- nodeName: "consensus-full-1-0"
containerName: "consensus" # optional - default: consensus
containerSetupName: "consensus-setup" # optional - default: consensus-setup
connectsAsEnvVar: true
nodeType: "consensus"
connectsTo:
- "consensus-validator-1"
- peers:
- nodeName: "consensus-full-2-0"
connectsAsEnvVar: true
nodeType: "consensus"
connectsTo:
- "consensus-validator-1"
- peers:
- nodeName: "da-bridge-1-0"
connectsAsEnvVar: true
nodeType: "da"
connectsTo:
- "consensus-full-1"
- peers:
- nodeName: "da-bridge-2"
containerName: "da"
- nodeName: "da-full-2"
containerName: "da"
- nodeName: "da-bridge-2-0"
containerName: "da" # optional - default: da
containerSetupName: "da-setup" # optional - default: da-setup
connectsAsEnvVar: true
nodeType: "da"
connectsTo:
- "consensus-full-2"
- peers:
- nodeName: "da-bridge-3-0"
containerName: "da"
nodeType: "da"
connectsTo:
- "da-bridge-1-0"
- "da-bridge-2-0"
- peers:
- nodeName: "da-full-1-0"
containerName: "da"
containerSetupName: "da-setup"
nodeType: "da"
dnsConnections:
- "da-bridge-1"
- "da-bridge-2"
connectsTo:
- "da-bridge-1-0"
- "da-bridge-2-0"
- peers:
- nodeName: "da-full-2-0"
containerName: "da"
containerSetupName: "da-setup"
nodeType: "da"
connectsTo:
- "da-bridge-1-0"
- "da-bridge-2-0"
- peers:
- nodeName: "da-full-3-0"
nodeType: "da"
connectsTo:
# all the nodes in line using IP
- "/ip4/100.64.5.103/tcp/2121/p2p/12D3KooWNFpkX9fuo3GQ38FaVKdAZcTQsLr1BNE5DTHGjv2fjEHG,/ip4/100.64.5.15/tcp/2121/p2p/12D3KooWL8cqu7dFyodQNLWgJLuCzsQiv617SN9WDVX2GiZnjmeE"
# all the nodes in line using DNS
- "/dns/da-bridge-1/tcp/2121/p2p/12D3KooWKsHCeUVJqJwymyi3bGt1Gwbn5uUUFi2N9WQ7G6rUSXig,/dns/da-bridge-2/tcp/2121/p2p/12D3KooWA26WDUmejZzU6XHc4C7KQNSWaEApe5BEyXFNchAqrxhA"
# one node per line, either IP or DNS
- "/dns/da-bridge-1/tcp/2121/p2p/12D3KooWKsHCeUVJqJwymyi3bGt1Gwbn5uUUFi2N9WQ7G6rUSXig"
- "/dns/da-bridge-2/tcp/2121/p2p/12D3KooWA26WDUmejZzU6XHc4C7KQNSWaEApe5BEyXFNchAqrxhA"
trustedPeersPath: "/tmp"
```
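
The `connectsTo` entries in the config above mix three forms: a node name, a single multi address, and several multi addresses joined by commas on one line. A minimal Go sketch of how such entries could be normalised is shown here. It assumes, without confirmation from the source, that entries starting with `/` are multi addresses and that commas separate addresses; `peerRef` and `parseConnectsTo` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// peerRef is one normalised connectsTo target (hypothetical type).
type peerRef struct {
	Value       string
	IsMultiaddr bool // false means Value is a node name to be resolved
}

// parseConnectsTo splits comma-joined entries and classifies each part.
// Assumption: anything beginning with "/" is a full multi address.
func parseConnectsTo(entries []string) []peerRef {
	var refs []peerRef
	for _, entry := range entries {
		for _, part := range strings.Split(entry, ",") {
			part = strings.TrimSpace(part)
			if part == "" {
				continue // ignore empty pieces such as a trailing comma
			}
			refs = append(refs, peerRef{
				Value:       part,
				IsMultiaddr: strings.HasPrefix(part, "/"),
			})
		}
	}
	return refs
}

func main() {
	refs := parseConnectsTo([]string{
		"da-bridge-1-0",
		"/ip4/100.64.5.103/tcp/2121/p2p/12D3KooWNFpkX9fuo3GQ38FaVKdAZcTQsLr1BNE5DTHGjv2fjEHG,/ip4/100.64.5.15/tcp/2121/p2p/12D3KooWL8cqu7dFyodQNLWgJLuCzsQiv617SN9WDVX2GiZnjmeE",
	})
	for _, r := range refs {
		fmt.Printf("multiaddr=%v value=%s\n", r.IsMultiaddr, r.Value)
	}
}
```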

![Torch Flow](./docs/assets/torch.png)
### Another example

The architecture will contain:

- 1 Consensus - Validator
- 2 Consensus - non-validating mode - connected to the validator
- 1 DA-Bridge-1 - connected to the CONS-NON-VALIDATOR
- 1 DA-Bridge-2 - connected to the CONS-NON-VALIDATOR
- 1 DA-Full-Node-1 - connected to DA-BN-1
- 1 DA-Full-Node-2 - connected to DA-BN-1 & DA-BN-2 using DNS

```yaml
---
mutualPeers:
  - consensusNode: "consensus-validator-1"
  - peers:
      - nodeName: "consensus-full-1-0"
        connectsAsEnvVar: true
        nodeType: "consensus"
        connectsTo:
          - "consensus-validator-1"
  - peers:
      - nodeName: "consensus-full-2-0"
        connectsAsEnvVar: true
        nodeType: "consensus"
        connectsTo:
          - "consensus-validator-1"
  - peers:
      - nodeName: "da-bridge-1-0"
        connectsAsEnvVar: true
        nodeType: "da"
        connectsTo:
          - "consensus-full-1"
  - peers:
      - nodeName: "da-bridge-2-0"
        connectsAsEnvVar: true
        nodeType: "da"
        connectsTo:
          - "consensus-full-2"
  - peers:
      - nodeName: "da-full-1-0"
        nodeType: "da"
        dnsConnections:
          - "da-bridge-1"
        connectsTo:
          - "da-bridge-1-0"
  - peers:
      - nodeName: "da-full-2-0"
        nodeType: "da"
        dnsConnections:
          - "da-bridge-1"
          - "da-bridge-2"
        connectsTo:
          - "da-bridge-1-0"
          - "da-bridge-2-0"
```

---