Deploy the Teztnets Infrastructure Stack
Once the cluster is ready, we deploy the teztnets infrastructure:
- octez nodes and bakers for various networks
- faucet frontend & backend
- P2P load balancers for bootstrap
- https ingresses for octez RPC and faucets
- status page
With the gcloud CLI configured and pointing to your project:
- clone this repository
- "point" your teztnets stack to the correct infra stack that you deployed at the previous step, by setting the stackRef variable, i.e.:
const stackRef = new pulumi.StackReference(`tacoinfra/tf-teztnets-infra/prod`);
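The argument passed to StackReference is a fully qualified stack name of the form organization/project/stack. A minimal sketch of a check that catches a malformed name before running Pulumi; parseStackName is a hypothetical helper written for illustration and is not part of the Pulumi SDK or of this repository:

```typescript
// Hypothetical helper, not part of the Pulumi SDK: splits a fully qualified
// stack name ("<organization>/<project>/<stack>") into its three parts.
interface StackName {
  organization: string;
  project: string;
  stack: string;
}

function parseStackName(name: string): StackName {
  const parts = name.split("/");
  // Exactly three non-empty segments are expected.
  if (parts.length !== 3 || parts.some((p) => p.length === 0)) {
    throw new Error(`expected <organization>/<project>/<stack>, got "${name}"`);
  }
  return { organization: parts[0], project: parts[1], stack: parts[2] };
}

// The infra stack name used in the example above.
const infraStack = parseStackName("tacoinfra/tf-teztnets-infra/prod");
console.log(infraStack.project); // prints "tf-teztnets-infra"
```

The same three-part format applies to the organization stack you create for teztnets itself.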
Deployment of teztnets requires a few secrets:
- the private key of the teztnets' embedded bakers
- recaptcha keys for faucet
We are storing these secrets as Pulumi encrypted configuration values.
These secrets are local to the stack. When creating a new stack, they need to be repopulated. For example:
pulumi config set --secret private-teztnets-baking-key edsk...
Be wary of double baking: if you are creating a duplicate of the infrastructure while the old cluster is still up, your old nodes and new nodes will peer with each other and double bake! While this is not a big deal on testnets, it is advisable to shut down the old infrastructure before turning on the new one. You may also deploy alternative baking keys, but this requires a search-and-replace in the repo for the associated public key hash, otherwise you may have issues activating the chains.
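To make the double baking condition concrete: it occurs when the same key signs two different blocks for the same level and round, which is what happens when two clusters bake with the same key. A minimal sketch of that condition, assuming a simplified block record; the SignedBlock fields and the sample values are hypothetical and do not reflect the Octez data model:

```typescript
// Simplified, hypothetical block record (not the Octez data model).
interface SignedBlock {
  baker: string; // public key hash of the signing baker
  level: number;
  round: number;
  hash: string;
}

// Returns every pair of different blocks signed by the same baker
// for the same level and round.
function findDoubleBaking(blocks: SignedBlock[]): Array<[SignedBlock, SignedBlock]> {
  const bySlot = new Map<string, SignedBlock[]>();
  const conflicts: Array<[SignedBlock, SignedBlock]> = [];
  for (const block of blocks) {
    const slot = `${block.baker}/${block.level}/${block.round}`;
    const seen = bySlot.get(slot) ?? [];
    for (const earlier of seen) {
      if (earlier.hash !== block.hash) {
        conflicts.push([earlier, block]);
      }
    }
    seen.push(block);
    bySlot.set(slot, seen);
  }
  return conflicts;
}

// Old and new cluster both bake level 100 with the same (placeholder) key.
const sampleConflicts = findDoubleBaking([
  { baker: "tz1-placeholder", level: 100, round: 0, hash: "block-from-old-cluster" },
  { baker: "tz1-placeholder", level: 100, round: 0, hash: "block-from-new-cluster" },
]);
console.log(sampleConflicts.length); // prints 1
```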
You also need to set the gcp:project and gcp:region non-secret config variables. For example:
pulumi config set gcp:project tf-teztnets
pulumi config set gcp:region us-central1
- perform pulumi up
- when asked to create a stack, create an organization stack (for example, tacoinfra/teztnets/prod), so different team members can collaborate on it.
After a few minutes, the following happens:
- pods are being created on the cluster
- a few more things happen outside of Kubernetes:
- a DNS zone is created on GCP
- static IPs are booked for DAL
- ingresses (visible on k9s with :ingresses) are not configured until the DNS resolves
Go to your registrar website and point the nameservers to the Google Cloud list, as visible in the Zones page.
After a few hours, the P2P and RPC endpoints will become reachable.
cert-manager uses DNS TXT records to verify domain ownership. Therefore, when DNS is configured properly, the certificates for RPC endpoints will get generated after some time.
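With DNS-based validation, the certificate authority looks up a TXT record named _acme-challenge under the domain being validated, which is why certificates cannot be issued before the nameserver delegation has propagated. A small sketch that computes the record name to query while waiting (for example with dig); acmeChallengeRecord is a hypothetical helper and the hostnames are placeholders, not the real Teztnets domains. The record only exists while a challenge is in progress, so an empty answer is not necessarily an error:

```typescript
// Hypothetical helper: name of the TXT record used for DNS-based
// ACME validation of the given hostname.
function acmeChallengeRecord(hostname: string): string {
  // A wildcard certificate (*.example.com) is validated at the base domain.
  const base = hostname.startsWith("*.") ? hostname.slice(2) : hostname;
  // Tolerate a fully qualified name with a trailing dot.
  const trimmed = base.endsWith(".") ? base.slice(0, -1) : base;
  return `_acme-challenge.${trimmed}`;
}

console.log(acmeChallengeRecord("rpc.example.com")); // prints "_acme-challenge.rpc.example.com"
```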
The Teztnets public repository is configured with Pulumi GitHub Actions. This means that, to deploy infrastructure, it is sufficient to push to the master branch: a GitHub action will spin up and deploy the infrastructure changes for you.
While the infrastructure repo was deployed directly by operator accounts with full access to the Google Cloud project, the Teztnets stack is deployed with a service account with limited scope.
This requires the manual creation of a JSON key:
- In the Google Cloud Console, navigate to the "IAM & Admin" section.
- Select "Service Accounts" from the sidebar.
- Click on the service account teztnets-ci-service-account
- Go to the "Keys" tab.
- Click “Add Key” and choose “Create new key”.
- Select “JSON” as the key type.
- Click “Create”.
- Download the key
Then add this key to the GitHub Secrets for the teztnets repo as GCP_CREDENTIALS.
Optionally, add this key to pulumi secrets: pulumi config set --secret gcp:credentials <key>. Then, when running the CI manually, you will also use the service account.
In app.pulumi.com, navigate to "Personal Access Tokens". Create a token, then add it to GitHub Secrets as PULUMI_ACCESS_TOKEN.
Even with a Team plan, it is not possible to create a "pulumi service account"; this requires the Enterprise plan. Every token is tied to a user, so if that user leaves the org, a new token must be created.
Any other secret is directly managed by pulumi, and committed encrypted into the repo.
Actions are defined in .github/workflows. A pull request from the same repo will trigger pulumi preview, and a push to main will trigger pulumi up.
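The trigger rule can be modelled as a small decision function. This is only a sketch of the rule as described here, not a copy of the workflow files; the WorkflowEvent shape is hypothetical, and returning null for pull requests from forks and for pushes to other branches is an assumption:

```typescript
type PulumiCommand = "preview" | "up" | null;

// Hypothetical, simplified view of the event that started a workflow run.
interface WorkflowEvent {
  kind: "pull_request" | "push";
  // For pushes: the branch that was pushed to.
  branch?: string;
  // For pull requests: true when the head branch lives in this repository.
  sameRepo?: boolean;
}

function pulumiCommandFor(event: WorkflowEvent): PulumiCommand {
  if (event.kind === "pull_request") {
    // Assumption: pull requests from forks do not run pulumi at all.
    return event.sameRepo ? "preview" : null;
  }
  // Assumption: pushes to branches other than main do not deploy.
  return event.branch === "main" ? "up" : null;
}

console.log(pulumiCommandFor({ kind: "pull_request", sameRepo: true })); // prints "preview"
console.log(pulumiCommandFor({ kind: "push", branch: "main" })); // prints "up"
```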
Dailynet/Weeklynet is deployed with a cron action run directly from GitHub Actions.
The pulumi code is aware of the date and time. Weeklynet/dailynet k8s namespaces contain the date in their name, e.g. weeklynet-2023-12-06. As a consequence, whenever the date changes, a new namespace is created and the old namespace is deleted. This avoids conflicts between old and new. It also ensures that the old infra is only deleted after the new one has been deployed successfully. If the deployment fails, the old infra remains available pending human intervention, and Slack will alert.
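A minimal sketch of how a date-derived namespace name could be computed. The function names are hypothetical and this is not the repository's actual code; the reset weekday is a parameter, and Wednesday is used as the default only because the example date 2023-12-06 falls on a Wednesday:

```typescript
// Formats a date as YYYY-MM-DD in UTC.
function isoDay(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Most recent occurrence (today included) of a weekday, in UTC.
// weekday: 0 = Sunday ... 6 = Saturday.
function mostRecentWeekday(now: Date, weekday: number): Date {
  const daysBack = (now.getUTCDay() - weekday + 7) % 7;
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() - daysBack));
}

// Dailynet: the name changes every day.
function dailynetNamespace(now: Date): string {
  return `dailynet-${isoDay(now)}`;
}

// Weeklynet: the name changes once a week, on the reset weekday (assumed Wednesday).
function weeklynetNamespace(now: Date, resetWeekday: number = 3): string {
  return `weeklynet-${isoDay(mostRecentWeekday(now, resetWeekday))}`;
}

const friday = new Date("2023-12-08T10:00:00Z");
console.log(dailynetNamespace(friday)); // prints "dailynet-2023-12-08"
console.log(weeklynetNamespace(friday)); // prints "weeklynet-2023-12-06"
```

Because the name is a pure function of the date, every scheduled run within the same period resolves to the same namespace, and the first run after the date rolls over resolves to a new one.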