This repository contains NixOS configurations for the Ghaf CI/CD infrastructure.
It defines flake-based NixOS configurations for the following targets:
- ghafhydra - Hydra with a pre-configured jobset for Ghaf:
  - Hydra: declaratively configured with the Ghaf flake jobset, building on localhost.
  - Binary cache: uses nix-serve-ng to sign packages; the signatures can be verified with the public key cache.ghafhydra:XQx1U4555ZzfCCQOZAjOKKPTavumCMbRNd3TJt/NzbU=
  - Automatic nix store garbage collection: when free disk space in /nix/store drops below a threshold value, garbage is removed automatically.
  - Pre-defined users: allows ssh access for a set of users based on their ssh public keys.
  - Secrets: uses sops-nix to manage secrets. Secrets, such as the hydra admin password and the binary cache signing key, are stored encrypted based on the host ssh key.
  - Openssh server with a pre-defined host ssh key. The server private key is stored encrypted as a sops secret and automatically deployed on host installation.
Important:
The configuration files in this repository declaratively define the system configuration for all hosts in the Ghaf CI/CD infrastructure. That is, all system configurations, including the secrets, are stored and version controlled in this repository, and no additional manual configuration is required. Any host in the infrastructure may be reinstalled without notice, so do not assume that anything outside the configurations defined in this repository will be available on the hosts. This includes the administrators' home directories: do not keep any important data in your home directory, since the contents of /home will be regularly deleted.
If you do not yet have the nix package manager on your local host, install it by following the installation instructions at https://nixos.org/download.html.
Then, clone this repository:
$ git clone https://github.com/tiiuae/ghaf-infra.git
$ cd ghaf-infra
All example commands in this document are executed from a nix-shell in the root path of your local copy of this repository. Run the following command to start a nix-shell:
# Start nix-shell
$ nix-shell
Inspired by nix-community infra, this project makes use of pyinvoke to help with deployment tasks.
Run the following command to list the available tasks:
$ invoke --list
Available tasks:
  alias-list          List available targets (i.e. configurations and alias names)
  build-local         Build NixOS configuration `alias` locally.
  deploy              Deploy the configuration for `alias`.
  install             Install `alias` configuration using nixos-anywhere, deploying host private key.
  pre-push            Run 'pre-push' checks: black, pylint, pycodestyle, reuse lint, nix fmt.
  print-keys          Decrypt host private key, print ssh and age public keys for `alias` config.
  reboot              Reboot host identified as `alias`.
  update-sops-files   Update all sops yaml and json files according to .sops.yaml rules.
The following sections explain the intended usage of the most common deployment tasks listed above.
The alias-list task lists the alias names for ghaf-infra targets. An alias is simply a name given to a combination of nixosConfig and hostname. All ghaf-infra tasks that need to identify a target accept an alias name as an argument.
$ invoke alias-list
Current ghaf-infra targets:
╒═══════════════╤═══════════════╤══════════════╕
│ alias         │ nixosconfig   │ hostname     │
╞═══════════════╪═══════════════╪══════════════╡
│ ghafhydra-dev │ ghafhydra     │ 51.12.56.79  │
╘═══════════════╧═══════════════╧══════════════╛
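As a rough illustration of what an alias is, the table above can be modelled as a small lookup structure. This is only a sketch: the names `TargetHost`, `TARGETS`, and `get_target` are assumptions made for this example and are not necessarily what tasks.py uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetHost:
    """One deployment target: a NixOS configuration paired with a host."""

    nixosconfig: str  # flake output name, e.g. "ghafhydra"
    hostname: str  # address used for the ssh connection


# Hypothetical registry keyed by alias name; the values mirror the table above.
TARGETS = {
    "ghafhydra-dev": TargetHost(nixosconfig="ghafhydra", hostname="51.12.56.79"),
}


def get_target(alias: str) -> TargetHost:
    """Look up an alias, exiting with a readable message if it is unknown."""
    try:
        return TARGETS[alias]
    except KeyError:
        known = ", ".join(sorted(TARGETS))
        raise SystemExit(f"Unknown alias '{alias}'. Known aliases: {known}") from None
```

Keeping the configuration name and the hostname together under one alias means a task only needs a single argument to know both what to build and where to send it.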
If hostname is not directly accessible for your current $USER, use ~/.ssh/config to specify the ssh connection details, such as the username, port, or key file used to access the specific host.
As an example, to access host 51.12.56.79 with a specific username and key, you would add the following to ~/.ssh/config:
$ cat ~/.ssh/config
Host 51.12.56.79
    HostName 51.12.56.79
    User my_remote_user_name
    IdentityFile /path/to/my/private_key
Since tasks.py internally uses ssh when accessing hosts, the above example configuration is applied when accessing the ghafhydra-dev alias.
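To check which settings ssh will actually apply for a host, one option is to ask ssh itself with `ssh -G`, which prints the effective client configuration as lowercase `keyword value` lines. The helper names below are hypothetical and only sketch the idea; they are not part of tasks.py.

```python
import subprocess


def parse_ssh_config_dump(dump: str) -> dict[str, list[str]]:
    """Parse `ssh -G` output into a mapping of keyword -> list of values.

    Values are kept in lists because some keywords, such as identityfile,
    can appear more than once.
    """
    options: dict[str, list[str]] = {}
    for line in dump.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            options.setdefault(key, []).append(value)
    return options


def effective_ssh_options(host: str) -> dict[str, list[str]]:
    """Run `ssh -G <host>` and return the parsed effective configuration."""
    result = subprocess.run(
        ["ssh", "-G", host], check=True, capture_output=True, text=True
    )
    return parse_ssh_config_dump(result.stdout)
```

For instance, `effective_ssh_options("51.12.56.79")["user"]` would be expected to contain the username from the ~/.ssh/config entry above, assuming that entry is in place.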
The build-local task builds the given alias configuration locally. If no alias name is specified, build-local builds all alias configurations:
$ invoke build-local
INFO Running: nixos-rebuild build --option accept-flake-config true -v --flake .#ghafhydra
...
building '/nix/store/m0z520c0rpz1qjjw391srjw50426626z-etc.drv'...
building '/nix/store/7jx57i82zmkcjsimb761vqsdcx2sc8yq-nixos-system-ghafhydra-23.05.20231021.5550a85.drv'...
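The "build one alias, or all of them when none is given" behaviour can be sketched as follows. The command line mirrors the one in the log above; the `ALIAS_TO_CONFIG` mapping and the `build_commands` function are assumptions made for this illustration and may differ from the actual task implementation.

```python
from typing import Optional

# Hypothetical alias registry: alias name -> flake configuration name.
ALIAS_TO_CONFIG = {"ghafhydra-dev": "ghafhydra"}


def build_commands(alias: Optional[str] = None) -> list[list[str]]:
    """Return one nixos-rebuild command per configuration to build.

    With no alias, every distinct configuration is built once, even if
    several aliases point at the same configuration.
    """
    if alias is None:
        configs = sorted(set(ALIAS_TO_CONFIG.values()))
    else:
        configs = [ALIAS_TO_CONFIG[alias]]
    return [
        [
            "nixos-rebuild", "build",
            "--option", "accept-flake-config", "true",
            "-v", "--flake", f".#{config}",
        ]
        for config in configs
    ]
```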
The pre-push task runs a set of checks on the contents of this repository. The checks include Python linters, license compliance checks, formatting checks for nix and terraform files, and nix flake check for the ghaf-infra flake. The pre-push task also builds all the alias configurations locally:
$ invoke pre-push
INFO Running: find . -type f -name *.py ! -path *result* ! -path *eggs*
INFO Running: black -q ./tasks.py
INFO Running: pylint --disable duplicate-code -rn ./tasks.py
INFO Running: pycodestyle --max-line-length=90 ./tasks.py
INFO Running: reuse lint
INFO Running: terraform fmt -check -recursive
INFO Running: nix fmt
INFO Running: nix flake check -v
...
INFO All pre-push checks passed
The install task installs the given alias configuration on the target host with nixos-anywhere. It automatically partitions and re-formats the host hard drive, meaning all data on the target will be completely overwritten with no option to roll back. During installation, it also decrypts and deploys the host private key from the sops secrets. The intended use of the install task is to install a NixOS configuration on a non-NixOS host, or to repurpose an existing server.
Note: the install task assumes the given NixOS configuration is compatible with the specified host. In the existing Ghaf CI/CD infrastructure you can safely assume this holds true. However, if you plan to apply the NixOS configurations from this repository on a new infrastructure or onboard new hosts, please read the documentation in adapting-to-new-environments.md.
$ invoke install --alias ghafhydra-dev
Install configuration 'ghafhydra' on host '51.12.50.33'? [y/N] y
...
### Uploading install SSH keys ###
### Gathering machine facts ###
### Switching system into kexec ###
### Formatting hard drive with disko ###
### Uploading the system closure ###
### Copying extra files ###
### Installing NixOS ###
### Waiting for the machine to become reachable again ###
### Done! ###
...
The deploy task deploys the given alias configuration to the target host with the nixos-rebuild switch subcommand. This task assumes the target host is already running NixOS, and fails if it is not.
Note: unlike the changes made with the install task, deploy changes can be reverted with nixos-rebuild switch --rollback or similar.
$ invoke deploy --alias ghafhydra-dev
[51.12.50.33] $ nix flake archive --to ssh://51.12.50.33 --json
[51.12.50.33] copying path '/nix/store/dbppismymjc6382g4v6d6sb99pjby37b-source' from 'https://cache.vedenemo.dev'...
[51.12.50.33] copying path '/nix/store/r2ip1850igy8kciyaagw502s3c6ph1s4-source' to 'ssh://51.12.50.33'...
[51.12.50.33] copying path '/nix/store/yj1wxm9hh8610iyzqnz75kvs6xl8j3my-source' to 'ssh://51.12.50.33'...
[51.12.50.33] $ sudo nixos-rebuild switch --option accept-flake-config true --flake /nix/store/1y4kqqi8xbw4ic96ahhhjgl61p61lvdg-source#ghafhydra
...
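The log above suggests a two-step flow: the flake is first copied to the host with `nix flake archive --to ssh://... --json`, and the store path it reports is then used as the flake reference for `nixos-rebuild switch` on the host. A minimal sketch of the second step, assuming the JSON output carries the flake source under a top-level `path` key; the function name is hypothetical:

```python
import json


def remote_switch_command(archive_json: str, nixosconfig: str) -> list[str]:
    """Build the remote nixos-rebuild command from `nix flake archive --json` output."""
    # Assumption: the archive output is a JSON object whose "path" entry is
    # the store path of the flake source that was copied to the host.
    flake_path = json.loads(archive_json)["path"]
    return [
        "sudo", "nixos-rebuild", "switch",
        "--option", "accept-flake-config", "true",
        "--flake", f"{flake_path}#{nixosconfig}",
    ]
```

Using the copied store path rather than the local checkout means the host rebuilds from exactly the sources that were transferred to it.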
The update-sops-files task updates all sops yaml and json files according to the rules in .sops.yaml. The intended use is to update the secrets after adding new hosts, admins, or secrets:
$ invoke update-sops-files
2023/10/23 08:37:34 Syncing keys for file ghaf-infra/hosts/ghafhydra/secrets.yaml
2023/10/23 08:37:34 File ghaf-infra/hosts/ghafhydra/secrets.yaml already up to date
First, update the flake:
$ nix flake update
...
• Updated input 'nixpkgs':
'github:nixos/nixpkgs/898cb2064b6e98b8c5499f37e81adbdf2925f7c5' (2023-10-13)
→ 'github:nixos/nixpkgs/5550a85a087c04ddcace7f892b0bdc9d8bb080c8' (2023-10-21)
...
Then, deploy the updated configuration to the target host(s):
$ invoke deploy --alias ghafhydra-dev
Note: be sure to manually verify that the target services work as expected after the update. Also, make sure the install task still works after the flake update by running invoke install --alias alias-name-here against a test (dev) configuration.
Onboarding new admins requires the following manual steps:
- Add their user and ssh key to users and import the user on the hosts they need access to.
- Add their age key to .sops.yaml, update the creation_rules, and run the update-sops-files task.
- Deploy the new configuration to changed hosts.
For deployment secrets (such as the binary cache signing key), this project uses sops-nix.
The general idea is that each host has a secrets.yaml file that contains the encrypted secrets required by that host. As an example, the secrets.yaml file for the host ghafhydra defines a secret cache-sig-key, which the host uses in its binary cache configuration to sign packages in the nix binary cache. All secrets in a host's secrets.yaml can be decrypted with that host's ssh key: sops automatically decrypts the host secrets when the system activates (i.e. on boot or whenever nixos-rebuild switch occurs) and places the decrypted secrets in the configured file paths. An admin user manages the secrets by using the sops command line tool.
Each host's private ssh key is stored as a sops secret and automatically deployed on host installation.
secrets.yaml files are created and edited with the sops utility. The .sops.yaml file tells sops which secrets get encrypted with which keys.
The secrets configuration and the usage of sops are adopted from the nix-community infra project.
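To make the role of .sops.yaml more concrete: sops walks the `creation_rules` in order and uses the first rule whose `path_regex` matches the file being encrypted. The sketch below imitates that selection in simplified form. The rule contents and key strings are placeholders invented for this example, not the repository's real rules or keys.

```python
import re

# Simplified, hypothetical stand-in for the creation_rules in .sops.yaml.
# Each rule lists the recipients allowed to decrypt matching files.
CREATION_RULES = [
    {
        "path_regex": r"hosts/ghafhydra/secrets\.yaml$",
        "age": ["AGE_KEY_PLACEHOLDER_HOST_GHAFHYDRA", "AGE_KEY_PLACEHOLDER_ADMIN"],
    },
    {
        "path_regex": r"secrets\.yaml$",
        "age": ["AGE_KEY_PLACEHOLDER_ADMIN"],
    },
]


def recipients_for(path: str) -> list[str]:
    """Return the recipients of the first rule whose path_regex matches."""
    for rule in CREATION_RULES:
        # The pattern is matched anywhere in the path (unanchored search).
        if re.search(rule["path_regex"], path):
            return rule["age"]
    raise LookupError(f"No creation rule matches '{path}'")
```

Because the first match wins, more specific host rules need to come before any catch-all rule. This is also why adding a host or an admin requires re-running update-sops-files: existing files must be re-encrypted for the updated recipient list.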
This project is licensed under the Apache-2.0 license - see the Apache-2.0.txt file for details.