Added terraform-docs for better documentation
keyvaann committed Dec 4, 2024
1 parent f31d567 commit 975ed1b
Showing 7 changed files with 345 additions and 29 deletions.
12 changes: 12 additions & 0 deletions .editorconfig
@@ -0,0 +1,12 @@
# EditorConfig is awesome: https://EditorConfig.org

# top-most EditorConfig file
root = true

# Unix-style newlines with a newline ending every file
[{cluster/**,config/**}]
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
indent_style = space
indent_size = 2
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
* text=auto
45 changes: 45 additions & 0 deletions .terraform-docs.yml
@@ -0,0 +1,45 @@
formatter: "" # this is required

version: ""

# header-from: DOCS.md
footer-from: ""

recursive:
  enabled: false
  path: modules

sections:
  hide: []
  show: []

content: ""

output:
  file: "README.md"
  mode: replace
  template: |-
    {{ .Content }}
output-values:
  enabled: false
  from: ""

sort:
  enabled: true
  by: name

settings:
  anchor: true
  color: true
  default: true
  description: false
  escape: true
  hide-empty: true
  html: true
  indent: 2
  lockfile: true
  read-comments: true
  required: true
  sensitive: true
  type: true
15 changes: 15 additions & 0 deletions Makefile
@@ -0,0 +1,15 @@
prepare:
	@echo === Cluster ===
	@echo Generate docs
	@terraform-docs markdown table cluster
	@echo Fixing the formatting
	@cd cluster && terraform fmt
	@echo Validating Terraform code
	@cd cluster && terraform validate
	@echo === Config ===
	@echo Generate docs
	@terraform-docs markdown table config
	@echo Fixing the formatting
	@cd config && terraform fmt
	@echo Validating Terraform code
	@cd config && terraform validate
76 changes: 47 additions & 29 deletions README.md
@@ -1,22 +1,25 @@
# RADAR-K8s-Infrastructure

This repository aims to provide [IaC](https://en.wikipedia.org/wiki/Infrastructure_as_code) templates for [RADAR-Kubernetes](https://github.com/RADAR-base/RADAR-Kubernetes) users who intend to deploy the platform to Kubernetes clusters supported by cloud providers such as [AWS](https://aws.amazon.com/eks/).

---

[![Terraform validate](https://github.com/phidatalab/RADAR-K8s-Infrastructure/actions/workflows/cluster.yaml/badge.svg)](https://github.com/phidatalab/RADAR-K8s-Infrastructure/actions/workflows/cluster.yaml/badge.svg)
[![Terraform validate](https://github.com/phidatalab/RADAR-K8s-Infrastructure/actions/workflows/config.yaml/badge.svg)](https://github.com/phidatalab/RADAR-K8s-Infrastructure/actions/workflows/config.yaml/badge.svg)

# Dependencies

[Terraform](https://developer.hashicorp.com/terraform/downloads) >= 1.7.0, < 1.8.0<br>
[AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) >= 2.11
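
A quick way to check the Terraform constraint above before running anything is to compare the installed version against the two bounds. The snippet below is a minimal sketch in Python; the helper names are illustrative and not part of this repository, and it assumes a plain `x.y.z` version string without pre-release suffixes.

```python
def parse_version(text):
    """Turn a string such as 'v1.7.5' into a comparable tuple (1, 7, 5)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def terraform_version_ok(version):
    """Return True when version satisfies >= 1.7.0 and < 1.8.0."""
    return (1, 7, 0) <= parse_version(version) < (1, 8, 0)


if __name__ == "__main__":
    # Example call with a hard-coded version string.
    print(terraform_version_ok("1.7.5"))
```

The version string itself can be read from the output of `terraform version`.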

# Usage

It is recommended that you use RADAR-K8s-Infrastructure as a template and create your own IaC repository from it (preferably a private one to begin with). Make sure to customise the enclosed templates to your needs before creating the desired infrastructure.

<img src="./image/use_this_template.png" alt="use this template" width="500" height="124">


## Configure credentials

```
export TF_VAR_AWS_REGION=$AWS_REGION
export TF_VAR_AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
@@ -26,48 +29,54 @@ export TF_VAR_AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN
```
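
Terraform reads any environment variable named `TF_VAR_<name>` as the value of the input variable `<name>`, which is why the exports above mirror the standard AWS variables. As a minimal sketch, assuming you drive Terraform from a Python script (the helper name `tf_var_env` is hypothetical and not part of this repository), the same mapping could be built like this:

```python
import os

# Input variables declared by the cluster workspace (see its Inputs table).
AWS_VARS = (
    "AWS_REGION",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def tf_var_env(env):
    """Return TF_VAR_* entries for every AWS variable present in env."""
    return {f"TF_VAR_{name}": env[name] for name in AWS_VARS if name in env}


if __name__ == "__main__":
    # List only the variable names so that no secret value is printed.
    for key in tf_var_env(os.environ):
        print(key)
```

The returned dictionary can be merged into the environment passed to a subprocess that runs `terraform plan` or `terraform apply`.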

## Workspaces

The definition of resources required for running RADAR-base components is located in the `cluster` directory, while other optional resources are defined in the `config` directory. Please treat each directory as a separate workspace and perform terraform operations individually. The `cluster` resources need to be created and made fully available before you proceed with the creation of the `config` ones.

To retain the user-specific configurations for future infrastructure updates, modify `terraform.tfvars` within the workspace and push the change to your repository. If needed, additional variables defined in `variables.tf` can also be included there.

| :information_source: Important Notice |
|:----------------------------------------|
|As a best practice, never save raw values of secret variables in your repository. Instead, always encrypt them before committing. If your cluster is no longer in use, run `terraform destroy` to delete all the associated resources and reduce your cloud spending. If you have resources created within `config`, run `terraform destroy` in that directory before running the counterpart in `cluster`.|

## Create the infrastructure

```
cd cluster
```

```
# Initialise the working directory
terraform init
```

```
# Review the changes going to be made
terraform plan
```

```
# Create/update the infrastructure
terraform apply --auto-approve
```

Created resources:

- VPC featuring both public and private subnets
- VPC endpoints for privately accessing AWS services
- Internet and NAT gateways
- EKS cluster with a default worker node group
- EKS coredns, kube-proxy, vpc-cni and aws-ebs-csi-driver addons
- EBS storage classes referenced by PVCs
- IRSAs for VPC CNI and EBS CSI controllers
- Initial EC2 instances launched with Spot capacity
- Default network ACLs and route tables
- KMS keys and CloudWatch log groups
- Essential IAM policies, roles, users and user groups for accessing aforementioned resources

## Connect to and verify the cluster

```
# Make sure to use --region if the cluster is deployed in a non-default region and --profile if the cluster is deployed in a non-default AWS account
aws eks update-kubeconfig --name [eks_cluster_name]
@@ -76,12 +85,15 @@ kubectl get pods -A
```

Once the infrastructure update has finished successfully, you can start deploying RADAR-base components to the newly created cluster by following the [Installation Guide](https://github.com/RADAR-base/RADAR-Kubernetes#installation). Before running `helmfile sync`, you need to configure certain resource values that are required by `production.yaml` but only known after the infrastructure has been created. These values are exported as Terraform outputs, and you can retrieve them by running:

```
terraform output
```

You could also automate this value injection by implementing your own templating strategy to customise `production.yaml`.
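
One possible templating strategy is sketched below in Python. It assumes the outputs are captured with `terraform output -json`, which wraps each output in an object with a `value` field, and that your copy of `production.yaml` uses `${name}` placeholders. The placeholder convention and the function names are assumptions made for illustration; they are not defined by this repository or by RADAR-Kubernetes.

```python
import json
import re


def flatten_outputs(raw_json):
    """Map each output name to its value from `terraform output -json` text."""
    return {name: entry["value"] for name, entry in json.loads(raw_json).items()}


def render(template, values):
    """Replace ${name} placeholders; unknown names are left untouched."""

    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return re.sub(r"\$\{(\w+)\}", substitute, template)


if __name__ == "__main__":
    # Sample JSON shaped like `terraform output -json`; the value is made up.
    sample = (
        '{"radar_base_eks_cluster_name": '
        '{"sensitive": false, "type": "string", "value": "demo"}}'
    )
    print(render("cluster: ${radar_base_eks_cluster_name}", flatten_outputs(sample)))
```

Leaving unknown placeholders untouched makes a missing output easy to spot in the rendered file instead of silently producing an empty value.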

## Configure the cluster (optional)

N.B.: To get external DNS, Cert Manager and SMTP working via Route 53 (if chosen as your DNS service), you need to configure your registered top-level domain and its corresponding hosted zone ID via variable `domain_name` in [config/terraform.tfvars](./config/terraform.tfvars). Additionally, set `enable_route53` to `true`.

```
@@ -94,19 +106,25 @@ terraform apply --auto-approve
Optional resource creations are disabled by default. To enable the creation of a specific resource named `X`, navigate to [config/terraform.tfvars](./config/terraform.tfvars) and update the value of `enable_X` to `true` before applying the template.
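
If you prefer to flip these switches from a script rather than by hand, the edit can be sketched as a small text transformation. The Python below is only an illustration under stated assumptions: the function name is hypothetical, it handles simple single-line `enable_X = ...` assignments, and it drops any trailing comment on the line it rewrites.

```python
import re


def enable_resource(tfvars_text, name):
    """Set enable_<name> to true, appending the assignment if it is absent."""
    pattern = re.compile(
        rf"^([ \t]*enable_{re.escape(name)}[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    if pattern.search(tfvars_text):
        # Keep the left-hand side as written and replace only the value.
        return pattern.sub(r"\g<1>true", tfvars_text)
    # No existing assignment: append one on its own line.
    suffix = "" if tfvars_text.endswith("\n") or not tfvars_text else "\n"
    return f"{tfvars_text}{suffix}enable_{name} = true\n"


if __name__ == "__main__":
    print(enable_resource("enable_route53 = false\n", "route53"), end="")
```

Review the resulting file before committing it, since `terraform.tfvars` may also hold values you do not want to change.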

Created resources (if all enabled):

- EIP allocated for the load balancer created by Ingress-NGINX
- Karpenter provisioner, the node template and the SQS interruption queue
- Metrics Server along with the Kubernetes Dashboard and the read-only user
- MSK cluster featuring Kafka brokers and zookeepers
- RDS instance running managementportal, appserver and rest_sources_auth databases
- Route53 zone and records accompanied by IRSAs for external DNS and Cert Manager
- S3 buckets for intermediate-output-storage, output-storage and velero-backups
- SES SMTP endpoint
- CloudWatch event rules and targets
- Essential IAM policies, roles, users for aforementioned resources

## Contributing

Install [terraform-docs](https://github.com/terraform-docs/terraform-docs) and run `make prepare` before committing to ensure the documentation is up to date and the code is valid.

## Known limitations

- Since EBS has been chosen as the default storage, node groups will be created in a single AZ due to the mounting restriction.
- Sometimes Terraform tries to replace the existing MSK cluster while re-applying the templates even if there is no change on the cluster. Mitigate this with `terraform untaint aws_msk_cluster.msk_cluster`.
- Prior to `terraform destroy`, infrastructure resources that were created by pods/controllers and may not be visible to Terraform need to be deleted, e.g., nginx-ingress's NLB. A good practice is to always begin by running `helmfile destroy`.
- If Karpenter is used for node provisioning, ensure the nodes created by it are not lingering around before running `terraform destroy`.
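
The teardown order implied by this README (`helmfile destroy` first, then `terraform destroy` in `config`, then in `cluster`) can be written down as a small helper. This is a sketch only: the function name and the `helmfile_dir` parameter, which stands for wherever your RADAR-Kubernetes checkout lives, are assumptions and not part of this repository.

```python
def teardown_steps(helmfile_dir, config_applied=True):
    """Return (directory, command) pairs in the order they should run."""
    # Remove in-cluster resources first so controller-created AWS resources go too.
    steps = [(helmfile_dir, ["helmfile", "destroy"])]
    if config_applied:
        # Optional resources must be destroyed before the cluster they depend on.
        steps.append(("config", ["terraform", "destroy"]))
    steps.append(("cluster", ["terraform", "destroy"]))
    return steps


if __name__ == "__main__":
    for directory, command in teardown_steps("../RADAR-Kubernetes"):
        print(directory, " ".join(command))
```

The helper only lists the steps. It does not check for nodes left behind by Karpenter, so that still needs to be verified separately before the final `terraform destroy`.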
83 changes: 83 additions & 0 deletions cluster/README.md
@@ -0,0 +1,83 @@
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.7.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 5.0.0, < 6.0.0 |
| <a name="requirement_kubectl"></a> [kubectl](#requirement\_kubectl) | ~> 1.14.0 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | ~> 2.24.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | 5.80.0 |
| <a name="provider_kubectl"></a> [kubectl](#provider\_kubectl) | 1.14.0 |
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | 2.24.0 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_allow_assume_eks_admins_iam_policy"></a> [allow\_assume\_eks\_admins\_iam\_policy](#module\_allow\_assume\_eks\_admins\_iam\_policy) | terraform-aws-modules/iam/aws//modules/iam-policy | 5.15.0 |
| <a name="module_allow_eks_access_iam_policy"></a> [allow\_eks\_access\_iam\_policy](#module\_allow\_eks\_access\_iam\_policy) | terraform-aws-modules/iam/aws//modules/iam-policy | 5.15.0 |
| <a name="module_ebs_csi_irsa"></a> [ebs\_csi\_irsa](#module\_ebs\_csi\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.0 |
| <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | 19.13.1 |
| <a name="module_eks_admins_iam_role"></a> [eks\_admins\_iam\_role](#module\_eks\_admins\_iam\_role) | terraform-aws-modules/iam/aws//modules/iam-assumable-role | 5.15.0 |
| <a name="module_iam_user"></a> [iam\_user](#module\_iam\_user) | terraform-aws-modules/iam/aws//modules/iam-user | n/a |
| <a name="module_vpc"></a> [vpc](#module\_vpc) | terraform-aws-modules/vpc/aws | ~> 5.0 |
| <a name="module_vpc_cni_irsa"></a> [vpc\_cni\_irsa](#module\_vpc\_cni\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.0 |

## Resources

| Name | Type |
|------|------|
| [aws_iam_policy.ecr_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.ecr_pull_through_cache](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.s3_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy_attachment.eks_admins_policy_attachment](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy_attachment) | resource |
| [aws_security_group.vpc_endpoint](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group) | resource |
| [aws_security_group_rule.vpc_endpoint_egress](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group_rule) | resource |
| [aws_security_group_rule.vpc_endpoint_self_ingress](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group_rule) | resource |
| [aws_vpc_endpoint.ecr](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint) | resource |
| [aws_vpc_endpoint.s3](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint) | resource |
| [aws_vpc_endpoint.sts](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint) | resource |
| [aws_vpc_security_group_ingress_rule.vpc_endpoints_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_security_group_ingress_rule) | resource |
| [kubectl_manifest.ebs_storage_classes](https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs/resources/manifest) | resource |
| [kubernetes_annotations.set_defaut_storage_class](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/annotations) | resource |
| [kubernetes_annotations.unset_eks_default_gp2](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/annotations) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_AWS_ACCESS_KEY_ID"></a> [AWS\_ACCESS\_KEY\_ID](#input\_AWS\_ACCESS\_KEY\_ID) | AWS access key associated with an IAM account | `string` | n/a | yes |
| <a name="input_AWS_REGION"></a> [AWS\_REGION](#input\_AWS\_REGION) | Target AWS region | `string` | `"eu-west-2"` | no |
| <a name="input_AWS_SECRET_ACCESS_KEY"></a> [AWS\_SECRET\_ACCESS\_KEY](#input\_AWS\_SECRET\_ACCESS\_KEY) | AWS secret key associated with the access key | `string` | n/a | yes |
| <a name="input_AWS_SESSION_TOKEN"></a> [AWS\_SESSION\_TOKEN](#input\_AWS\_SESSION\_TOKEN) | Session token for temporary security credentials from AWS STS | `string` | `""` | no |
| <a name="input_common_tags"></a> [common\_tags](#input\_common\_tags) | Common tags associated to resources created | `map(string)` | <pre>{<br> "Environment": "dev",<br> "Project": "radar-base"<br>}</pre> | no |
| <a name="input_create_dmz_node_group"></a> [create\_dmz\_node\_group](#input\_create\_dmz\_node\_group) | Whether or not to create a DMZ node group with taints | `bool` | `false` | no |
| <a name="input_defaut_storage_class"></a> [defaut\_storage\_class](#input\_defaut\_storage\_class) | Default storage class used for describing the EBS usage | `string` | `"radar-base-ebs-sc-gp2"` | no |
| <a name="input_dmz_node_size"></a> [dmz\_node\_size](#input\_dmz\_node\_size) | Node size of the DMZ node group | `map(number)` | <pre>{<br> "desired": 1,<br> "max": 2,<br> "min": 0<br>}</pre> | no |
| <a name="input_eks_admins_group_users"></a> [eks\_admins\_group\_users](#input\_eks\_admins\_group\_users) | EKS admin IAM user group | `list(string)` | `[]` | no |
| <a name="input_eks_cluster_name"></a> [eks\_cluster\_name](#input\_eks\_cluster\_name) | EKS cluster name | `string` | n/a | yes |
| <a name="input_eks_kubernetes_version"></a> [eks\_kubernetes\_version](#input\_eks\_kubernetes\_version) | Amazon EKS Kubernetes version | `string` | `"1.28"` | no |
| <a name="input_environment"></a> [environment](#input\_environment) | Environment name | `string` | `"dev"` | no |
| <a name="input_instance_capacity_type"></a> [instance\_capacity\_type](#input\_instance\_capacity\_type) | Capacity type used by EKS managed node groups | `string` | `"SPOT"` | no |
| <a name="input_instance_types"></a> [instance\_types](#input\_instance\_types) | List of instance types used by EKS managed node groups | `list(any)` | <pre>[<br> "m5.large",<br> "m5d.large",<br> "m5a.large",<br> "m5ad.large",<br> "m4.large"<br>]</pre> | no |
| <a name="input_worker_node_size"></a> [worker\_node\_size](#input\_worker\_node\_size) | Node size of the worker node group | `map(number)` | <pre>{<br> "desired": 2,<br> "max": 10,<br> "min": 0<br>}</pre> | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_radar_base_ebs_storage_class_gp2"></a> [radar\_base\_ebs\_storage\_class\_gp2](#output\_radar\_base\_ebs\_storage\_class\_gp2) | n/a |
| <a name="output_radar_base_ebs_storage_class_gp3"></a> [radar\_base\_ebs\_storage\_class\_gp3](#output\_radar\_base\_ebs\_storage\_class\_gp3) | n/a |
| <a name="output_radar_base_ebs_storage_class_io1"></a> [radar\_base\_ebs\_storage\_class\_io1](#output\_radar\_base\_ebs\_storage\_class\_io1) | n/a |
| <a name="output_radar_base_ebs_storage_class_io2"></a> [radar\_base\_ebs\_storage\_class\_io2](#output\_radar\_base\_ebs\_storage\_class\_io2) | n/a |
| <a name="output_radar_base_eks_cluser_endpoint"></a> [radar\_base\_eks\_cluser\_endpoint](#output\_radar\_base\_eks\_cluser\_endpoint) | n/a |
| <a name="output_radar_base_eks_cluser_kms_key_arn"></a> [radar\_base\_eks\_cluser\_kms\_key\_arn](#output\_radar\_base\_eks\_cluser\_kms\_key\_arn) | n/a |
| <a name="output_radar_base_eks_cluster_name"></a> [radar\_base\_eks\_cluster\_name](#output\_radar\_base\_eks\_cluster\_name) | n/a |
| <a name="output_radar_base_eks_dmz_node_group_name"></a> [radar\_base\_eks\_dmz\_node\_group\_name](#output\_radar\_base\_eks\_dmz\_node\_group\_name) | n/a |
| <a name="output_radar_base_eks_worker_node_group_name"></a> [radar\_base\_eks\_worker\_node\_group\_name](#output\_radar\_base\_eks\_worker\_node\_group\_name) | n/a |
| <a name="output_radar_base_vpc_public_subnets"></a> [radar\_base\_vpc\_public\_subnets](#output\_radar\_base\_vpc\_public\_subnets) | n/a |