Borrowing heavily from the original Kubernetes the Hard Way guide by Kelsey Hightower, this tutorial will walk you through the steps of deploying a Kubernetes cluster in an IBM Cloud VPC using Terraform, Ansible and some CLI magic.
The original guide is a great way to get started with Kubernetes and understand the various components that make up a cluster. However, it is a bit dated and uses Google Cloud Platform (GCP) as the cloud provider. I enjoy all things automation, so I wanted to take a stab at a more automated approach while still covering the various components and steps required to bootstrap a cluster the hard way.
The guide is broken down into the following steps:
- Deploy the VPC, network, and compute resources for a Kubernetes cluster (3 control plane nodes, 3 worker nodes, a private DNS load balancer, security groups, etc.)
- Generate the Kubernetes certificates and kubeconfig files using `cfssl` and `kubectl`
- Bootstrap the Kubernetes control plane nodes and etcd cluster
- Bootstrap the Kubernetes worker nodes and join them to the control plane
- Install the DNS add-on and run some basic smoke tests against the cluster (`deployments`, `services`, `pods`, etc.)
The project is broken down into the following directories:
- `010-vpc-infrastructure` - Terraform code to deploy the IBM Cloud VPC and associated networking components.
- `020-vpc-compute` - Terraform code to deploy the compute resources in the VPC. This covers the control plane and worker nodes as well as a bastion host for accessing the cluster and running our Ansible playbooks.
- `030-certificate-authority` - Terraform code that uses `cfssl` to generate the certificate authority and client certificates for the cluster components, the kubeconfig files, and the Kubernetes API server certificate.
- `040-configure-systems` - Ansible playbooks to deploy the Kubernetes control plane and worker nodes.
- Clone the repository:

git clone https://github.com/cloud-design-dev/Kubernetes-the-slightly-difficult-way.git
cd Kubernetes-the-slightly-difficult-way

- Copy `template.local.env` to `local.env`:

cp template.local.env local.env

- Edit `local.env` to match your environment.
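The contents of `local.env` are not reproduced here, so the sketch below is only an illustration of the kind of values it exports; the real variable names come from `template.local.env` in the repository and may differ.

```shell
# Hypothetical local.env -- variable names are placeholders, check
# template.local.env for the ones the project actually expects.
export IC_API_KEY="<your IBM Cloud API key>"   # read by the IBM Cloud Terraform provider
export TF_VAR_region="ca-tor"                  # the sample outputs in this guide use ca-tor
export TF_VAR_ssh_key="my-vpc-ssh-key"         # hypothetical: name of an existing VPC SSH key
```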
In this first step we will deploy our IBM Cloud VPC, a Public Gateway, a VPC Subnet, and Security Groups for our cluster.
source local.env
(cd 010-vpc-infrastructure && ./main.sh apply)
When prompted, enter `yes` to confirm the deployment. When the deployment completes, you should see output similar to the following:
bastion_security_group_id = "r038-48afbe99-xxxxx"
cluster_security_group_id = "r038-cde50195-xxxxx"
resource_group_id = "ac83304bxxxxx"
subnet_id = "02q7-f36019a5-7035-xxxxx"
vpc_crn = "crn:v1:bluemix:public:is:ca-tor:a/xxxxx::vpc:xxxxx"
vpc_default_routing_table_id = "r038-07087206-2052-xxxxx"
vpc_id = "r038-3542898c-xxxxx"
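The contents of `main.sh` are not shown in this guide; as an assumption, a wrapper like this usually just chains the standard Terraform commands, roughly along these lines:

```shell
#!/usr/bin/env bash
# Rough sketch of what a main.sh-style wrapper typically does -- the real
# script in the repository may differ. It initialises the Terraform working
# directory and then runs the requested sub-command (apply, destroy, ...).
set -euo pipefail

ACTION="${1:-plan}"

terraform init -input=false
terraform "${ACTION}"
```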
With our VPC stood up, we can now deploy the compute resources for our cluster. This includes the control plane and worker nodes as well as a bastion host for accessing the cluster and running our Ansible playbooks. We will also stand up an instance of the Private DNS Service and create a Private DNS Global Load Balancer for our Kubernetes API servers.
(cd 020-vpc-compute && ./main.sh apply)
When prompted, enter `yes` to confirm the deployment. When the deployment completes, you should see output similar to the following:
Outputs:
controller_name = [
"controller-0",
"controller-1",
"controller-2",
]
controller_private_ip = [
"10.249.0.4",
"10.249.0.5",
"10.249.0.7",
]
loadbalancer_fqdn = "api.k8srt.lab"
pdns_domain = "k8srt.lab"
worker_name = [
"worker-0",
"worker-1",
"worker-2",
]
workers_private_ip = [
"10.249.0.9",
"10.249.0.8",
"10.249.0.10",
]
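As an optional sanity check (not part of the guide itself), you can confirm from the bastion host that the load balancer name resolves through the private DNS zone, assuming `dig` is installed there:

```shell
# Should return the private IPs the Global Load Balancer answers with
# for the Kubernetes API endpoint.
dig +short api.k8srt.lab
```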
We now have our networking and compute resources deployed, so we can move on to generating the certificates and kubeconfig files for our cluster. We will use `cfssl` to generate the certificate authority and client certificates for the cluster components (`etcd`, `apiserver`, etc.), the kubeconfig files, and the Kubernetes API server certificate. This Terraform run does not produce any output, so as long as it completes, you can move on to running the Ansible playbooks.
(cd 030-certificate-authority && ./main.sh apply)
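For context, the heavy lifting in this stage is done with `cfssl` and `cfssljson`. The commands below are only an illustration of that workflow; the CSR and config file names (`ca-csr.json`, `ca-config.json`, `admin-csr.json`) are placeholders rather than the exact files the Terraform code drives.

```shell
# Illustrative cfssl workflow: create the CA, then sign a client
# certificate against it. File names here are placeholders.
cfssl gencert -initca ca-csr.json | cfssljson -bare ca

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  admin-csr.json | cfssljson -bare admin
```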
The first playbook we will run uses the Ansible `ping` module to test connectivity to all of the hosts in our inventory file. This will also add the SSH host keys to our `known_hosts` file.
ansible-playbook -i 040-configure-systems/inventory.ini 040-configure-systems/playbooks/ping-all.yml
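If you want a quick one-off check outside the playbook, the same test can be run as an ad-hoc Ansible command; this is purely optional:

```shell
# Ad-hoc equivalent of ping-all.yml: run the ping module against every
# host in the inventory.
ansible -i 040-configure-systems/inventory.ini all -m ping
```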
The `update-systems.yml` playbook does the following:

- Updates the `/etc/hosts` file on each instance with all of the worker and control plane nodes
- Runs an `apt-get` update and upgrade of any existing packages
- Installs `socat`, `conntrack`, and `ipset` on the worker nodes
- Creates the required directories on each host for etcd, Kubernetes, and the container runtime components
- Disables swap on the worker nodes and writes the change to `/etc/fstab`
- Reboots all control plane and worker hosts to ensure they are running the latest kernel
ansible-playbook -i 040-configure-systems/inventory.ini 040-configure-systems/playbooks/update-systems.yml
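Once the playbook finishes, you can optionally spot-check one of its changes from the bastion host, for example that swap is off on the workers. The `workers` group name below is an assumption; use whatever group your `inventory.ini` actually defines.

```shell
# "swapon --show" prints nothing when swap is fully disabled.
ansible -i 040-configure-systems/inventory.ini workers -m command -a "swapon --show"
```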
Next, bootstrap the etcd cluster and the Kubernetes control plane components on the controller nodes:

ansible-playbook -i 040-configure-systems/inventory.ini 040-configure-systems/playbooks/controllers-etcd.yaml
ansible-playbook -i 040-configure-systems/inventory.ini 040-configure-systems/playbooks/controllers-kubes.yaml

The control plane verification playbook is not yet implemented:

ansible-playbook -i 040-configure-systems/inventory.ini 040-configure-systems/playbooks/check-control-plane.yaml

Finally, bootstrap the worker nodes and check that they have joined the cluster:

ansible-playbook -i 040-configure-systems/inventory.ini 040-configure-systems/playbooks/workers-kubes.yaml
ansible-playbook -i 040-configure-systems/inventory.ini 040-configure-systems/playbooks/check-cluster.yaml
If all of the pieces lined up, you should see output similar to this:
ok: [bastion-host] => {
"msg": [
"NAME STATUS ROLES AGE VERSION",
"worker-0 Ready <none> 23s v1.27.6",
"worker-1 Ready <none> 23s v1.27.6",
"worker-2 Ready <none> 23s v1.27.6"
]
}
The DNS add-on and the remaining smoke tests are not yet implemented.
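Until that part of the guide is in place, a few manual checks from the bastion host can stand in for the missing smoke tests. This is a minimal sketch and assumes `kubectl` on the bastion is already pointed at the cluster's admin kubeconfig.

```shell
# Minimal manual smoke test: create a deployment, expose it as a
# NodePort service, and confirm the pods schedule onto the workers.
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port 80 --type NodePort
kubectl get pods -l app=nginx -o wide
kubectl get svc nginx
```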