NOTE: work in progress - Kubernetes networking heavily relies on Windows HNS which is still unstable.
Ansible playbooks and Packer templates for provisioning Hyper-V Vagrant boxes and configuring a hybrid Kubernetes 1.10+ cluster with a Flannel network (host-gw backend). Currently supports:
- Windows Server 1803 (March 2018) as Kubernetes nodes with Docker 17.10.0-ee-preview-3.
- Ubuntu 16.04 LTS (Xenial) as Kubernetes master and nodes with Docker 17.03.
- Cluster initialization using kubeadm (both Windows and Linux nodes).
- Flannel pod network, host-gw backend (vxlan can also be installed, but it requires changes in the network deployment file and installation of CNI plugins on Windows). Based on flannel-io/flannel#921 and containernetworking/plugins#85 by rakelkar.
- Configurable pod/service CIDRs.
- Deployment of the Microsoft SDN GitHub repository to Windows nodes in order to make debugging easier.
- Exposing NodePort services on both Windows and Linux nodes.
- Packer templates with Ansible support which can be executed on Windows hosts, thanks to PowerShell wrappers for Ansible commands (ptylenda/ansible-for-windows-wsl-powershell-fall-creators-update). A similar solution was used in the demo Packer template in ptylenda/packer-template-ubuntu1604-ansible-proxy.
- Provisioning and configuration behind a proxy.
- Workaround for OneGet/MicrosoftDockerProvider#15 by pablodav.
The Ansible playbooks provided in this repository were initially based on the official Microsoft Getting Started guide for Kubernetes on Windows. Most of the original scripts in this guide have been replaced by Ansible tasks and NSSM Windows services, and on top of that experimental Flannel support with host-gw backend has been added.
The original kubernetes-for-windows was modified by @pablodav to reuse more parts from external projects for deploying k8s on Linux, so that this project focuses only on the parts required to add a Windows node to an existing Kubernetes cluster. For that reason @pablodav has removed most of the k8s-on-Linux parts and now deploys k8s with kubespray, with this project integrated into that setup to deploy win_node. The modifications were made with the possibility in mind of integrating with any other k8s project that uses kubeadm (this project uses kubeadm to generate tokens to join the win_node).
- I have not managed to get kube-proxy working in kernelspace proxy-mode (for more information concerning debugging, please check my messages on #sig-windows on Kubernetes Slack). As a workaround, userspace proxy-mode is used.
- Flannel network (host-gw backend) is highly experimental. The following communication schemes have been validated:
- Pod-to-pod internal communication between Windows pods using IP.
- Pod-to-pod internal communication between Windows pods using DNS.
- Pod-to-pod internal communication between Windows and Linux pods using IP.
- Pod-to-pod internal communication between Windows and Linux pods using DNS.
- Pod-to-pod internal communication between Linux pods using IP.
- Pod-to-pod internal communication between Linux pods using DNS.
- Pod-to-service internal communication between Linux pod and Windows service using IP.
- Pod-to-service internal communication between Linux pod and Windows service using DNS.
- Pod-to-service internal communication between Windows pod and Linux service using IP.
- Pod-to-service internal communication between Windows pod and Linux service using DNS.
- External communication to NodePort service via Windows node.
- External communication to NodePort service via Linux node.
- Communication with external IPs (i.e. outbound NAT) from Linux pods.
- Communication with external IPs (i.e. outbound NAT) from Windows pods - this is the most significant issue. With the current Windows HNS and Hyper-V Virtual Switch it is not possible to achieve outbound NAT without losing pod-to-pod communication from Windows nodes.
- There are problems with automatic configuration of DNS in Windows pods (depends on Windows version). Some workarounds have been posted in this azure-acs-engine issue.
- It is not possible to use Ansible Remote provisioner with Ansible 2.5.0 and Packer 1.2.2 for Windows nodes due to the following exception:
ntlm: HTTPSConnectionPool(host='127.0.0.1', port=63008): Max retries exceeded with url: /wsman (Caused by SSLError(SSLError(1, u'[SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:590)'),))
Similar issues are present with Ubuntu templates:
SSH Error: data could not be sent to remote host \"127.0.0.1\". Make sure this host can be reached over ssh
Unfortunately I have not had time to investigate this issue yet, but the Packer provisioning process used to work on earlier versions of Ansible and Packer.
- The playbooks have also been tested with the latest Windows Server Insider Program builds, but this requires altering the playbooks so that the proper base container image for the Kubernetes infra container is downloaded (insider and non-insider container images are not compatible).
For a quickstart you can see: https://github.com/pablodav/kubernetes-for-windows-quickstart
Ansible only:
- Windows 10 Fall Creators update (1709) as Hyper-V host and Ansible master.
- Ubuntu for Windows (WSL) installed.
- Ansible 2.5.0+ installed on Ubuntu for Windows (pip installation recommended).
- Additional python packages installed for WinRM (follow Ansible Windows Setup Guide).
- Windows Server 1803 (March 2018) installed on Windows Kubernetes nodes.
- WinRM properly configured on Windows Kubernetes nodes.
- Ubuntu 16.04 LTS (Xenial) installed on Linux master and nodes.
+ Packer:
- Packer 1.2.2+ installed on Windows host (visible in PATH).
- Windows Server 1803 (March 2018) ISO downloaded.
- Ubuntu 16.04 LTS (Xenial) ISO downloaded.
- To enable the WinRM HTTPS listener on port 5986 using the Ansible ConfigureRemotingForAnsible.ps1 script, run the following in PowerShell:
$url = "https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1"
$file = "$env:temp\ConfigureRemotingForAnsible.ps1"
(New-Object -TypeName System.Net.WebClient).DownloadFile($url, $file)
powershell.exe -ExecutionPolicy ByPass -File $file
winrm enumerate winrm/config/Listener
- Install the Chocolatey package manager to avoid errors while running ansible-playbook:
Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
Windows has limited support for different pod networks, as mentioned in the official Kubernetes Windows guide. Flannel with host-gw (and vxlan) backends and the appropriate CNI plugins are currently experimental and are available as the following pull requests by rakelkar: flannel-io/flannel#921 and containernetworking/plugins#85. Unfortunately, these original pull requests had some minor issues which had to be fixed or worked around:
- When kube-proxy is running in userspace mode, it attaches multiple interfaces for services to the cbr0 interface. This confuses Flannel, which then fails to identify the cbr0 interface properly. This has been addressed by rakelkar/flannel#8.
- clusterNetworkPrefix and endpointMacPrefix JSON properties are not parsed properly by CNI plugins. Workaround is available as a part of rakelkar/plugins#5, but there is also a cleaner solution provided here: containernetworking/plugins#85 (comment)
- rakelkar/plugins#5 attempts at working around Outbound NAT creation problems in HNS. This part is troublesome, as it disables pod-to-pod communication on Windows nodes. This should probably be removed.
First, install all prerequisites from the previous paragraph. Ensure that a basic Ansible playbook is working properly when using Ubuntu for Windows (WSL) for both Windows and Linux nodes.
A sample inventory file has been provided in the Ansible playbook directory. Assuming that you would like to create a cluster with the following hosts:
- Master nodes: ubuntu01, ubuntu02
- Linux nodes: ubuntu02, ubuntu03
- Windows nodes: windows01, windows02
your inventory should be defined as:
# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
# node1 ansible_host=95.54.0.12 # ip=10.3.0.1
# node2 ansible_host=95.54.0.13 # ip=10.3.0.2
# node3 ansible_host=95.54.0.14 # ip=10.3.0.3
# node4 ansible_host=95.54.0.15 # ip=10.3.0.4
# node5 ansible_host=95.54.0.16 # ip=10.3.0.5
# node6 ansible_host=95.54.0.17 # ip=10.3.0.6
# This inventory is based on an integrated inventory with kubespray
# the kube-master, kube-node, etcd, kube-ingress, k8s-cluster groups come from kubespray
# https://github.com/kubernetes-incubator/kubespray/blob/master/docs/integration.md
# https://github.com/kubernetes-incubator/kubespray/blob/master/inventory/sample/hosts.ini
[Location1]
ubuntu01 ansible_host=10.3.0.1
ubuntu02 ansible_host=10.3.0.2
ubuntu03 ansible_host=10.3.0.3
# ## configure a bastion host if your nodes are not directly reachable
# bastion ansible_host=x.x.x.x ansible_user=some_user
[kube-master]
ubuntu01
ubuntu02
[etcd]
ubuntu01
ubuntu02
ubuntu03
[kube-node]
ubuntu02
ubuntu03
[kube-ingress]
ubuntu02
ubuntu03
[k8s-cluster:children]
kube-master
kube-node
kube-ingress
# Add group of master-ubuntu for kubernetes-for-windows project
[master-ubuntu:children]
kube-master
[node-windows]
windows01 kubernetes_node_hostname=windows01 ansible_host=10.3.0.4
windows02 kubernetes_node_hostname=windows02 ansible_host=10.3.0.5
[node:children]
node-windows
# This k8s-cluster-local is a group added to add all variables inside that group
# all variables for kubespray and for kubernetes-for-windows projects in your group_vars inventory
[k8s-cluster-local:children]
k8s-cluster
node-windows
[all-ubuntu:children]
master-ubuntu
The host variable kubernetes_node_hostname will be used as the Windows hostname and, at the same time, to identify the node in Kubernetes.
Any other variables that you wish to configure, for example the cluster/service pod CIDRs, are available in group_vars. You must copy these vars to your own group_vars directory.
You will find more vars for the kubespray roles in k8s-cluster-local. The vars added there already enable Flannel and are integrated with the vars for kubernetes-for-windows.
We are using the steps from the kubespray integration guide.
First you must have an Ansible git repo. If you do not have one, initialize it:
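Before running any playbook it can help to confirm which hosts actually ended up in a given inventory group. The following is only a minimal sketch: hosts_in_group is a hypothetical helper name, it lists direct members only (it does not expand :children groups), and ansible-inventory remains the authoritative way to inspect an inventory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the host names declared directly under one
# group of an INI inventory. [group:children] sections are NOT expanded.
hosts_in_group() {
  local inventory="$1" group="$2"
  awk -v section="[$group]" '
    /^[ \t]*(#|$)/ { next }                      # skip comments and blank lines
    /^\[/ { in_group = ($0 == section); next }   # remember which section we are in
    in_group { print $1 }                        # first field is the host name
  ' "$inventory"
}

# Example call (path taken from the sample layout in this README):
#   hosts_in_group inventory/kubernetes.ini node-windows
```

With the sample inventory above, the node-windows group would be expected to list windows01 and windows02.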
git init
Then copy these files to your inventory:
├── ansible.cfg # It will have the roles_path and library path you can read and add to your own ansible.cfg file
├── inventory # For all files in inventory you have a sample in this project
│ ├── kubernetes.ini
│ ├── group_vars
│ │ ├── k8s-cluster-local # Directory with vars for group k8s-cluster-local
│ │ │ ├── all-k8s.yml
│ │ │ ├── k8s-cluster.yml
│ │ │ └── k8s-win.yml
├── roles.kubernetes.yml # It will have all kubespray and kubernetes-for-windows import playbooks
Also create directory:
mkdir -p roles/3d
Then add as submodules kubespray and kubernetes-for-windows to use these roles:
git submodule add https://github.com/kubernetes-incubator/kubespray.git roles/3d/kubespray
git submodule add https://github.com/pablodav/kubernetes-for-windows roles/3d/kubernetes-for-windows # hope in future we can agree with main author of it to use his repository https://github.com/ptylenda/kubernetes-for-windows
You should now have:
ls -1 roles/3d/
kubernetes-for-windows
kubespray
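The same check can be scripted so that a missing submodule is caught before a playbook run. This is a minimal sketch, assuming the roles/3d layout shown above; check_roles is a hypothetical helper name and not part of either project.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that both role submodules are checked out.
# Returns 0 when both directories exist, 1 otherwise.
check_roles() {
  local base="${1:-roles/3d}" missing=0 role
  for role in kubespray kubernetes-for-windows; do
    if [ ! -d "$base/$role" ]; then
      echo "missing: $base/$role" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call from the root of your Ansible repository:
#   check_roles && echo "roles are in place"
```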
Now you have all the roles and variables ready, just install the requirements from kubespray:
sudo pip install -r roles/3d/kubespray/requirements.txt
In order to install basic Kubernetes packages on Windows and Linux nodes, run the roles.kubernetes.yml playbook:
ansible-playbook roles.kubernetes.yml -i inventory/kubernetes.ini -b -v
You can also install only the Linux nodes with kubespray:
ansible-playbook roles.kubernetes.yml -i inventory/kubernetes.ini --tags role::kubespray -b -vvv
And then the Windows nodes:
ansible-playbook roles.kubernetes.yml -i inventory/kubernetes.ini --tags role::kubernetes-for-windows -b -vvv
You will notice that the role with the tag role::kubernetes-for-windows will patch kube-proxy and kube-flannel:
kubectl get ds -n kube-system
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-flannel 3 3 3 3 3 beta.kubernetes.io/os=linux 34m
kube-proxy 3 3 3 3 3 beta.kubernetes.io/os=linux 8d
As a result, these daemonsets will not be deployed on Windows nodes.
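For illustration, a node selector like the one in the NODE SELECTOR column can be expressed as a strategic merge patch. This is a hedged sketch, not necessarily how the role applies it: node_selector_patch is a hypothetical helper, and the kubectl call in the comment is an assumption about one way to apply the result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a patch that pins a DaemonSet's pods to nodes
# whose beta.kubernetes.io/os label matches the given value.
node_selector_patch() {
  local os="$1"
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"beta.kubernetes.io/os":"%s"}}}}}' "$os"
}

# Possible use against a live cluster (assumption, not taken from the role):
#   kubectl -n kube-system patch ds kube-proxy --patch "$(node_selector_patch linux)"
```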
The playbook consists of the following stages:
- Installation of common modules on Windows and Linux. This includes various packages required by Kubernetes and recommended configuration, setting up proxy variables, updating the OS and changing the hostname. (On Linux, this is installed by the kubespray modules.)
- Docker installation. On Windows, at this stage, appropriate Docker images are being pulled and tagged for compatibility reasons, as mentioned in official Microsoft guide.
- CNI plugins installation. On Windows, custom plugins are downloaded based on this repository which is a fork of containernetworking/plugins#85 by rakelkar.
- Kubernetes packages installation (currently 1.10). On Windows, there are also NSSM services created for kubelet and kube-proxy (in 1.10 it should be possible to create the services natively, however it has not been tested here)
- On Windows, custom Flannel is installed based on this repository which is a fork of flannel-io/flannel#921 by rakelkar
Filtering by the init tag omits installation of Kubernetes packages which have already been installed in Step 3.
Installation consists of the following stages:
- All installation is performed with kubespray; the kubeadm vars are selected in our sample.
- Kube config is copied to current user's HOME.
- RBAC role and role binding for standalone kube-proxy service is applied. It is required for Windows, which does not host kube-proxy as Kubernetes pod, i.e. it is hosted as a traditional system service.
- An additional node selector for kube-proxy is applied. Node selector ensures that kube-proxy daemonset is only deployed to Linux nodes. On Windows it is not supported yet, hence the standalone system service.
- The Flannel network node selector beta.kubernetes.io/os: linux has been added in order to prevent Flannel from being deployed on Windows nodes, where it is handled independently.
- all done with kubespray
- Node joins cluster using kubeadm and token generated on master. Works exactly the same as for Linux nodes.
- Old Flannel routes/networks are deleted, if present, and HNS endpoints/policies are deleted. All of these operations are performed by the kube-cleanup.ps1 script.
- Flannel, kubelet and kube-proxy are started (and restarted) in a specific sequence. Flannel is not fully operational during the first execution and it requires restarting (probably a bug).
At this point the cluster is fully functional and you can proceed with deploying an example Windows service.
The playbook is idempotent, so you can add more nodes to an existing cluster just by extending inventory and rerunning the playbook.
win-webserver.yml contains an example Deployment and Service based on Microsoft guide. In order to deploy the web service execute:
kubectl apply -f win-webserver.yaml
This will also deploy a Kubernetes NodePort Service which can be accessed on every Kubernetes node, both Windows and Linux.
If you wish to tear down the cluster, without destroying the master node, execute the following playbook:
ansible-playbook -i inventory reset-kubeadm-cluster.yml
This will reset only the Windows nodes. To reset the Linux cluster, follow the kubespray steps (Remove nodes).
Note that this playbook still needs review after the kubespray integration, so it does not work as it did before.
For easier usage of Packer with the Ansible Remote provisioner on Windows, an additional wrapper script has been provided: packer-ansible-windows.ps1. Basically it adds Ansible wrappers to PATH and ensures that proxy settings are configured properly.
ISO files are expected to be loaded from the ./iso subdirectory in the Packer build space. For Ubuntu it is also possible to download the image automatically from the official http server. For Windows you have to provide your own copy of the Windows Server 1803 ISO.
To build Windows node on Hyper-V:
$ .\packer-ansible-windows.ps1 build --only=hyperv-iso .\kubernetes-node-windows1803-march2018.json
Default user: Administrator
Default password: password
(configurable in template variables AND in http\Autounattend.xml file, which does not support templating)
To build Ubuntu node/master on Hyper-V:
$ .\packer-ansible-windows.ps1 build --only=hyperv-iso .\kubernetes-node-ubuntu1604.json
Default user: ubuntu
Default password: ubuntu
(configurable in template variables)
This template has been created in order to resolve problems with provisioning Ubuntu Server 16.04 behind a proxy. Keep in mind that:
- If you need to configure the apt-get proxy from the Packer template, you cannot use choose-mirror-bin mirror/http/proxy string addr, because it is not possible to customize addr in this case.
- Using choose-mirror-bin mirror/http/proxy string addr in preseed.cfg has a different impact compared to using mirror/http/proxy=addr from boot parameters. The latter also affects downloading preseed.cfg from the http server (seems like a debconf bug).
- Downloading preseed.cfg from preseed/url is sensitive to proxy settings inherited from mirror/http/proxy (which seems contrary to the description of this parameter). Fortunately I have discovered that setting the no_proxy={{ .HTTPIP }} environment variable from boot parameters is enough to force no proxy for wget in order to communicate with the Packer http server.
- There is a limit on the length of Boot Options that can be used when installing Ubuntu using QEMU. This means that there may not be enough room to type all commands connected with keyboard settings when using a proxy, but you can use auto-install/enable=true and feed them from preseed.cfg.
- For Hyper-V, if you would like to use Gen. 2 machines, you can't use floppies (https://technet.microsoft.com/en-us/library/dn282285(v=ws.11).aspx), therefore you have to stick to the preseed/url method for providing preseed.cfg.
- For Hyper-V, it is important to run the following directly in preseed.cfg, BEFORE any provisioner runs:
d-i preseed/late_command string in-target apt-get install -y --install-recommends linux-virtual-lts-xenial linux-tools-virtual-lts-xenial linux-cloud-tools-virtual-lts-xenial;
These packages are needed in order to discover the IP address of the VM properly so that Packer can connect via SSH. Otherwise it will wait for the IP address forever. More details can be found in "Notes" in https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/supported-ubuntu-virtual-machines-on-hyper-v
- For shell provisioners and propagation of proxy settings, use:
"environment_vars": [
"FTP_PROXY={{ user `ftp_proxy` }}",
"HTTPS_PROXY={{ user `https_proxy` }}",
"HTTP_PROXY={{ user `http_proxy` }}",
"NO_PROXY={{ user `no_proxy` }}",
"ftp_proxy={{ user `ftp_proxy` }}",
"http_proxy={{ user `http_proxy` }}",
"https_proxy={{ user `https_proxy` }}",
"no_proxy={{ user `no_proxy` }}"
]
- For ansible-local provisioner use:
"extra_arguments": [
"--extra-vars",
"{'\"http_proxy\":\"{{ user `http_proxy` }}\", \"https_proxy\":\"{{ user `https_proxy` }}\", \"no_proxy\":\"{{ user `no_proxy` }}\", \"ftp_proxy\":\"{{ user `ftp_proxy` }}\"}'"
]
Then handle these variables appropriately in playbook, set environment variables, etc.
- In case of ansible-local there are problems when specifying inventory_groups: even though connection type passed to ansible is "local", it gets ignored and regular SSH connection is used. This causes problems due to unauthorized key for passwordless login to localhost. As a workaround you have to specify inventory_file with ansible_connection specified explicitly, for example:
[linux]
127.0.0.1 ansible_connection=local
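The shell provisioner settings above pass each proxy variable in both upper and lower case, presumably because some tools read only one of the two spellings. Inside a provisioning script the same idea can be sketched as follows; export_proxy_vars is a hypothetical helper name and the default no_proxy list is an assumption chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: export proxy settings in both spellings.
#   $1  proxy URL for http
#   $2  proxy URL for https (defaults to $1)
#   $3  comma-separated no_proxy list (default below is an assumption)
export_proxy_vars() {
  local http="$1" https="${2:-$1}" no_proxy_list="${3:-localhost,127.0.0.1}"
  export http_proxy="$http" HTTP_PROXY="$http"
  export https_proxy="$https" HTTPS_PROXY="$https"
  export no_proxy="$no_proxy_list" NO_PROXY="$no_proxy_list"
}

# Example call (the proxy address is a placeholder):
#   export_proxy_vars "http://proxy:port"
```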
Exclude the Docker path and the Kubernetes-related executables from Windows Defender:
Add-MpPreference -ExclusionPath C:\ProgramData\docker\
Add-MpPreference -ExclusionProcess "dockerd.exe", "flanneld.exe", "kube-proxy.exe", "kubelet.exe"
Or exclude the Docker path in your antivirus.
In case of doubt about Windows Defender, disable it temporarily:
Set-MpPreference -DisableRealtimeMonitoring $true
Service start order matters. In some tests I (pablodav) have confirmed that the following order is required to get all network devices and IP addresses created during start:
- docker
- kubelet
- kube-proxy
- flanneld
For that reason I have added serialized dependencies to the NSSM service configuration in the tasks.
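To make the idea of serialized dependencies concrete, the sketch below prints one NSSM command per service so that each service depends on the one before it. It is an illustration only: nssm_dependency_commands is a hypothetical helper, the service names are assumed to match the list above, and the actual Ansible tasks may configure NSSM differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given services in the required start order, print
# "nssm set <service> DependOnService <previous>" for every service after
# the first, forming a serialized dependency chain.
nssm_dependency_commands() {
  local previous="" service
  for service in "$@"; do
    if [ -n "$previous" ]; then
      echo "nssm set $service DependOnService $previous"
    fi
    previous="$service"
  done
}

# Example call using the order listed above:
#   nssm_dependency_commands docker kubelet kube-proxy flanneld
```

The printed commands are meant to be reviewed and then run on the Windows node; the first service in the list gets no dependency.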
If the Docker install step fails when using DockerMsftProvider, check which Docker versions are available:
Find-Package -ProviderName DockerMsftProvider -AllVersions
Then set the variable win_docker_version to the correct one, for example:
win_docker_version: "17.06.2-ee-16"
Basic proxy usage was added for the Docker install, but it does not support authentication. It only uses:
win_choco_proxy_url: "http://proxy:port"
If something fails when not using a proxy, try to empty this var or add it as a bool.
Use this model for on-premises: