Feature azure #118

Open: wants to merge 69 commits into base: 6
Commits
9104911
fix nginx playbook version to stay compatible with system ansible
hmeiland Oct 27, 2021
bbe0071
workaround gpg check
hmeiland Nov 15, 2021
dda92be
add mpi for azure
hmeiland Nov 15, 2021
f4b096b
chmod /mnt/shared, since default anf will have mode 700
hmeiland Nov 16, 2021
f4b9e7b
add citc_cloud for azure
hmeiland Nov 16, 2021
48f9380
add azure var
hmeiland Nov 16, 2021
4c155f5
tune elastic
hmeiland Nov 24, 2021
a8205d8
adding citc_azure
hmeiland Nov 24, 2021
4ca54bf
azure python tools
hmeiland Nov 25, 2021
aa768fa
azure python tools
hmeiland Nov 25, 2021
e6970fa
citc_azure
hmeiland Nov 25, 2021
eb48e81
citc_azure
hmeiland Nov 25, 2021
3637552
citc_azure
hmeiland Nov 25, 2021
9add1b1
citc_azure
hmeiland Nov 25, 2021
0be6764
citc_azure
hmeiland Nov 25, 2021
f7850df
citc_azure
hmeiland Nov 25, 2021
2471d1c
citc_azure
hmeiland Nov 26, 2021
e0e5c08
citc_azure
hmeiland Nov 26, 2021
2548611
update_config and citc_azure
hmeiland Nov 26, 2021
f7199d5
add user_data
hmeiland Nov 26, 2021
4a47e34
add user_data
hmeiland Nov 29, 2021
4d471ca
add user_data
hmeiland Nov 29, 2021
e587258
add user_data
hmeiland Nov 29, 2021
f6ec63e
add user_data
hmeiland Nov 29, 2021
4d41c59
add user_data
hmeiland Nov 29, 2021
367d5fe
add user_data
hmeiland Nov 29, 2021
0a6e767
add user_data
hmeiland Nov 29, 2021
4ef2dbf
add user_data
hmeiland Nov 29, 2021
4675ab6
packer
hmeiland Nov 29, 2021
7a2c352
packer
hmeiland Nov 29, 2021
f43c5e0
packer
hmeiland Nov 29, 2021
1bc5f6e
packer
hmeiland Nov 29, 2021
abe1cab
dns zone compute
hmeiland Nov 29, 2021
22bd6f8
packer
hmeiland Nov 30, 2021
4686f25
packer
hmeiland Nov 30, 2021
99076ed
packer
hmeiland Nov 30, 2021
4c3d792
packer
hmeiland Nov 30, 2021
cc304e8
packer
hmeiland Nov 30, 2021
393793a
packer
hmeiland Nov 30, 2021
dac6dfd
packer
hmeiland Nov 30, 2021
2eda4d1
autoscaling
hmeiland Dec 2, 2021
994a0aa
packer
hmeiland Dec 2, 2021
3b4a386
dns_zone
hmeiland Dec 3, 2021
d663eb9
fixing boot
hmeiland Dec 8, 2021
1d2a976
fixing boot
hmeiland Dec 8, 2021
ea9c739
azure
hmeiland Dec 8, 2021
fa396c3
anf
hmeiland Dec 8, 2021
d4bba71
packer
hmeiland Dec 8, 2021
4aef1b2
azure dns
hmeiland Dec 8, 2021
e23bc1b
azure dns
hmeiland Dec 8, 2021
9016609
azure dns
hmeiland Dec 8, 2021
96c5696
azure dns
hmeiland Dec 8, 2021
a6a726a
azure dns
hmeiland Dec 8, 2021
b190d9d
azure dns
hmeiland Dec 9, 2021
986e573
azure dns
hmeiland Dec 9, 2021
bcc1b61
azure dns
hmeiland Dec 9, 2021
1510b29
bootstrap and custom data
hmeiland Dec 10, 2021
a97bd9b
adding node delete
hmeiland Dec 10, 2021
0812dea
stop node
hmeiland Dec 10, 2021
4788f3e
stop node
hmeiland Dec 10, 2021
7710ba9
create vm based on shape
hmeiland Jan 4, 2022
814d8a7
create vm based on shape
hmeiland Jan 4, 2022
4c66c40
add avset
hmeiland Jan 4, 2022
4a218c2
add avset
hmeiland Jan 5, 2022
659a71f
add cvmfs
hmeiland Jan 6, 2022
66ae716
tuning cvmfs
hmeiland Jan 6, 2022
0ba08fe
Merge branch '6' into feature-azure
milliams Jan 13, 2022
ea3e906
enable rdma
hmeiland Jan 17, 2022
d0b59f1
Merge branch 'feature-azure' of github.com:hmeiland/ansible into feat…
hmeiland Jan 17, 2022
2 changes: 1 addition & 1 deletion compute.yml
@@ -14,7 +14,7 @@
- citc_user
- filesystem
- ssh
#- security_updates
##- security_updates
- ntp
- sssd
- lmod
3 changes: 3 additions & 0 deletions group_vars/compute.yml
@@ -14,6 +14,9 @@ mpi_packages:
google:
- mpich
- openmpi
azure:
- mpich
- openmpi
aws: []

monitoring_role: client
5 changes: 5 additions & 0 deletions group_vars/management.yml
@@ -12,6 +12,8 @@ slurm_role: mgmt
slurm_elastic:
oracle:
config_directory: /home/slurm/.oci/
azure:
config_directory: /home/slurm/.oci/

install_packages:
- xorg-x11-xauth
@@ -27,6 +29,9 @@ mpi_packages:
google:
- openmpi-devel
- mpich-devel
azure:
- openmpi-devel
- mpich-devel
aws: []

monitoring_role: master
13 changes: 12 additions & 1 deletion roles/filesystem/tasks/main.yml
@@ -26,7 +26,18 @@
opts: defaults,nofail,nosuid
state: mounted
when:
ansible_local.citc.csp != "aws"
- ansible_local.citc.csp != "aws"
- ansible_local.citc.csp != "azure"

- name: Mount shared file system now that fileserver is ready
mount:
path: /mnt/{{ filesystem_mount_point }}
src: "{{ filesystem_target_address }}:{{ filesystem_mount_point }}"
fstype: nfs
opts: rw,hard,rsize=1048576,wsize=1048576,vers=3,tcp,_netdev,noauto
state: mounted
when:
- ansible_local.citc.csp == "azure"

- name: Mount shared file system
mount:
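
Review note: for reference, the Azure task above amounts to an NFSv3 entry in `/etc/fstab`. The script below is only a sketch that prints what such an entry could look like. The address `10.0.0.4` and mount point `/shared` are made-up example values standing in for `filesystem_target_address` and `filesystem_mount_point`, `fstab_line` is a hypothetical helper, and the trailing `0 0` assumes the mount module's default dump and passno fields.

```shell
#!/bin/sh
# Sketch only: print an fstab-style line for the Azure NFS mount.
# The option string is copied from the task; address and mount point are example values.

NFS_OPTS="rw,hard,rsize=1048576,wsize=1048576,vers=3,tcp,_netdev,noauto"

fstab_line() {
    # $1 = filesystem_target_address, $2 = filesystem_mount_point
    # Mirrors the templates: src is "<address>:<mount point>", path is "/mnt/<mount point>".
    printf '%s:%s /mnt/%s nfs %s 0 0\n' "$1" "$2" "$2" "$NFS_OPTS"
}

fstab_line "10.0.0.4" "/shared"
```

With a mount point that starts with a slash, the path renders as `/mnt//shared`, which resolves to `/mnt/shared`. Because of `noauto`, the entry is not picked up by `mount -a`, so the file system is mounted when the task itself runs.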
6 changes: 6 additions & 0 deletions roles/finalise/tasks/main.yml
@@ -9,6 +9,12 @@
delay: 10
tags: packer

- name: update directory mode for the finalised files
file:
path: /mnt/shared
state: directory
mode: 0755

- name: create directory for the finalised files
file:
path: /mnt/shared/finalised
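
Review note: the commit message for this change says a default ANF volume comes up with mode 700, which is why `/mnt/shared` is switched to 0755 before anything is written below it. The sketch below reproduces the effect on a scratch directory rather than on the real mount; the directory and variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: show the permission change on a throwaway directory.
demo_dir=$(mktemp -d)

chmod 700 "$demo_dir"                       # mode the commit message attributes to a default ANF volume
before=$(ls -ld "$demo_dir" | cut -c1-10)   # keep only the type and permission characters

chmod 0755 "$demo_dir"                      # what the new task applies to /mnt/shared
after=$(ls -ld "$demo_dir" | cut -c1-10)

echo "before: $before"
echo "after:  $after"
rmdir "$demo_dir"
```

With mode 700 only the owner can enter the directory; 0755 lets other users traverse and list it, which is what non-root users on the compute nodes would need to reach `/mnt/shared/finalised`.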
11 changes: 8 additions & 3 deletions roles/monitoring/tasks/main.yml
@@ -20,11 +20,16 @@
baseurl: https://repos.influxdata.com/centos/$releasever/{{ "arm64" if ansible_architecture == "aarch64" else ansible_architecture }}/stable/
gpgkey: https://repos.influxdata.com/influxdb.key

#- name: install telegraf package
# package:
# name: telegraf
# state: present
# notify: restart telegraf

- name: install telegraf package
package:
yum:
name: telegraf
state: present
notify: restart telegraf
disable_gpg_check: yes

- name: enable the telegraf service
service:
32 changes: 32 additions & 0 deletions roles/packer/files/all.pkr.hcl
@@ -10,6 +10,13 @@ variable "aws_region" {}
variable "aws_instance_type" {}
variable "aws_arch" {}

variable "azure_region" {}
variable "azure_instance_type" {}
variable "azure_resource_group" {}
variable "azure_virtual_network" {}
variable "azure_virtual_network_subnet" {}
variable "azure_dns_zone" {}

variable "oracle_availability_domain" {}
variable "oracle_base_image_ocid" {}
variable "oracle_compartment_ocid" {}
@@ -91,6 +98,22 @@ source "amazon-ebs" "aws" {
}
}

source "azure-arm" "azure" {
managed_image_name = "${var.destination_image_name}-${var.cluster}-v{{timestamp}}"
managed_image_resource_group_name = var.azure_resource_group
build_resource_group_name = var.azure_resource_group
virtual_network_name = var.azure_virtual_network
virtual_network_subnet_name = var.azure_virtual_network_subnet
virtual_network_resource_group_name = var.azure_resource_group
vm_size = var.azure_instance_type
ssh_username = var.ssh_username
os_type = "Linux"
image_publisher = "OpenLogic"
image_offer = "CentOS"
image_sku = "8_4-gen2"
}


source "oracle-oci" "oracle" {
image_name = "${var.destination_image_name}-${var.cluster}-v{{timestamp}}"
availability_domain = var.oracle_availability_domain
@@ -111,6 +134,7 @@ build {
"source.googlecompute.google",
"source.amazon-ebs.aws",
"source.oracle-oci.oracle",
"source.azure-arm.azure",
]

provisioner "file" {
@@ -161,4 +185,12 @@ build {
provisioner "shell" {
script = "/home/citc/compute_image_extra.sh"
}

provisioner "shell" {
script = "/home/citc/install_cvmfs_eessi.sh"
}

provisioner "shell" {
script = "/home/citc/compute_image_finalize.sh"
}
}
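
Review note: the new `azure-arm` source names its image `${var.destination_image_name}-${var.cluster}-v{{timestamp}}`, the same pattern the `oracle-oci` source uses, and Packer's `{{timestamp}}` expands to the current Unix time. The sketch below shows how the final name comes together. `image_name` is a hypothetical helper and `example-cluster` a made-up cluster ID; `citc-slurm-compute` is the value set in `variables.pkrvars.hcl.j2`.

```shell
#!/bin/sh
# Sketch: compose the managed image name the way the Packer template does.
image_name() {
    # $1 = destination_image_name, $2 = cluster ID, $3 = Unix timestamp (defaults to now)
    ts="${3:-$(date +%s)}"
    printf '%s-%s-v%s\n' "$1" "$2" "$ts"
}

# Fixed timestamp so the result is reproducible.
image_name "citc-slurm-compute" "example-cluster" 1638316800
```

Each build therefore produces a differently named image in the resource group, so older images accumulate unless they are cleaned up separately.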
2 changes: 1 addition & 1 deletion roles/packer/files/compute_image_extra.sh
@@ -8,4 +8,4 @@
# sudo yum -y install cmake gcc-gfortran

# to install CernVM-FS and configure access to EESSI, uncomment the line below:
# /home/citc/install_cvmfs_eessi.sh
#/home/citc/install_cvmfs_eessi.sh
3 changes: 3 additions & 0 deletions roles/packer/files/compute_image_finalize.sh
@@ -0,0 +1,3 @@
#! /bin/bash

/usr/sbin/waagent -force -deprovision && export HISTSIZE=0 && sync
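
Review note: as far as this diff shows, the `build` block runs `compute_image_finalize.sh` for every entry in `sources`, while `waagent` is the Azure Linux agent and is presumably absent on the other providers' images. The sketch below is one possible guard, not the PR's implementation. `finalize_image` is a hypothetical function name, and the optional path argument exists only so the check can be exercised without touching a real agent.

```shell
#!/bin/bash
# Sketch: deprovision only when the Azure agent is actually installed.
finalize_image() {
    # $1 = path to the agent binary (optional, defaults to the path used in the PR)
    agent="${1:-/usr/sbin/waagent}"
    if [ -x "$agent" ]; then
        "$agent" -force -deprovision && export HISTSIZE=0 && sync
    else
        echo "skip: $agent not present, nothing to deprovision"
    fi
}
```

Calling `finalize_image` with no arguments at the end of the script would keep the Azure behaviour and turn the step into a no-op elsewhere. Packer provisioners also accept an `only` list that restricts them to particular sources, which would be an alternative place to express the same condition.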
8 changes: 8 additions & 0 deletions roles/packer/tasks/main.yml
@@ -65,6 +65,14 @@
group: citc
mode: u=rw,g=rw,o=

- name: copy in packer finalize run script template
copy:
src: compute_image_finalize.sh
dest: /home/citc/compute_image_finalize.sh
owner: citc
group: citc
mode: u=rw,g=rw,o=

- name: copy in EESSI install script
copy:
src: install_cvmfs_eessi.sh
12 changes: 11 additions & 1 deletion roles/packer/templates/prepare_ansible.sh.j2
@@ -9,14 +9,24 @@ $(hostname)
cluster_id={{ startnode_config.cluster_id }}
packer_run=yes
EOF'
{% if ansible_local.citc.csp in ["aws", "google"] %}
{% if ansible_local.citc.csp in ["aws", "google", "azure"] %}
sudo yum install -y epel-release
sudo dnf config-manager --set-enabled powertools
{% elif ansible_local.citc.csp == "oracle" %}
sudo dnf install -y oracle-epel-release-el8
sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf config-manager --set-enabled ol8_codeready_builder
{% endif %}
{% if ansible_local.citc.csp in ["azure"] %}
echo "[main]" | sudo tee -a /etc/NetworkManager/conf.d/dhclient.conf
echo "dhcp=dhclient" | sudo tee -a /etc/NetworkManager/conf.d/dhclient.conf
cat /etc/NetworkManager/conf.d/dhclient.conf
echo "append domain-name \" {{ startnode_config.dns_zone }}\";" | sudo tee -a /etc/dhcp/dhclient.conf
echo "append domain-search \" {{ startnode_config.dns_zone }}\";" | sudo tee -a /etc/dhcp/dhclient.conf
sudo cat /etc/dhcp/dhclient.conf
sudo systemctl restart NetworkManager
cat /etc/resolv.conf
{% endif %}
sudo dnf install -y ansible git
sudo cat /tmp/hosts
sudo mkdir -p /etc/ansible/facts.d/
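
Review note: the Azure branch above uses `tee -a`, so each run appends the `[main]`, `dhcp=dhclient` and `append domain-*` lines again. On a fresh Packer build VM that is harmless, but should the script ever be re-run on the same machine the entries would be duplicated. The sketch below is one way to make the appends idempotent. `append_once` is a hypothetical helper, `example.internal` is a made-up stand-in for `startnode_config.dns_zone`, and the demonstration writes to a scratch file instead of `/etc/dhcp/dhclient.conf`, which would need root.

```shell
#!/bin/sh
# Sketch: append a line to a file only if that exact line is not already present.
append_once() {
    # $1 = line to add, $2 = target file
    grep -qxF -e "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstration on a scratch file.
conf=$(mktemp)
zone="example.internal"   # hypothetical DNS zone
for pass in first second; do   # the second pass simulates running the script again
    append_once "append domain-name \" ${zone}\";" "$conf"
    append_once "append domain-search \" ${zone}\";" "$conf"
done
cat "$conf"
rm -f "$conf"
```

The leading space inside the quoted value is kept exactly as in the template. After both passes the scratch file still holds just the two `append` lines.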
9 changes: 8 additions & 1 deletion roles/packer/templates/variables.pkrvars.hcl.j2
@@ -1,12 +1,19 @@
ca_cert = "{{ ca_cert }}"
cluster = "{{ startnode_config.cluster_id }}"
destination_image_name = "citc-slurm-compute"
ssh_username = "{%- if ansible_local.citc.csp in ["aws", "google"] -%}centos{%- else -%}opc{%- endif -%}"
ssh_username = "{%- if ansible_local.citc.csp in ["aws", "google", "azure"] -%}centos{%- else -%}opc{%- endif -%}"

aws_arch = "x86_64"
aws_region = "{%- if startnode_config.region is defined -%}{{ startnode_config.region }}{%- endif -%}"
aws_instance_type = "t2.nano"

azure_region = "{%- if startnode_config.region is defined -%}{{ startnode_config.region }}{%- endif -%}"
azure_resource_group = "{%- if startnode_config.resource_group is defined -%}{{ startnode_config.resource_group }}{%- endif -%}"
azure_virtual_network = "{%- if startnode_config.virtual_network is defined -%}{{ startnode_config.virtual_network }}{%- endif -%}"
azure_virtual_network_subnet = "{%- if startnode_config.virtual_network_subnet is defined -%}{{ startnode_config.virtual_network_subnet }}{%- endif -%}"
azure_dns_zone = "{%- if startnode_config.dns_zone is defined -%}{{ startnode_config.dns_zone }}{%- endif -%}"
azure_instance_type = "Standard_D4s_v3"

google_destination_image_family = "citc-slurm-compute"
google_network = "{%- if startnode_config.network_name is defined -%}{{ startnode_config.network_name }}{%- endif -%}"
google_source_image_family = "centos-8"