2024.1: 2023.1 merge #1385

Merged
merged 41 commits into from
Nov 25, 2024
Commits
3f33b72
Add opensearch logs to the openstack dashboard
technowhizz Mar 15, 2024
26f6c7e
Change metric to retrieve hostname
technowhizz Apr 23, 2024
2d6f175
Add release note
technowhizz Apr 23, 2024
ba3fc8a
Merge branch 'stackhpc/2023.1' into logs_in_grafana
dougszumski Jul 10, 2024
b5c37be
Merge pull request #1317 from stackhpc/merge-yoga-zed
priteau Oct 10, 2024
e036472
Update package update testing instructions (#830)
seunghun1ee Oct 11, 2024
19207e1
Configure IPA with useful inspection settings
priteau Sep 17, 2024
67f0a69
Fix upgrade-prerequisites
Alex-Welsh Oct 16, 2024
4cd6277
Add conditional choice of kolla/globals.yml path
seunghun1ee Nov 1, 2024
de697bd
Redfish exporter: Fixes scrape group
jovial Nov 1, 2024
6f7051d
Merge pull request #1357 from stackhpc/2023.1/fix-scrape-group
Alex-Welsh Nov 4, 2024
ecc15f5
Fix Rabbit upgrade script conditional evaluation
Alex-Welsh Nov 4, 2024
363c1c8
Redfish exporter: Decrease sensitivity of alert (#1358)
jovial Nov 4, 2024
d229d41
fix!: manage the `physical` interface in `ci-aio`
jackhodgkiss Nov 4, 2024
9942b12
Merge pull request #1361 from stackhpc/fix-ci-aio-networking
jackhodgkiss Nov 5, 2024
f4f8899
fix!: manage the `physical` interface in `ci-aio`
jackhodgkiss Nov 4, 2024
77dce90
fix: test if `admin-openrc.sh` exists before deploying os-capacity
jackhodgkiss Oct 25, 2024
b19fd88
Merge pull request #1364 from stackhpc/zed/fix-ci-aio-networking
priteau Nov 6, 2024
6682a6f
fix!: manage the `physical` interface in `ci-aio`
jackhodgkiss Nov 4, 2024
f0e1cf5
CIS: Stop recursively setting permissions on logs files
jovial Nov 6, 2024
294278b
Merge pull request #1363 from stackhpc/2023.1/fix-ci-aio-networking
jackhodgkiss Nov 6, 2024
bc050c1
Merge branch 'stackhpc/2023.1' into bugfix/cis-log-perms
Alex-Welsh Nov 6, 2024
e8c03a2
Merge pull request #1345 from stackhpc/2023.1-os-capacity-no-admin
Alex-Welsh Nov 6, 2024
69115f7
Merge pull request #1366 from stackhpc/bugfix/cis-log-perms
Alex-Welsh Nov 7, 2024
51b3925
Merge pull request #713 from stackhpc/logs_in_grafana
technowhizz Nov 8, 2024
a3c08c0
Add upgrade host configure warning for ceph nodes
Alex-Welsh Nov 8, 2024
1bdc619
Update Blazar image (#1334)
assumptionsandg Nov 12, 2024
f945151
Merge pull request #186 from stackhpc/ipa-extra-hardware
Alex-Welsh Nov 12, 2024
548a236
Merge pull request #1335 from stackhpc/upgrade-prerequisites
Alex-Welsh Nov 13, 2024
f21cbf5
Pin os-capacity to v0.5 release (#1365)
JohnGarbutt Nov 13, 2024
6293f7f
Merge pull request #1370 from stackhpc/host-cofnig-wwarning
Alex-Welsh Nov 14, 2024
b9fde8b
docs: fix link to release train page
priteau Nov 20, 2024
5e460b6
Merge pull request #1373 from stackhpc/doc-fix
priteau Nov 21, 2024
6975cb1
Bump kayobe-automation
priteau Nov 22, 2024
979d9a0
Merge pull request #1381 from stackhpc/bump-kayobe-automation
priteau Nov 22, 2024
aec8f27
Merge stackhpc/yoga into stackhpc/zed
priteau Nov 22, 2024
89ed2e8
Merge pull request #1383 from stackhpc/zed-yoga-merge
priteau Nov 22, 2024
26a51aa
Merge stackhpc/zed into stackhpc/2023.1
priteau Nov 22, 2024
4cbf0b2
Merge pull request #1384 from stackhpc/2023.1-zed-merge
priteau Nov 22, 2024
9de90d0
Merge stackhpc/2023.1 into stackhpc/2024.1
priteau Nov 25, 2024
94c132c
Rebuild Blazar images for 2024.1
priteau Nov 25, 2024
2 changes: 1 addition & 1 deletion .automation
2 changes: 2 additions & 0 deletions doc/source/configuration/release-train.rst
@@ -1,3 +1,5 @@
.. _stackhpc_release_train:

======================
StackHPC Release Train
======================
43 changes: 28 additions & 15 deletions doc/source/contributor/package-updates.rst
@@ -63,18 +63,20 @@ The following steps describe the process to test the new package and container r
Creating the multinode environments
-----------------------------------

There is a comprehensive guide to setting up a multinode environment with Terraform, found here: https://github.com/stackhpc/terraform-kayobe-multinode. There are some things to note:
The `Multinode deployment workflow <https://github.com/stackhpc/stackhpc-kayobe-config/actions/workflows/stackhpc-multinode.yml>`_ can be used to automatically test changes.

To manually test the changes, there is a comprehensive guide to setting up a Multinode environment with Terraform, found here: https://github.com/stackhpc/terraform-kayobe-multinode. There are some things to note:

* OVN is enabled by default. For the OVS multinode environment, override it by setting ``kolla_enable_ovn: false`` in ``etc/kayobe/environments/ci-multinode/kolla.yml``.

* Remember to set different vxlan_vnis for each.
* Remember to set a different ``vxlan_vni`` for each.

* Before starting any tests, run ``dnf distro-sync`` on each host to ensure you are using the same snapshots as in the release train. You can do this using the following commands:
* Before starting any tests, run ``dnf distro-sync -y`` on each host to ensure you are using the same snapshots as in the release train. The ``-y`` option prevents hosts from hanging while they wait for confirmation input. You can do this using the following commands:

.. code-block:: console

kayobe seed host command run -b --command "dnf distro-sync"
kayobe overcloud host command run -b --command "dnf distro-sync"
kayobe seed host command run -b --command "dnf distro-sync -y"
kayobe overcloud host command run -b --command "dnf distro-sync -y"

* This may have installed a new kernel version. If so, you will need to reboot the overcloud hosts. You can check the installed kernels and the currently running kernel with the following commands. If the latest listed version is not running, you will need to reboot.

@@ -85,7 +87,7 @@ There is a comprehensive guide to setting up a multinode environment with Terraf

kayobe playbook run --limit seed,overcloud $KAYOBE_CONFIG_PATH/ansible/reboot.yml

* The tempest tests run automatically at the end of deploy-openstack.sh. If you have the time, it is worth fixing any failing tests you can so that there is greater coverage for the package updates. (Also remember to propose these fixes in the relevant repos where applicable.)
* The tempest tests run automatically at the end of the multinode deployment script. If you have the time, it is worth fixing any failing tests you can so that there is greater coverage for the package updates. (Also remember to propose these fixes in the relevant repos where applicable.)
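The kernel check described in the list above (reboot if the latest installed kernel is not the running one) can be sketched as follows. This is an illustrative helper, not part of Kayobe: the function names and version strings are hypothetical, and the ordering is a naive comparison of the numeric fields, which is simpler than real RPM version comparison.

```python
import re


def kernel_sort_key(version):
    """Naive ordering key: the numeric fields of a kernel version string."""
    return tuple(int(field) for field in re.findall(r"\d+", version))


def reboot_needed(installed, running):
    """Return True when the newest installed kernel is not the running one."""
    newest = max(installed, key=kernel_sort_key)
    return newest != running


# Hypothetical version strings, for illustration only.
installed = [
    "5.14.0-427.13.1.el9_4.x86_64",
    "5.14.0-427.40.1.el9_4.x86_64",
]
print(reboot_needed(installed, "5.14.0-427.13.1.el9_4.x86_64"))
print(reboot_needed(installed, "5.14.0-427.40.1.el9_4.x86_64"))
```

The first call reports that a reboot is needed because a newer kernel is installed; the second reports that the host is already running the newest one.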

Upgrading host packages
-----------------------
@@ -102,6 +104,7 @@ For Rocky Linux 9, bump the snapshot versions in /etc/yum/repos.d with:

.. code-block:: console

kayobe seed host configure -t dnf
kayobe overcloud host configure -t dnf

Install new packages:
@@ -112,22 +115,32 @@ Install new packages:

Perform a rolling reboot of hosts:

.. note::
   In the Multinode environment, the seed-hypervisor cannot access control
   plane instances with the OpenStack client. To use the OpenStack client,
   connect to the Seed instance via SSH first. For authentication, use scp to
   copy ``public-openrc.sh`` to the Seed.

.. code-block:: console

export ANSIBLE_SERIAL=1
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit controllers
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[0]
# Check your hypervisor hostname
(seed) openstack hypervisor list

# Reboot controller instances and zeroth compute instance
(seed-hypervisor) export ANSIBLE_SERIAL=1
(seed-hypervisor) kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit controllers
(seed-hypervisor) kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[0]

# Test live migration
openstack server create --image cirros --flavor m1.tiny --network external --hypervisor-hostname antelope-pkg-refresh-ovs-compute-02.novalocal --os-compute-api-version 2.74 server1
openstack server migrate --live-migration server1
watch openstack server show server1
(seed) openstack server create --image cirros --flavor m1.tiny --network external --hypervisor-hostname <Your Hypervisor Hostname> --os-compute-api-version 2.74 server1
(seed) openstack server migrate --live-migration server1
(seed) watch openstack server show server1

kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[1]
(seed-hypervisor) kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[1]

# Try and migrate back
openstack server migrate --live-migration server1
watch openstack server show server1
(seed) openstack server migrate --live-migration server1
(seed) watch openstack server show server1

Upgrading containers within a release
-------------------------------------
23 changes: 16 additions & 7 deletions doc/source/operations/upgrading-openstack.rst
@@ -449,9 +449,8 @@ To upgrade the Ansible control host:
Syncing Release Train artifacts
-------------------------------

New `StackHPC Release Train <../configuration/release-train>` content should be
synced to the local Pulp server. This includes host packages (Deb/RPM) and
container images.
New :ref:`stackhpc_release_train` content should be synced to the local Pulp
server. This includes host packages (Deb/RPM) and container images.

.. _sync-rt-package-repos:

@@ -968,17 +967,27 @@ would be applied:
kayobe overcloud host configure --check --diff

When ready to apply the changes, it may be advisable to do so in batches, or at
least start with a small number of hosts.:
least start with a small number of hosts:

.. code-block:: console

kayobe overcloud host configure --limit <host>

Alternatively, to apply the configuration to all hosts:

.. code-block:: console
.. warning::

Take extra care when configuring Ceph hosts. Set the hosts to maintenance
mode before reconfiguring them, and unset when done:

.. code-block:: console

kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ceph-enter-maintenance.yml --limit <host>
kayobe overcloud host configure --limit <host>
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ceph-exit-maintenance.yml --limit <host>

kayobe overcloud host configure
**Always** reconfigure hosts in small batches or one-by-one. Check the Ceph
state after each host configuration. Ensure all warnings and errors are
resolved before moving on.
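The per-host Ceph sequence in the warning above (enter maintenance, configure, exit maintenance) can be sketched as a small command builder. This is only an illustration: the function name and the host names are hypothetical, the commands are copied from the warning, and the sketch prints them rather than running anything.

```python
import shlex


def ceph_host_configure_commands(host, config_path="$KAYOBE_CONFIG_PATH"):
    """Build the three commands from the warning for a single Ceph host."""
    limit = shlex.quote(host)
    return [
        f"kayobe playbook run {config_path}/ansible/ceph-enter-maintenance.yml --limit {limit}",
        f"kayobe overcloud host configure --limit {limit}",
        f"kayobe playbook run {config_path}/ansible/ceph-exit-maintenance.yml --limit {limit}",
    ]


# Hypothetical host names; one host at a time, as the warning advises.
for host in ["ceph-01", "ceph-02"]:
    for command in ceph_host_configure_commands(host):
        print(command)
    print(f"# check the Ceph state before moving on from {host}")
```

Keeping the sequence per host makes it harder to forget the exit-maintenance step, and the pause between hosts is where the Ceph state check belongs.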

.. _building_ironic_deployment_images:

104 changes: 53 additions & 51 deletions etc/kayobe/ansible/deploy-os-capacity-exporter.yml
@@ -15,59 +15,61 @@
   tags: os_capacity
   gather_facts: false
   tasks:
-    - name: Create os-capacity directory
-      ansible.builtin.file:
-        path: /opt/kayobe/os-capacity/
-        state: directory
-      when: stackhpc_enable_os_capacity
-
-    - name: Read admin-openrc credential file
-      ansible.builtin.command:
-        cmd: "cat {{ lookup('ansible.builtin.env', 'KOLLA_CONFIG_PATH') }}/admin-openrc.sh"
+    - name: Check if admin-openrc.sh exists
+      ansible.builtin.stat:
+        path: "{{ lookup('ansible.builtin.env', 'KOLLA_CONFIG_PATH') }}/admin-openrc.sh"
       delegate_to: localhost
-      register: credential
-      when: stackhpc_enable_os_capacity
-      changed_when: false
+      register: openrc_file_stat
+      run_once: true

-    - name: Set facts for admin credentials
-      ansible.builtin.set_fact:
-        stackhpc_os_capacity_auth_url: "{{ credential.stdout_lines | select('match', '.*OS_AUTH_URL*.') | first | split('=') | last | replace(\"'\",'') }}"
-        stackhpc_os_capacity_project_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
-        stackhpc_os_capacity_domain_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_DOMAIN_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
-        stackhpc_os_capacity_openstack_region_name: "{{ credential.stdout_lines | select('match', '.*OS_REGION_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
-        stackhpc_os_capacity_username: "{{ credential.stdout_lines | select('match', '.*OS_USERNAME*.') | first | split('=') | last | replace(\"'\",'') }}"
-        stackhpc_os_capacity_password: "{{ credential.stdout_lines | select('match', '.*OS_PASSWORD*.') | first | split('=') | last | replace(\"'\",'') }}"
-      when: stackhpc_enable_os_capacity
+    - block:
+        - name: Create os-capacity directory
+          ansible.builtin.file:
+            path: /opt/kayobe/os-capacity/
+            state: directory

-    - name: Template clouds.yml
-      ansible.builtin.template:
-        src: templates/os_capacity-clouds.yml.j2
-        dest: /opt/kayobe/os-capacity/clouds.yaml
-      when: stackhpc_enable_os_capacity
-      register: clouds_yaml_result
+        - name: Read admin-openrc credential file
+          ansible.builtin.command:
+            cmd: "cat {{ lookup('ansible.builtin.env', 'KOLLA_CONFIG_PATH') }}/admin-openrc.sh"
+          delegate_to: localhost
+          register: credential
+          changed_when: false

-    - name: Copy CA certificate to OpenStack Capacity nodes
-      ansible.builtin.copy:
-        src: "{{ stackhpc_os_capacity_openstack_cacert }}"
-        dest: /opt/kayobe/os-capacity/cacert.pem
-      when:
-        - stackhpc_enable_os_capacity
-        - stackhpc_os_capacity_openstack_cacert | length > 0
-      register: cacert_result
+        - name: Set facts for admin credentials
+          ansible.builtin.set_fact:
+            stackhpc_os_capacity_auth_url: "{{ credential.stdout_lines | select('match', '.*OS_AUTH_URL*.') | first | split('=') | last | replace(\"'\",'') }}"
+            stackhpc_os_capacity_project_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
+            stackhpc_os_capacity_domain_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_DOMAIN_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
+            stackhpc_os_capacity_openstack_region_name: "{{ credential.stdout_lines | select('match', '.*OS_REGION_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
+            stackhpc_os_capacity_username: "{{ credential.stdout_lines | select('match', '.*OS_USERNAME*.') | first | split('=') | last | replace(\"'\",'') }}"
+            stackhpc_os_capacity_password: "{{ credential.stdout_lines | select('match', '.*OS_PASSWORD*.') | first | split('=') | last | replace(\"'\",'') }}"

-    - name: Ensure os_capacity container is running
-      community.docker.docker_container:
-        name: os_capacity
-        image: ghcr.io/stackhpc/os-capacity:master
-        env:
-          OS_CLOUD: openstack
-          OS_CLIENT_CONFIG_FILE: /etc/openstack/clouds.yaml
-        mounts:
-          - type: bind
-            source: /opt/kayobe/os-capacity/
-            target: /etc/openstack/
-        network_mode: host
-        restart: "{{ clouds_yaml_result is changed or cacert_result is changed }}"
-        restart_policy: unless-stopped
-      become: true
-      when: stackhpc_enable_os_capacity
+        - name: Template clouds.yml
+          ansible.builtin.template:
+            src: templates/os_capacity-clouds.yml.j2
+            dest: /opt/kayobe/os-capacity/clouds.yaml
+          register: clouds_yaml_result
+
+        - name: Copy CA certificate to OpenStack Capacity nodes
+          ansible.builtin.copy:
+            src: "{{ stackhpc_os_capacity_openstack_cacert }}"
+            dest: /opt/kayobe/os-capacity/cacert.pem
+          when: stackhpc_os_capacity_openstack_cacert | length > 0
+          register: cacert_result
+
+        - name: Ensure os_capacity container is running
+          community.docker.docker_container:
+            name: os_capacity
+            image: ghcr.io/stackhpc/os-capacity:{{ stackhpc_os_capacity_version }}
+            env:
+              OS_CLOUD: openstack
+              OS_CLIENT_CONFIG_FILE: /etc/openstack/clouds.yaml
+            mounts:
+              - type: bind
+                source: /opt/kayobe/os-capacity/
+                target: /etc/openstack/
+            network_mode: host
+            restart: "{{ clouds_yaml_result is changed or cacert_result is changed }}"
+            restart_policy: unless-stopped
+          become: true
+      when: stackhpc_enable_os_capacity and openrc_file_stat.stat.exists
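The ``set_fact`` expressions in this playbook pull each credential out of ``admin-openrc.sh`` by selecting the first matching line, splitting on ``=``, taking the last piece and stripping single quotes. A rough Python equivalent is sketched below. The function name and sample file contents are hypothetical, and the pattern is simplified to the plain variable name rather than the playbook's exact regular expression.

```python
import re


def openrc_value(lines, variable):
    """Approximate: select('match', ...) | first | split('=') | last | replace("'", '')."""
    pattern = re.compile(".*" + re.escape(variable))
    matching = [line for line in lines if pattern.match(line)]
    if not matching:
        raise KeyError(variable)
    return matching[0].split("=")[-1].replace("'", "")


# Hypothetical admin-openrc.sh contents, for illustration only.
openrc_lines = [
    "export OS_AUTH_URL='http://192.0.2.10:5000'",
    "export OS_PROJECT_NAME='admin'",
    "export OS_REGION_NAME='RegionOne'",
]
print(openrc_value(openrc_lines, "OS_AUTH_URL"))
print(openrc_value(openrc_lines, "OS_PROJECT_NAME"))
```

One consequence of taking the last piece after splitting on ``=`` is that a value which itself contains ``=`` would be truncated, so this approach suits simple values such as URLs and names.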
6 changes: 6 additions & 0 deletions etc/kayobe/inventory/group_vars/cis-hardening/cis
@@ -51,6 +51,9 @@ rhel9cis_rule_6_1_15: false
# filesystem. We do not want to change /var/lib/docker permissions.
rhel9cis_no_world_write_adjust: false

# Prevent hardening from recursively changing permissions on log files
rhel9cis_rule_4_2_3: false

# Configure log rotation to prevent audit logs from filling the disk
rhel9cis_auditd:
space_left_action: syslog
@@ -153,6 +156,9 @@ ubtu22cis_no_owner_adjust: false
ubtu22cis_no_world_write_adjust: false
ubtu22cis_suid_adjust: false

# Prevent hardening from recursively changing permissions on log files
ubtu22cis_rule_4_2_3: false

# Configure log rotation to prevent audit logs from filling the disk
ubtu22cis_auditd:
action_mail_acct: root
15 changes: 12 additions & 3 deletions etc/kayobe/ipa.yml
@@ -30,7 +30,9 @@

# List of additional Diskimage Builder (DIB) elements to use when building IPA
# images. Default is none.
#ipa_build_dib_elements_extra:
ipa_build_dib_elements_extra:
- extra-hardware
- mellanox

# List of Diskimage Builder (DIB) elements to use when building IPA images.
# Default is combination of ipa_build_dib_elements_default and
@@ -117,7 +119,10 @@
#ipa_collectors_default:

# List of additional inspection collectors to run.
#ipa_collectors_extra:
ipa_collectors_extra:
- "dmi-decode"
- "extra-hardware"
- "numa-topology"

# List of inspection collectors to run.
#ipa_collectors:
@@ -135,7 +140,11 @@
#ipa_kernel_options_default:

# List of additional kernel parameters for Ironic python agent.
#ipa_kernel_options_extra:
ipa_kernel_options_extra:
# Useful until NTP is configured by default
- ipa-insecure=1
# Avoid disk benchmark failures on some NVMe drives
- nvme_core.multipath=N

# List of kernel parameters for Ironic python agent.
#ipa_kernel_options:
3 changes: 3 additions & 0 deletions etc/kayobe/kolla-image-tags.yml
@@ -6,6 +6,9 @@ kolla_image_tags:
openstack:
rocky-9: 2024.1-rocky-9-20240903T113235
ubuntu-jammy: 2024.1-ubuntu-jammy-20240917T091559
blazar:
rocky-9: 2024.1-rocky-9-20241125T093138
ubuntu-jammy: 2024.1-ubuntu-jammy-20241125T093138
heat:
rocky-9: 2024.1-rocky-9-20240805T142526
nova:
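The image tags in this hunk appear to follow a ``<release>-<distro>-<timestamp>`` layout. That layout is inferred from the values shown here, not from a documented format, so the parser below is a sketch under that assumption; the ``KollaImageTag`` and ``parse_tag`` names are hypothetical. It shows how the rebuilt Blazar tag can be compared with the default ``openstack`` tag.

```python
from datetime import datetime
from typing import NamedTuple


class KollaImageTag(NamedTuple):
    release: str
    distro: str
    built: datetime


def parse_tag(tag):
    """Split a tag such as 2024.1-rocky-9-20241125T093138 into its parts."""
    prefix, timestamp = tag.rsplit("-", 1)
    release, distro = prefix.split("-", 1)
    return KollaImageTag(release, distro, datetime.strptime(timestamp, "%Y%m%dT%H%M%S"))


openstack = parse_tag("2024.1-rocky-9-20240903T113235")
blazar = parse_tag("2024.1-rocky-9-20241125T093138")
print(blazar.release, blazar.distro, blazar.built.isoformat())
print(blazar.built > openstack.built)
```

Splitting from the right first keeps distro names that contain hyphens, such as ``ubuntu-jammy``, intact.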
4 changes: 4 additions & 0 deletions etc/kayobe/kolla.yml
@@ -150,6 +150,10 @@ kolla_sources:
type: git
location: https://github.com/stackhpc/octavia.git
reference: stackhpc/{{ openstack_release }}
blazar-base:
type: git
location: https://github.com/stackhpc/blazar
reference: stackhpc/master

###############################################################################
# Kolla image build configuration.