
[Bug] preBootstrapCommands is not working in AL2023 #7903

Open
xiangyanw opened this issue Jul 29, 2024 · 11 comments · May be fixed by #8031
Labels
kind/bug priority/important-soon Ideally to be resolved in time for the next release

Comments

@xiangyanw

What were you trying to accomplish?

I want to mount a data volume to an EKS node running AL2023 using preBootstrapCommands.

What happened?

I configured preBootstrapCommands for a managed nodegroup in EKS version 1.30, but those commands were not added to the userdata.

Here is my preBootstrapCommands:

    preBootstrapCommands:
      - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
      - "sudo mount -a"
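As an aside, a more portable spelling of those commands appends the fstab entry via tee, since the `>>` redirection in `sudo echo ... >> /etc/fstab` is performed by the calling shell rather than by sudo (this happens to work in userdata, which already runs as root, but fails when the same line is tested as a regular user):

```shell
#!/bin/bash
# Format the data volume, mount it under containerd's state directory,
# and append the fstab entry as root via tee rather than a shell redirect.
sudo mkfs.xfs /dev/nvme1n1
sudo mkdir -p /var/lib/containerd
echo "/dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2" | sudo tee -a /etc/fstab
sudo mount -a
```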

Here is the resulting userdata in the launch template:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40

--78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash

set -o errexit
set -o pipefail
set -o nounset

touch /run/xtables.lock

--78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40--
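For comparison, on an AL2 nodegroup the same field shows up as an additional shell-script part in the MIME userdata, roughly of this shape (boundary shortened, layout illustrative):

```
--BOUNDARY
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab
sudo mount -a
--BOUNDARY--
```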

How to reproduce it?

Use the following YAML to create a nodegroup for EKS 1.30. Execute command: eksctl create ng -f xxx.yaml

  - name: nodegroup
    instanceType: c6a.large
    minSize: 0
    desiredCapacity: 1
    maxSize: 2
    volumeSize: 30
    volumeType: 'gp3'
    privateNetworking: true
    preBootstrapCommands:
      - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
      - "sudo mount -a"
    additionalVolumes:
      - volumeName: '/dev/xvdb' # required
        volumeSize: 50
        volumeType: 'gp3'

Logs
2024-07-29 03:13:13 [ℹ] nodegroup "xxxx-nodegroup" will use "" [AmazonLinux2023/1.30]
2024-07-29 03:13:13 [ℹ] nodegroup "nodegroup" will use "" [AmazonLinux2023/1.30]
2024-07-29 03:13:17 [ℹ] 1 existing nodegroup(s) (xxxx-nodegroup) will be excluded
2024-07-29 03:13:17 [ℹ] 1 nodegroup (nodegroup) was included (based on the include/exclude rules)
2024-07-29 03:13:17 [ℹ] will create a CloudFormation stack for each of 1 managed nodegroups in cluster "xxxx"
2024-07-29 03:13:17 [ℹ]
2 sequential tasks: { fix cluster compatibility, 1 task: { 1 task: { create managed nodegroup "nodegroup" } }
}
2024-07-29 03:13:17 [ℹ] checking cluster stack for missing resources
2024-07-29 03:13:19 [ℹ] cluster stack has all required resources
2024-07-29 03:13:21 [ℹ] building managed nodegroup stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:13:22 [ℹ] deploying stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:13:22 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:13:53 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:14:44 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:16:22 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:16:22 [ℹ] no tasks
2024-07-29 03:16:22 [✔] created 0 nodegroup(s) in cluster "xxxx"
2024-07-29 03:16:22 [✔] created 1 managed nodegroup(s) in cluster "xxxx"
2024-07-29 03:16:24 [ℹ] checking security group configuration for all nodegroups
2024-07-29 03:16:24 [ℹ] all nodegroups have up-to-date cloudformation templates

Anything else we need to know?
This is working as expected when I use AL2 AMI in the same cluster.

  - name: nodegroup2
    amiFamily: AmazonLinux2
    instanceType: c6a.large
    minSize: 0
    desiredCapacity: 1
    maxSize: 2
    volumeSize: 30
    volumeType: 'gp3'
    privateNetworking: true
    preBootstrapCommands:
      - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
      - "sudo mount -a"
    additionalVolumes:
      - volumeName: '/dev/xvdb' # required
        volumeSize: 50
        volumeType: 'gp3'

Versions

eksctl version: 0.187.0
kubectl version: v1.24.0
OS: linux
@cPu1
Collaborator

cPu1 commented Jul 29, 2024

preBootstrapCommands is not supported for AL2023 nodegroups. This validation exists for self-managed nodegroups but is missing for managed nodegroups, so create nodegroup silently ignores that field rather than failing early with an error. We'll work on a fix soon.

@xiangyanw
Author

What is the alternative if preBootstrapCommands is not supported for AL2023?

@oekarlsson

What is the alternative if preBootstrapCommands is not supported for AL2023?

I agree, what should we use instead? Perhaps the question should be: are there any plans to make something more or less equivalent to preBootstrapCommands available for AL2023? This is the one thing stopping us from using AL2023.

@OlGe404

OlGe404 commented Aug 22, 2024

We NEED preBootstrapCommands to work: we rely on it to provide custom CA certificates so nodes can pull container images from a private container registry.
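For context, this is the kind of commands involved (the trust-anchor directory and update-ca-trust are the standard layout on Amazon Linux; the certificate filename is a placeholder):

```shell
# Install a private registry's CA certificate into the system trust store.
# "registry-ca.crt" stands in for wherever the certificate is staged.
sudo cp registry-ca.crt /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust extract
```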

@jamieavins

preBootstrapCommands is not supported for AL2023 nodegroups. This validation exists for self-managed nodegroups but is missing for managed nodegroups, so create nodegroup silently ignores that field rather than failing early with an error. We'll work on a fix soon.

AL2023 is now the default, so please understand this is going to affect a lot of customers without them even realizing it.

@TiberiuGC TiberiuGC added the priority/important-soon Ideally to be resolved in time for the next release label Sep 4, 2024
@found-it

@TiberiuGC any update on when something will be supported for AL2023?

@TiberiuGC
Collaborator

AL2023 is now the default, so please understand this is going to affect a lot of customers without them even realizing it.

My take on this is that the most urgent matter is adding a validation for managed nodegroups, so that we don't end up impacting customers in the way described above. We'll likely have a fix for this next week.

As for preBootstrapCommands / overrideBootstrapCommand alternatives for AL2023, I don't have a date to share yet. I'll bump this internally so we can correctly assess where it stands in our backlog of priorities. I can appreciate that there's considerable community interest, and I'll make sure to articulate that.

@jonathanfoster
Contributor

@TiberiuGC Just ran into this issue myself and burned a few hours troubleshooting. I use preBootstrapCommands to inject HTTP proxy env vars, and this is a must-have for working in a locked-down corporate environment.

A warning message with instructions to fall back to Amazon Linux 2 would be helpful, but this is really a showstopper for enterprise customers. I simply can't use AL2023 without injecting HTTP proxy settings.
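Concretely, this is the sort of thing we inject (a systemd drop-in for containerd; the proxy host and NO_PROXY list are placeholders):

```shell
# Point containerd at the corporate proxy via a systemd drop-in.
# proxy.example.com:3128 and the NO_PROXY entries are placeholders; the
# drop-in path is the standard systemd override location for the unit.
sudo mkdir -p /etc/systemd/system/containerd.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/containerd.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:3128"
Environment="HTTPS_PROXY=http://proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1,169.254.169.254,.internal"
EOF
sudo systemctl daemon-reload
sudo systemctl restart containerd
```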

Also tell management this disproportionately impacts enterprise customers who have fat budgets and are looking to spin up massive instances to run their internal apps that maybe a handful of people actually use and then turn around and forget they're running...forever. So much compute billing...

Happy to help however I can. Where would one start if they're interested in injecting preBootstrapCommands in AL2023?

@cPu1
Collaborator

cPu1 commented Oct 7, 2024

@jonathanfoster, we are working on adding support for preBootstrapCommands in AL2023. Please stay tuned.

@carlstlaurent

Any ETA?

@gedhean

gedhean commented Oct 22, 2024

+1 🆙
