Skip to content

Commit

Permalink
Fix cloud-init blocking on WSL1 (#508)
Browse files Browse the repository at this point in the history
`cloud-init status --wait` blocks forever if cloud-init won't ever
start. Known cases where that may happen:
1. the cloud-init systemd units being masked or disabled by default in
the rootfs (not our case)
1. systemd not enabled by default in the rootfs image (not our case
either)
1. the distro instance being registered as a WSL 1 instance (we don't
control that :) )

We're not considering that properly. With the patch
bb72652,
we check if both the cloud-init.service unit is enabled (which covers 1)
as well as systemd not offline, which covers 2. and 3.
The very same check was implemented in the new script that will replace
this launch in the new WSL format via this PR:
ubuntu/wsl-setup#17.

I also added an end-to-end test case that sets WSL1 as the default
setting globally and attempts to register the instance. Without that
patch the test would timeout. See
472a2d8.

UDENG-5629.

---

CI had two issues blocking me:

1. `releases-info`, a Go binary used to select which rootfses to build
the application packages from, had outdated considerations about what
CPC publishes under https://cloud-images.ubuntu.com/wsl:
- Jammy file naming conventions are now the same as Noble's.
- The CPC publisher no longer strips the "ubuntu" suffix on both
releases.

Look at https://cloud-images.ubuntu.com/wsl/jammy/current/:

```
...
[   ] SHA256SUMS.gpg                                      2024-12-13 15:01  833
[   ] ubuntu-jammy-wsl-amd64-ubuntu.rootfs.tar.gz         2024-12-12 21:42  324M  File system image and Kernel packed
[   ] ubuntu-jammy-wsl-amd64-ubuntu22.04lts.rootfs.tar.gz 2024-12-12 21:42  324M  File system image and Kernel packed
[TXT] ubuntu-jammy-wsl-amd64-wsl.manifest                 2024-12-12 21:42   17K  Package manifest file
[   ] ubuntu-jammy-wsl-arm64-ubuntu.rootfs.tar.gz         2024-12-12 21:45  308M  File system image and Kernel packed
[   ] ubuntu-jammy-wsl-arm64-ubuntu22.04lts.rootfs.tar.gz 2024-12-12 21:45  308M  File system image and Kernel packed
[TXT] ubuntu-jammy-wsl-arm64-wsl.manifest                 2024-12-12 21:45   17K  Package manifest file
...
```

The images contain a suffix "ubuntu22.04lts".
Also, that naming convention used to be from 24.04 onwards.
But as we can clearly see 22.04 and 24.04 are following the same naming
convention
(see https://cloud-images.ubuntu.com/wsl/noble/current/):

```
[   ] SHA256SUMS.gpg                                      2024-12-13 05:37  833
[   ] ubuntu-noble-wsl-amd64-ubuntu.rootfs.tar.gz         2024-12-12 20:21  347M  File system image and Kernel packed
[   ] ubuntu-noble-wsl-amd64-ubuntu24.04lts.rootfs.tar.gz 2024-12-12 20:21  347M  File system image and Kernel packed
[TXT] ubuntu-noble-wsl-amd64-wsl.manifest                 2024-12-12 20:20   16K  Package manifest file
[   ] ubuntu-noble-wsl-arm64-ubuntu.rootfs.tar.gz         2024-12-12 20:23  332M  File system image and Kernel packed
[   ] ubuntu-noble-wsl-arm64-ubuntu24.04lts.rootfs.tar.gz 2024-12-12 20:23  332M  File system image and Kernel packed
[TXT] ubuntu-noble-wsl-arm64-wsl.manifest                 2024-12-12 20:23   16K  Package manifest file

```


2. Every now and then the PowerShell script that downloads the rootfs
and validate the checksums failed with a shallow stack trace of an
attempt to index into a null array.
That was not really helpful. After investing some time on this I could
finally learn that we implemented the wrong regex pattern (two spaces
instead of a space and a raw star/asterisk character).
  • Loading branch information
CarlosNihelton authored Dec 18, 2024
2 parents 7ebeb4e + dbeb8eb commit 76ef2f6
Show file tree
Hide file tree
Showing 10 changed files with 73 additions and 22 deletions.
4 changes: 3 additions & 1 deletion .github/actions/download-rootfs/action.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
name: Download Ubuntu WSL Rootfs
description: Download the latest Rootfs for a particular release of Ubuntu WSL
description: Download the latest Rootfs for a particular release of Ubuntu WSL

inputs:
distros:
Expand Down Expand Up @@ -93,6 +93,8 @@ runs:
$file = ".\${name}.tar.gz"
Write-Output "Saving $($url[0..19] -join '')***$($url[-19..-1] -join '') to $file"
& ${{ github.action_path }}\download-rootfs.ps1 -Path "${file}" -URL "$url"
if ( ! $? ) {
$allSuccess = $false
Expand Down
9 changes: 7 additions & 2 deletions .github/actions/download-rootfs/download-rootfs.ps1
Original file line number Diff line number Diff line change
Expand Up @@ -63,9 +63,14 @@ function Test-Checksums {

# Parse checksum file
$image = $URL.Segments[$Url.Segments.Length - 1]
$pattern = "(\w+) ${image}"
$pattern = "(\w+) \*${image}"
$matched = (Select-String -Pattern "${pattern}" -Path "${Checksums}").Matches
if ($null -eq $matched -or $matched.Count -eq 0) {
Write-Warning "Could not find $image in checksums file"
return $false
}

$newSHA256 = (Select-String -Pattern "${pattern}" -Path "${Checksums}").Matches[0].Groups[1]
$newSHA256 = $matched[0].Groups[1]
if ($newSHA256 -eq "") {
Write-Warning "Could not find $image in checksums file"
return $false
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/build-pr.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ jobs:
runs-on: windows-latest
if: ${{ !github.event.pull_request.draft }}
env:
rootfs64: 'http://cloud-images.ubuntu.com/wsl/jammy/current/ubuntu-jammy-wsl-amd64-wsl.rootfs.tar.gz'
rootfs64: 'http://cloud-images.ubuntu.com/wsl/jammy/current/ubuntu-jammy-wsl-amd64-ubuntu.rootfs.tar.gz'
workDir: 'C:/Temp/builddir'
steps:
- name: Checkout WSL
Expand Down
9 changes: 7 additions & 2 deletions DistroLauncher/Ubuntu/InitTasks.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,9 +35,14 @@ bool CheckInitTasks(WslApiLoader& api, bool checkDefaultUser) {

namespace {
void waitForInitTasks(WslApiLoader& api) {
// Wait for cloud-init to finish if systemd and its service is enabled.
static constexpr wchar_t script[] = LR"(
if status=$(LANG=C systemctl is-system-running 2>/dev/null) || [ "${status}" != "offline" ] && systemctl is-enabled --quiet cloud-init.service 2>/dev/null; then
cloud-init status --wait > /dev/null 2>&1 || true
fi
)";
DWORD exitCode = -1;
// Try running cloud-init unconditionally, but avoid printing to console.
auto hr = api.WslLaunchInteractive(L"cloud-init status --wait >/dev/null 2>&1", FALSE, &exitCode);
auto hr = api.WslLaunchInteractive(script, FALSE, &exitCode);
if (FAILED(hr)) {
Helpers::PrintErrorMessage(hr);
return;
Expand Down
4 changes: 2 additions & 2 deletions e2e/go.work
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
go 1.22.0
go 1.23.0

toolchain go1.22.5
toolchain go1.23.2

use ./launchertester
16 changes: 14 additions & 2 deletions e2e/launchertester/basic_setup_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ import (
"context"
"fmt"
"os"
"os/exec"
"path/filepath"
"strconv"
"strings"
Expand All @@ -26,7 +27,7 @@ func TestBasicSetup(t *testing.T) {
require.NoErrorf(t, err, "Unexpected error installing: %s\n%v", out, err)

testCases := map[string]func(t *testing.T){
"SystemdEnabled": testSystemdEnabled,
"SystemdEnabled": testSystemdIsEnabled,
"SystemdUnits": testSystemdUnits,
"CorrectUpgradePolicy": testCorrectUpgradePolicy,
"UpgradePolicyIdempotent": testUpgradePolicyIdempotent,
Expand All @@ -45,6 +46,7 @@ func TestSetupWithCloudInit(t *testing.T) {
testCases := map[string]struct {
install_root bool
withRegistryUser string
withWSL1 bool
wantUser string
wantFile string
}{
Expand All @@ -56,6 +58,8 @@ func TestSetupWithCloudInit(t *testing.T) {
"With only remote users": {wantUser: "testmail"},
"With broken passwd file": {wantUser: "testmail"},
"Without checking user": {install_root: true, wantUser: "root", wantFile: "/home/testuser/with_default_user.done"},
// TODO: Investigate why this causes the CI VM to crash and reenable it.
//"Do not block on WSL1": {install_root: true, withWSL1: true, wantUser: "root"},
}

home, err := os.UserHomeDir()
Expand All @@ -77,6 +81,14 @@ func TestSetupWithCloudInit(t *testing.T) {

for name, tc := range testCases {
t.Run(name, func(t *testing.T) {
if tc.withWSL1 {
require.NoError(t, exec.Command("wsl.exe", "--set-default-version", "1").Run(), "Setup: Cannot set WSL1 as default version")
require.NoError(t, exec.Command("wsl.exe", "--shutdown").Run(), "Setup: Cannot enforce WSL1 as default version y shutting down the VM")
t.Cleanup(func() {
t.Log("Cleaning up: Setting WSL2 back as default version")
require.NoError(t, exec.Command("wsl.exe", "--set-default-version", "2").Run(), "Setup: Cannot set WSL2 back as default version")
})
}
wslSetup(t)

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
Expand Down Expand Up @@ -158,7 +170,7 @@ func TestSetupWithCloudInit(t *testing.T) {
// launcher checks for the default user. Either way the user assertion in the end of this test case will work as exoected.
require.NoError(t, <-registrySet, "Setup: Failed to set default user via GoWSL/registry")

testSystemdEnabled(t)
testSystemdEnabled(t, !tc.withWSL1)
testInteropIsEnabled(t)
if len(tc.wantFile) > 0 {
testFileExists(t, tc.wantFile)
Expand Down
4 changes: 1 addition & 3 deletions e2e/launchertester/go.mod
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
module github.com/ubuntu/wsl/e2e/launchertester

go 1.22.0

toolchain go1.22.3
go 1.23.0

require (
github.com/stretchr/testify v1.9.0
Expand Down
14 changes: 12 additions & 2 deletions e2e/launchertester/test_cases.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,8 +26,13 @@ func testDefaultUser(t *testing.T, expected string) { //nolint: thelper, this is
require.Equal(t, expected, got, "Default user should be %s, got %s", expected, got)
}

// testSystemdEnabled ensures systemd was enabled.
func testSystemdEnabled(t *testing.T) { //nolint: thelper, this is a test
// testSystemdIsEnabled ensures systemd was enabled.
func testSystemdIsEnabled(t *testing.T) { //nolint: thelper, this is a test
testSystemdEnabled(t, true)
}

// testSystemdEnabled checks whether systemd was enabled or not as expected.
func testSystemdEnabled(t *testing.T, wantEnabled bool) { //nolint: thelper, this is a test
ctx, cancel := context.WithTimeout(context.Background(), systemdBootTimeout)
defer cancel()

Expand All @@ -36,6 +41,11 @@ func testSystemdEnabled(t *testing.T) { //nolint: thelper, this is a test
return // Success: Non-deterministic
}

if !wantEnabled {
require.Contains(t, string(out), "offline", "systemd output should be offline")
return
}

// Only acceptable alternative is "degraded"
var target *exec.ExitError
require.ErrorAs(t, err, &target, "systemctl is-system-running: Only acceptable error is ExitError (caused by systemd being degraded)")
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
#cloud-config
users:
- name: nontestuser
groups: sudo
shell: /bin/bash
sudo: ALL=(ALL) NOPASSWD:ALL
- name: testuser
groups: sudo
shell: /bin/bash
sudo: ALL=(ALL) NOPASSWD:ALL

write_files:
- path: /etc/wsl.conf
append: true
content: |
[user]
default=testuser

runcmd:
- touch /home/testuser/with_default_user.done
13 changes: 6 additions & 7 deletions wsl-builder/common/releasesinfo.go
Original file line number Diff line number Diff line change
Expand Up @@ -235,20 +235,19 @@ func (w *WslReleaseInfo) refreshedTerminalProfileID() error {

// RootfsURL returns the URL to the rootfs tarball for the given architecture.
// The base image name is in the format:
// ubuntu-<version>-wsl-<arch>-<upgrade-flavor>.rootfs.tar.gz
// before 24.04, upgrade-flavor is always "wsl"
// ubuntu-<version>-wsl-<arch>-<upgrade-type>.rootfs.tar.gz
// before 22.04, upgrade-type is always "wsl"
// otherwise:
// - ubuntu -> wsl (upgade: lts)
// - ubuntu -> wsl (upgrade: lts)
// - ubuntupreview -> preview (upgrade: always)
// - ubuntu24.04lts -> 24.04lts (upgrade: never)
// - ubuntu24.04lts -> ubuntu24.04lts (upgrade: never)
func (w *WslReleaseInfo) RootfsURL(arch string) string {
imageBaseName := fmt.Sprintf("ubuntu-%s-wsl-%s", w.CodeName, arch)

suffix := "wsl"
// We have multiple versions with different upgrade policy management. Pick the one based on the app name.
if strings.Compare(w.BuildVersion, "2404") >= 0 {
// The CPC publisher strip the "ubuntu" prefix
suffix = strings.TrimPrefix(strings.ToLower(w.AppID), "ubuntu")
if strings.Compare(w.BuildVersion, "2204") >= 0 {
suffix = strings.ToLower(w.AppID)
if suffix == "" {
suffix = "wsl"
}
Expand Down

0 comments on commit 76ef2f6

Please sign in to comment.