Support WSL2 distros accessing private (apiserver, ssh, cockpit) services #3337

Closed
wants to merge 3 commits into from

GingerGeek

Addresses: Issue #374

Solution/Idea

Bind Windows host-side vSock ports to the Windows machine's IP on the WSL2 network, allowing both the Windows host and WSL2 to connect to VM services that aren't shared publicly.

Proposed changes

Adds decision-making on the "local" address for the API server, SSH server and DNS resolution of routes when these use vsock as transport on Windows.

In addition, the injected DNS entries for routes point to the WSL2 host address. These entries are automatically* (see Caveats) propagated into WSL2, so the domains work across both Windows and WSL2.

Testing

After successfully running start and setup you are able to run tooling from within WSL 2.

zed@THINKZED:~/development/oss/crc$ uname -a
Linux THINKZED 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 GNU/Linux
zed@THINKZED:~/development/oss/crc$ oc status
In project default on server https://api.crc.testing:6443

svc/openshift - kubernetes.default.svc.cluster.local
svc/kubernetes - 10.217.4.1:443 -> 6443

View details with 'oc describe <resource>/<name>' or list resources with 'oc get all'.
zed@THINKZED:~/development/oss/crc$ kubectl cluster-info
Kubernetes control plane is running at https://api.crc.testing:6443

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Caveats

Changes to the Windows hosts file are not always automatically picked up by WSL2 distributions. After a crc start, if the DNS entries were not already present, WSL2 may need to be restarted with wsl --shutdown.

@openshift-ci

openshift-ci bot commented Sep 7, 2022

Hi @GingerGeek. Thanks for your PR.

I'm waiting for a code-ready member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@praveenkumar
Member

/assign @anjannath

@praveenkumar
Member

/ok-to-test

@gbraad
Contributor

gbraad commented Sep 8, 2022

This changes behaviour when WSL2 is installed on a target machine. I would say this should NOT be the default option. In particular, your caveat indicates that a problem exists with the hostnames, so this cannot be considered a reliable solution for regular use. Can this be behind a switch, like --use-wsl2-vswitch or similar, on start?

@gbraad gbraad assigned gbraad and unassigned anjannath Sep 8, 2022
@GingerGeek
Author

@gbraad From a "works-out-the-box" perspective I would suggest this being enabled by default without an additional flag, with potentially an override flag to disable it (although I can't think of why you would need to).

Using WSL2 for development, and having all code repos, toolings, etc., within it, is becoming a more typical pattern for developers using Windows as a daily driver. Especially with integrations of VSCode etc.

From a usability perspective, I would think a developer trying out OKD would want to be able to run crc setup and crc start and, by default, have the cluster accessible from their WSL2 environment without needing to get into the weeds of WSL2's or CRC's netstack.

Although there is a behaviour change, it's not a "visible" one to a user since you can still access all services as you would before from Windows. The change in January (dcae454) was made to ensure apiserver, cockpit and SSH were not exposed publicly; this effectively retains that, as the WSL2 LAN is private to the machine.

The only potential break would be if someone is referencing the cluster directly via 127.0.0.1 within their application or scripts, but given the default configuration of OpenShift/OKD (i.e. routes based off hostname), I think that's unlikely (all docs also refer to hostnames).

On the DNS caveat... To clarify this only exists within WSL2. The behaviour from the Windows perspective remains stable.

I've been looking into this and trying to figure out when/how WSL2 updates DNS configuration. I perhaps was being too pessimistic in my original PR.

WSL2 relies on a service on the Windows machine for DNS; this of course picks up changes instantly from the Windows hosts file. However, the changes won't get reflected in /etc/hosts until a WSL restart.

This would mean there is a potential for removed routes' DNS records to linger within WSL2, but I don't think that's a show-stopping issue.

@praveenkumar
Member

/ok-to-test

@praveenkumar
Member

/retest

@gbraad
Contributor

gbraad commented Sep 9, 2022

From a "works-out-the-box" perspective I would suggest this being enabled by default without an additional flag, with potentially an override flag to disable it (although I can't think of why you would need to).

The only potential break would be if someone is referencing the cluster directly via 127.0.0.1

We can't change behaviour as people might have functionality that relies on this. This is a practice people use when DNS is not working as expected or when using a VPN route-all setup.

NOTE: localhost resolving is necessary functionality that allows this to operate with VPN route-all enabled. Enabling this over the WSL IP invalidates that functionality. (Have you tried this in a VPN setup?)

@gbraad
Contributor

gbraad commented Sep 9, 2022

On the DNS caveat... To clarify this only exists within WSL2.

I understand, but this is something that needs additional messaging/feedback or documentation, otherwise we will see people file issues about broken DNS when using this.

Does ICS need a restart? Perhaps if you detect WSL to be available (hence the flag to enable, so you don't have to detect), you perform a preflight check to see if DNS is operational: run wsl -e host crc.testing or similar and test whether the output is a failure such as (Host crc.testing not found: 3(NXDOMAIN)) or a valid IP. If it fails, you instruct the user to restart WSL with wsl --terminate or otherwise.
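A minimal Go sketch of how the output of such a preflight check could be classified. It is an illustration only, not code from this PR: the names dnsStatus and classifyHostOutput are hypothetical, the sample address is made up, and the function assumes the usual "<name> has address <ip>" success line printed by the host utility. Actually invoking wsl -e host is left out so the logic stays testable on any platform.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// dnsStatus is a hypothetical classification of the text printed by `host`.
type dnsStatus int

const (
	dnsUnknown  dnsStatus = iota // output was not recognised
	dnsResolved                  // a valid IP address was reported
	dnsNotFound                  // the name did not resolve (e.g. NXDOMAIN)
)

// classifyHostOutput inspects the output of `host <name>` (for example run as
// `wsl -e host crc.testing`) and reports whether the name resolved, plus the
// first address found.
func classifyHostOutput(out string) (dnsStatus, string) {
	const marker = " has address "
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		if strings.Contains(line, "NXDOMAIN") || strings.Contains(line, "not found") {
			return dnsNotFound, ""
		}
		if i := strings.Index(line, marker); i >= 0 {
			addr := strings.TrimSpace(line[i+len(marker):])
			if ip := net.ParseIP(addr); ip != nil {
				return dnsResolved, ip.String()
			}
		}
	}
	return dnsUnknown, ""
}

func main() {
	status, ip := classifyHostOutput("crc.testing has address 172.20.0.1")
	fmt.Println(status == dnsResolved, ip)

	status, _ = classifyHostOutput("Host crc.testing not found: 3(NXDOMAIN)")
	if status == dnsNotFound {
		fmt.Println("DNS not operational inside WSL; suggest restarting WSL")
	}
}
```

Keeping the parsing separate from the process invocation means the check can be unit tested without a WSL installation.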

@gbraad
Contributor

gbraad commented Sep 9, 2022

DNS records to linger within WSL2,

Not ideal... we are called out for doing this on system32/drivers/etc/ too.

We remove records using crc cleanup. Something we also need to do from the WSL distro(s)? Issue is that could be many...

don't think that's a show-stopping issue.

Perhaps, we can check for impact over time. (non-default for now?)

@gbraad
Contributor

gbraad commented Sep 9, 2022

Note: the flag can be a config option too... as long as this can be 'enabled'/'disabled', though enabling it by default might not be the right approach now. Perhaps at a later time. We are still working on some issues around WSL2 and Podman before we commit to this.

@gbraad gbraad added os/windows status/peer review required Peer review by assignee is required before being merged; labels Sep 9, 2022
@gbraad
Contributor

gbraad commented Sep 9, 2022

@code-ready/crc-team adding peer review required as this needs to be tested with a route-all VPN setup. I am sure this will fail as addresses will be resolved on a non-localhost address, and this will not be possible in that case.

@GingerGeek
Author

GingerGeek commented Sep 9, 2022

I've done some more testing on DNS, this seems to be the state of play:

For static hosts declaration within the distribution:

  • The /etc/hosts file within the Linux distributions is updated by WSL when the distribution starts
  • This update means the current state of C:\WINDOWS\System32\drivers\etc\hosts is included in the /etc/hosts file when the distro starts
  • This behaviour can be disabled by the user via wsl.conf

For DNS resolution within the distribution:

  • The /etc/resolv.conf file within the Linux distribution is set by WSL when the distribution starts
  • This update sets the nameserver to the Windows machine, where ICS (Internet Connection Sharing) responds to DNS queries and proxies to the DNS configured within Windows
  • This behaviour can be disabled by the user via wsl.conf

This leads to the following behaviour under default settings:

  • New hosts added will be immediately picked up by the WSL2 distro; no restart is required, as the Windows DNS service picks up the alterations to C:\WINDOWS\System32\drivers\etc\hosts
  • Hosts that are removed will also stop resolving from WSL2, same as above
  • EXCEPTION: Hosts that are removed after the distribution has started, which were present when the WSL2 distribution started, will still resolve within the WSL2 environment.
    • When the distribution is restarted, these hosts will be removed due to the syncing of /etc/hosts

If static hosts updating/syncing is disabled, then the resolution will still function within the WSL2 environment as long as the ICS DNS server is still in use. If a user configures custom nameservers within the WSL2 distribution, then DNS resolution for CRC may fail - although I think that's out of scope to support.

TL;DR: Will work "out-the-box" unless user has some advanced customisations on the WSL2 network setup. There is sometimes a delay in removing hosts from WSL2 but I would view it as a non-issue.
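The behaviour described above can be modelled with a small Go sketch. This is a deliberately simplified illustration and not how WSL is implemented: the wslResolver type, its fields and the sample host names and addresses are all hypothetical. It assumes lookups inside the distro consult the /etc/hosts snapshot first and then fall back to the DNS service on the Windows side, which always reflects the live Windows hosts file.

```go
package main

import "fmt"

// wslResolver is a hypothetical model of name resolution inside a WSL2
// distribution under default settings.
type wslResolver struct {
	etcHosts     map[string]string // snapshot copied from the Windows hosts file at distro start
	windowsHosts map[string]string // live Windows hosts file, served through the host-side DNS
}

// startDistro copies the current Windows hosts entries into the snapshot,
// mirroring the /etc/hosts sync that happens when a distribution starts.
func startDistro(windowsHosts map[string]string) *wslResolver {
	snapshot := make(map[string]string, len(windowsHosts))
	for name, ip := range windowsHosts {
		snapshot[name] = ip
	}
	return &wslResolver{etcHosts: snapshot, windowsHosts: windowsHosts}
}

// lookup checks the static snapshot first, then the live DNS view.
func (r *wslResolver) lookup(name string) (string, bool) {
	if ip, ok := r.etcHosts[name]; ok {
		return ip, true
	}
	ip, ok := r.windowsHosts[name]
	return ip, ok
}

func main() {
	windows := map[string]string{"api.crc.testing": "172.20.0.1"}
	distro := startDistro(windows)

	// Added after the distro started: resolves immediately through DNS.
	windows["console.apps-crc.testing"] = "172.20.0.1"
	_, ok := distro.lookup("console.apps-crc.testing")
	fmt.Println("new host resolves:", ok)

	// Present at start, removed later: lingers through the snapshot.
	delete(windows, "api.crc.testing")
	_, ok = distro.lookup("api.crc.testing")
	fmt.Println("removed host still resolves:", ok)

	// Restarting the distro re-syncs the snapshot and clears the stale entry.
	distro = startDistro(windows)
	_, ok = distro.lookup("api.crc.testing")
	fmt.Println("resolves after restart:", ok)
}
```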

I think a WSL health check would be a good addition, I note currently when I run crc start (even without my changes) I see the following log line, is this an intended failure since it's not a valid apps route?

WARN Failed to query DNS from host: lookup foo.apps-crc.testing: no such host

I will look to implement some basic health checks

@gbraad
Contributor

gbraad commented Sep 9, 2022

This behaviour can be disabled by the user via wsl.conf
New hosts added will be immediately picked up by the WSL2 distro

Right, but this is therefore not much of our responsibility. Good... let's not consider this. We are however looking into #2992, so that should 'resolve' and clean automagically.

WARN Failed to query DNS from host: lookup foo.apps-crc.testing: no such host

We used to rely on a catch-all, but this might be an issue (we use dnsmasq for this).
it is a WARN so does not fail startup.

@GingerGeek
Author

GingerGeek commented Sep 9, 2022

On the route-all VPN... I think this will vary from vendor to vendor, however many do now have specific support for the internal WSL2 network (e.g Cisco AnyConnect).

At the very least, since the IP we bind to is the machine's own, I believe most VPN clients would allow connections to it (but not the wider network)? This is not really something I have too much experience with.

The exception of course is if there is a routing overlap between VPN networks and WSL2 internal network. There are several open issues around this within microsoft/WSL (e.g microsoft/WSL#5782 microsoft/WSL#4210). I found this blog around it as well

I've tested this using Tailscale with an exit node enabled (and local network access disabled). It's working fine from both the WSL2 side and the Windows side. I can probably also have a test done with an OpenVPN setup, but I don't have easy access to the more corporate-y VPNs like AnyConnect etc.

@GingerGeek
Author

GingerGeek commented Sep 9, 2022

General todo/issues/thoughts:

  • The WSL interface is only created once a distribution has been started. This means the interface doesn't exist on a fresh boot and won't be detected with the current method until you open a distribution. A better way of detecting WSL 2 support is probably by parsing the output of wsl --list.

  • I should probably do a more sensible check for a valid IPv4 address than the current hack of .startsWith("172."). The range used for WSL2 has changed historically (it used to be in the 192.168 range), so I would probably just confirm that it's IPv4 and that it's in a private range.

  • Dangling IPs get left in the hosts file. The upstream library already has a pending PR that fixes this: "When removing a hostname results in an IP without hostname, delete the IP as well" (goodhosts/hostsfile#37)

  • One thought, is that we could open the vsock listener on both localhost addresses and on the WSL2 address. This would mean localhost connections to apiserver etc would still work on Windows. To ensure cross compatibility, the resolved DNS addresses would need to remain as the WSL2 address.
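For the address check mentioned in the list above, a minimal Go sketch could look like the following. The function name isPrivateIPv4 is hypothetical and this is not the code in the PR; it relies on net.IP.IsPrivate from the standard library (available since Go 1.17), which covers the RFC 1918 ranges for IPv4.

```go
package main

import (
	"fmt"
	"net"
)

// isPrivateIPv4 reports whether s parses as an IPv4 address that lies in a
// private range (10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16).
func isPrivateIPv4(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil {
		return false // not an IP address at all
	}
	ip4 := ip.To4()
	if ip4 == nil {
		return false // valid, but IPv6
	}
	return ip4.IsPrivate()
}

func main() {
	for _, candidate := range []string{"172.20.0.1", "192.168.1.10", "172.32.0.1", "fd00::1"} {
		fmt.Printf("%-14s private IPv4: %v\n", candidate, isPrivateIPv4(candidate))
	}
}
```

Unlike a prefix match on "172.", this rejects public addresses such as 172.32.0.1 and still accepts the 192.168 range that WSL2 reportedly used in the past.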

@GingerGeek
Author

So is the consensus I should pull this out to be behind a config option? Something along the lines of

crc config set wsl2-netcompat true/false

I would then suggest

  • If crc start is run with wsl2-netcompat false, and WSL2 is detected, log a warning and suggest enabling (link to docs)?
  • If crc start is run with wsl2-netcompat true, and WSL2 is not detected, fatally error?
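The two cases proposed above could be expressed as a small decision function. This Go sketch is only an illustration of the suggestion: checkWSL2Compat and errWSL2Missing are hypothetical names, the messages are made up, and how WSL2 is detected is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// errWSL2Missing is returned when the option is enabled but WSL2 is absent.
var errWSL2Missing = errors.New("wsl2-netcompat is enabled but WSL2 was not detected")

// checkWSL2Compat mirrors the proposed crc start behaviour:
//   - option off, WSL2 present: continue, but return a warning to log
//   - option on, WSL2 missing:  return an error so startup can abort
//   - otherwise:                nothing to report
func checkWSL2Compat(enabled, wsl2Detected bool) (warning string, err error) {
	switch {
	case enabled && !wsl2Detected:
		return "", errWSL2Missing
	case !enabled && wsl2Detected:
		return "WSL2 detected; consider running 'crc config set wsl2-netcompat true'", nil
	default:
		return "", nil
	}
}

func main() {
	if warning, err := checkWSL2Compat(false, true); err == nil && warning != "" {
		fmt.Println("WARN", warning)
	}
	if _, err := checkWSL2Compat(true, false); err != nil {
		fmt.Println("FATAL", err)
	}
}
```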

@anjannath
Member

i've rebased the PR branch and made some modifications to the way the WSL vswitch ip is fetched here: anjannath@bedcc6f

tested it with the podman preset and it's working as expected; need to also test with a VPN and the openshift and microshift presets

@gbraad
Contributor

gbraad commented Aug 9, 2023

@anjannath

Maintainers are allowed to edit this pull request, so I would suggest doing so. The IP-determining function, as I said, was way too complicated. All that is needed is to get the host IP address, as our stack binds on the 'same' localhost.

@gbraad
Contributor

gbraad commented Aug 9, 2023

the shell doesn't detect the WSL env properly
crc.exe podman-env

the detection is not an issue, but the -s option is restricted to the 'supported' shells for the platform. I believe it should be possible to force this

@openshift-ci

openshift-ci bot commented Aug 21, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from gbraad. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@anjannath
Member

anjannath commented Aug 21, 2023

@gbraad @praveenkumar Updated the PR:

  • rebased and simplified the function fetching the IP of the WSL vswitch
  • added config option wsl-network-access wsl-host-access which defaults to false

@GingerGeek
Author

Thanks for picking this up, sorry I've not been able to give this any more time

@gbraad
Contributor

gbraad commented Aug 22, 2023

@GingerGeek no worries. Life can get in the way of things; it happens to all of us.
If you still use CRC, it would be appreciated if you could give this a quick glance or a try. Otherwise, thank you regardless for the contribution.


// API, Cockpit and Internal SSH are all bound to a private address, usually 127.0.0.1
// On Windows, where WSL2 is active, it's bound to the Windows machine's address within the private WSL2 network
privateIP := VsockPrivateAddress()
Member

this should also be behind the config option, now on windows it is always binding the port forwards to the WSL vswitch ip address

Member

this is handled now in e0ef98d

@@ -89,7 +89,7 @@ func (vm *virtualMachine) State() (state.State, error) {

 func (vm *virtualMachine) IP() (string, error) {
 	if vm.vsock {
-		return "127.0.0.1", nil
+		return VsockPrivateAddress(), nil
Member

similar to https://github.com/crc-org/crc/pull/3337/files#r1302624090 this should also report the WSL vswitch ip address only when wsl-host-access is enabled, but currently it's not considering the config option

Member

It'd be easier to toggle this with an env variable instead of the config option, so I suggest that instead of having a config option wsl-host-access we depend on an env variable CRC_WSL_HOST_ACCESS to toggle this behavior, wdyt @gbraad ?

Contributor

I don't know from where to where the connections would be allowed, but crc has a host-network-access config option ("Allow TCP/IP connections from the CRC VM to services running on the host"); let's try to get the naming of the new option consistent/not confusing with this preexisting option.

Member

@anjannath anjannath Aug 28, 2023

The communication is to be allowed from WSL to the crc VM, so that the oc commands work from the WSL terminal

i mistakenly made the config key wsl-host-access but previously wsl2-network-access was suggested in #3337 (comment)
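The toggle discussed in this thread could be sketched in Go as follows. This is an assumption-laden illustration rather than the PR's implementation: privateAddress is a hypothetical stand-in for VsockPrivateAddress, the WSL address lookup is injected as a function, and the sample address is made up. The env variable name CRC_WSL_HOST_ACCESS is the one floated above; a config option would feed the same boolean.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// fallbackAddress is the private address used when WSL access is disabled.
const fallbackAddress = "127.0.0.1"

// privateAddress returns the address that API, Cockpit and SSH forwards should
// bind to. It only consults the WSL vswitch lookup when the feature is enabled
// and falls back to localhost if that lookup fails or returns nothing.
func privateAddress(wslAccessEnabled bool, wslHostIP func() (string, error)) string {
	if !wslAccessEnabled {
		return fallbackAddress
	}
	ip, err := wslHostIP()
	if err != nil || ip == "" {
		return fallbackAddress
	}
	return ip
}

func main() {
	// An unset or unparsable value is treated as disabled.
	enabled, _ := strconv.ParseBool(os.Getenv("CRC_WSL_HOST_ACCESS"))

	// Hypothetical lookup that pretends the WSL vswitch address is known.
	lookup := func() (string, error) { return "172.20.0.1", nil }

	fmt.Println("binding private services to", privateAddress(enabled, lookup))
}
```

Routing both the port-forward binding and the reported VM IP through one such function keeps the two call sites flagged in the review consistent with each other.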

cfg.AddSetting(EnableSharedDirs, false, validateSmbSharedDirs, SuccessfullyApplied,
"Mounts host's user profile folder at '/' in the CRC VM (true/false, default: false)")
// Setting to enable access to CRC vm from wsl
cfg.AddSetting(WSLHostAccess, false, ValidateBool, SuccessfullyApplied,
"Enable acess to CRC VM within WSL environment (true/false, default: false)")
Contributor

access

@@ -35,6 +35,7 @@ const (
 	IngressHTTPPort = "ingress-http-port"
 	IngressHTTPSPort = "ingress-https-port"
 	EmergencyLogin = "enable-emergency-login"
+	WSLHostAccess = "wsl-host-access"
Contributor

The title is: "add 'wsl-network-access'" ?

cfg.AddSetting(EnableSharedDirs, false, validateSmbSharedDirs, SuccessfullyApplied,
"Mounts host's user profile folder at '/' in the CRC VM (true/false, default: false)")
// Setting to enable access to CRC vm from wsl
cfg.AddSetting(WSLHostAccess, false, ValidateBool, SuccessfullyApplied,
Contributor

is it network access or host access?

Member

should be WSLNetworkAccess; updated now

GingerGeek and others added 2 commits October 9, 2023 21:12
…ices

Bind Windows host-side vSock ports to the Window's machines IP on the WSL2 network
allowing both Windows host and WSL2 to connect to VM services that aren't shared publically.
VsockPrivateAddress should be dependent on the wsl-network-access config
setting and return the fallback address when the config is false
@openshift-ci

openshift-ci bot commented Oct 9, 2023

@GingerGeek: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/integration-crc e0ef98d link true /test integration-crc
ci/prow/e2e-crc e0ef98d link true /test e2e-crc

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@anjannath
Member

@gbraad should we close this now? We haven't seen users requesting this feature much; we can reopen it later if needed

@gbraad
Contributor

gbraad commented Jan 24, 2024

There is a networkingMode you can set for WSL2, like bridged and mirrored. These can be a possible solution.

Closing it is OK

@anjannath
Member

There is a networkingMode you can set for WSL2, like bridged and mirrored. These can be a possible solution.

Closing it is OK

this is the link to the article about mirrored networkingMode on WSL2 https://learn.microsoft.com/en-us/windows/wsl/networking#mirrored-mode-networking

@anjannath anjannath closed this Jan 25, 2024