[BUG] 2.43.0 broke darwin public DNS query when host does not exclusively use IPv4 DNS servers. #4454

Closed
kaovilai opened this issue Nov 12, 2024 · 28 comments · Fixed by #4496

@kaovilai
Contributor

kaovilai commented Nov 12, 2024

General information

  • OS: Linux / macOS / Windows
  • Hypervisor: KVM / Hyper-V / hyperkit / vfkit
  • Did you run crc setup before starting it (Yes/No)?
  • Running CRC on: Laptop / Baremetal-Server / VM

CRC version

# Put `crc version` output here
2.43.0

CRC status

❯ crc status --log-level debug
DEBU CRC version: 2.43.0+268795                   
DEBU OpenShift version: 4.17.1                    
DEBU MicroShift version: 4.17.1                   
DEBU Running 'crc status'                         
CRC VM:          Running
OpenShift:       Running (v4.17.1)
RAM Usage:       5.973GB of 10.93GB
Disk Usage:      18.97GB of 53.08GB (Inside the CRC VM)
Cache Usage:     116.6GB
Cache Directory: /Users/tiger/.crc/cache

CRC config

~ 10m 50s
❯ crc config view
- consent-telemetry                     : no
- cpus                                  : 8
- disk-size                             : 50
- kubeadmin-password                    : crcpass
- pull-secret-file                      : /Users/tiger/pull-secret.txt

Host Operating System

❯ sw_vers
ProductName:		macOS
ProductVersion:		14.7.1
BuildVersion:		23H222

Steps to reproduce

Expected

Actual

Logs

Before gathering the logs, try the following to see whether it fixes your issue:

$ crc delete -f
$ crc cleanup
$ crc setup
$ crc start --log-level debug

Please consider posting the output of crc start --log-level debug on http://gist.github.com/ and post the link in the issue.
https://gist.github.com/kaovilai/44f9f77e00455d76283542eddd9d59c7#file-2-43-0-curl-error-log-L288

After some tinkering, I think I got the cluster up.

v2.42.0...v2.43.0

@kaovilai kaovilai added kind/bug Something isn't working status/need triage labels Nov 12, 2024
@cfergeau
Contributor

Can you provide more details about the steps you are following to reproduce, and can you share the logs of crc startup?

@kaovilai
Contributor Author

FWIW got 2.42.0 working now.

@kaovilai
Contributor Author

Here's a fun function I just made:

znap function crc-start-version(){
    # Require an X.Y.Z version argument
    if [ -z "$1" ]; then
        echo "No version supplied"
        return 1
    fi
    # Require the version to look like semver (X.Y.Z)
    if ! [[ "$1" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
        echo "Version $1 is not semver"
        return 1
    fi
    # Installer cache path, relative to the current directory
    local pkg="Downloads/$1-crc.pkg"
    # Tear down any existing instance; failures here are ignored on purpose
    crc stop; crc delete -f; crc cleanup
    # Download the installer only if it is not already cached; -f makes curl fail on HTTP errors
    [ -r "$pkg" ] || curl -fL "https://developers.redhat.com/content-gateway/file/pub/openshift-v4/clients/crc/$1/crc-macos-installer.pkg" -o "$pkg" || return 1
    # Install the package, then set up and start crc with debug logging
    sudo installer -pkg "$pkg" -target LocalSystem && sw_vers && crc version \
        && crc setup && crc start --log-level debug && crc status --log-level debug
}

@kaovilai
Contributor Author

kaovilai commented Nov 12, 2024

@cfergeau I got the cluster to come up, but the curl DNS check still fails, and the failure is specific to 2.43.0.

INFO Check internal and public DNS query...       
DEBU Running SSH command: curl --head quay.io     
DEBU SSH command results: err: Process exited with status 6, output:  
WARN Failed public DNS query from the cluster: ssh command error:
command : curl --head quay.io
err     : Process exited with status 6
 :  
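For readers of the log above: curl documents exit status 6 as "could not resolve host", so the warning points at name resolution rather than at routing. A minimal sketch of a decoder for the few statuses relevant here follows; the function name `explain_curl_exit` is hypothetical and is not part of crc.

```shell
# Map a curl exit status to a short diagnosis.
# Only a few statuses from curl's documented exit codes are covered;
# explain_curl_exit is a hypothetical helper name, not a crc command.
explain_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "could not resolve host (DNS failure)" ;;
        7)  echo "failed to connect to host" ;;
        28) echo "operation timed out" ;;
        *)  echo "other curl error ($1)" ;;
    esac
}

# Example: decode the status reported in the crc log
explain_curl_exit 6
```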

@cfergeau
Contributor

Can you follow the steps at https://github.com/crc-org/crc/wiki/Debugging-guide#entering-the-vm to check network connectivity inside the VM?

@kaovilai
Contributor Author

kaovilai commented Nov 12, 2024

Not successful with that guide yet:

❯ crc ip
127.0.0.1

~
❯ ssh -i ~/.crc/machines/crc/id_ecdsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -p 2222 core@127.0.0.1
Warning: Identity file /Users/tiger/.crc/machines/crc/id_ecdsa not accessible: No such file or directory.
Warning: Permanently added '[127.0.0.1]:2222' (ED25519) to the list of known hosts.
core@127.0.0.1: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

@cfergeau
Contributor

Warning: Identity file /Users/tiger/.crc/machines/crc/id_ecdsa not accessible: No such file or directory.
Check the name of the file in /Users/tiger/.crc/machines/crc/, I think it was renamed recently.

@kaovilai
Contributor Author

ssh -i ~/.crc/machines/crc/id_ed25519 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -p 2222 core@127.0.0.1 works.
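Since the key file name differs between crc versions, a small helper can pick whichever key is present. This is only a sketch: the function name `find_crc_key` is hypothetical, and the two candidate file names are simply the ones mentioned in this thread.

```shell
# Print the first private key that exists in the given machine directory.
# Candidate names are taken from this thread: id_ed25519 (newer), id_ecdsa (older).
# find_crc_key is a hypothetical helper name, not a crc command.
find_crc_key() {
    dir=$1
    for name in id_ed25519 id_ecdsa; do
        if [ -f "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    echo "no known key file found in $dir" >&2
    return 1
}

# Usage (assumes the VM user is "core" and the VM is reachable on 127.0.0.1:2222):
#   ssh -i "$(find_crc_key "$HOME/.crc/machines/crc")" -p 2222 core@127.0.0.1
```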

@kaovilai
Contributor Author

From the VM, I can't curl quay.io:

[core@crc ~]$ curl --head quay.io
curl: (6) Could not resolve host: quay.io

@kaovilai
Contributor Author

[core@crc ~]$ nmcli dev show | grep DNS
IP4.DNS[1]:                             192.168.127.1
[core@crc ~]$ 

@kaovilai
Contributor Author

[core@crc ~]$ curl -vvv --head 1.1.1.1
*   Trying 1.1.1.1:80...
* Connected to 1.1.1.1 (1.1.1.1) port 80 (#0)
> HEAD / HTTP/1.1
> Host: 1.1.1.1
> User-Agent: curl/7.76.1
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
HTTP/1.1 301 Moved Permanently
< Server: cloudflare
Server: cloudflare
< Date: Tue, 12 Nov 2024 16:55:44 GMT
Date: Tue, 12 Nov 2024 16:55:44 GMT
< Content-Type: text/html
Content-Type: text/html
< Content-Length: 167
Content-Length: 167
< Connection: keep-alive
Connection: keep-alive
< Location: https://1.1.1.1/
Location: https://1.1.1.1/
< CF-RAY: 8e180d299fd85caf-RDU
CF-RAY: 8e180d299fd85caf-RDU

< 
* Connection #0 to host 1.1.1.1 left intact
[core@crc ~]$ curl -vvv --head one.one.one.one
* Could not resolve host: one.one.one.one
* Closing connection 0
curl: (6) Could not resolve host: one.one.one.one
[core@crc ~]$ curl -vvv --head google.com
* Could not resolve host: google.com
* Closing connection 0
curl: (6) Could not resolve host: google.com
[core@crc ~]$ 

No hostnames resolve, as far as I can tell.
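The transcript above separates two failure modes: a request by IP address succeeds while requests by hostname fail with curl status 6. A sketch of that reasoning as a function of the two exit statuses is below; the name `classify_probe` and the wording of the messages are illustrative assumptions.

```shell
# Classify a failure from two curl probes, one by IP and one by hostname.
# $1: exit status of a probe by IP address
# $2: exit status of a probe by hostname
# classify_probe is a hypothetical helper name.
classify_probe() {
    ip_status=$1
    name_status=$2
    if [ "$ip_status" -eq 0 ] && [ "$name_status" -eq 0 ]; then
        echo "network and DNS ok"
    elif [ "$ip_status" -eq 0 ] && [ "$name_status" -eq 6 ]; then
        echo "routing ok, DNS resolution broken"
    elif [ "$ip_status" -ne 0 ]; then
        echo "no connectivity by IP, check routing before DNS"
    else
        echo "hostname probe failed with curl status $name_status"
    fi
}

# Usage inside the VM (not run here because it needs network access):
#   curl -s --head 1.1.1.1 >/dev/null; ip_rc=$?
#   curl -s --head quay.io >/dev/null; name_rc=$?
#   classify_probe "$ip_rc" "$name_rc"
```

With the statuses seen in this thread (0 for 1.1.1.1, 6 for quay.io), the function reports that routing works and only DNS resolution is broken.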

@kaovilai
Contributor Author

Upon further inspection: no, 4.17.1 is not running fine without DNS. A number of system operators are failing DNS lookups.

Failed to pull image "registry.redhat.io/redhat/certified-operator-index:v4.17": pinging container registry registry.redhat.io: Get "https://registry.redhat.io/v2/": dial tcp: lookup registry.redhat.io on 192.168.127.1:53: no such host

@kaovilai kaovilai changed the title [BUG] 4.17.1 broke darwin public DNS query [BUG] 2.43.0 broke darwin public DNS query Nov 12, 2024
@kaovilai
Contributor Author

I'm back on 2.42.0 until further instructions.

@cfergeau
Contributor

cfergeau commented Nov 13, 2024

My guess is that it's related to 9175c8e
@evidolob any idea what went wrong?
@kaovilai can you paste the content of /etc/resolv.conf on the host as well as the output of scutil --dns | grep 'nameserver\[[0-9]*\]'?

@kaovilai
Contributor Author

crc-start-version 2.43.0;
ssh -i ~/.crc/machines/crc/id_ed25519 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -p 2222 core@127.0.0.1;
[core@crc ~]$ cat /etc/resolv.conf
# Generated by NetworkManager
search crc.testing
nameserver 192.168.127.1
[core@crc ~]$ scutil --dns | grep 'nameserver\[[0-9]*\]'
-bash: scutil: command not found
[core@crc ~]$ sudo dnf install scutil -y
Updating Subscription Management repositories.
Unable to read consumer identity

This system is not registered with an entitlement server. You can use subscription-manager to register.

Error: There are no enabled repositories in "/etc/yum.repos.d", "/etc/yum/repos.d", "/etc/distro.repos.d".
[core@crc ~]$ subscription-manager
You are attempting to run "subscription-manager" which requires administrative
privileges, but more information is needed in order to do so.
Authenticating as "root"
Password: 

dotfiles for more info

@kaovilai
Contributor Author

On the macOS host, in a zsh terminal:

~ 4m 14s
❯ cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
search attlocal.net
nameserver 2600:1700:e72:8520::33
nameserver 192.168.1.214
❯ scutil --dns | grep 'nameserver\[[0-9]*\]'
  nameserver[0] : 2600:1700:e72:8520::33
  nameserver[1] : 192.168.1.214
  nameserver[0] : 2600:1700:e72:8520::33
  nameserver[1] : 192.168.1.214
  nameserver[0] : 2600:1700:e72:8520::1
  nameserver[1] : 192.168.1.254

@kaovilai
Contributor Author

For the record, on the host system quay works.

❯ curl --head quay.io
HTTP/1.1 301 Moved Permanently
Server: awselb/2.0
Date: Wed, 13 Nov 2024 18:14:46 GMT
Content-Type: text/html
Content-Length: 134
Connection: keep-alive
Location: https://quay.io:443/

@cfergeau
Contributor

on the macos machine zsh terminal

~ 4m 14s
❯ cat /etc/resolv.conf
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
search attlocal.net
nameserver 2600:1700:e72:8520::33
nameserver 192.168.1.214

Could be containers/gvisor-tap-vsock#398
gvisor-tap-vsock does not support IPv6 yet, so if it tries to use the IPv6 nameserver for DNS, the queries will fail.

@kaovilai
Contributor Author

2.42.0 worked fine, so I guess this is technically a regression in that repo.

@cfergeau
Contributor

@kaovilai could you try swapping the two entries in /etc/resolv.conf and re-running crc delete/crc start?

@kaovilai
Contributor Author

For the record, I never edited this file manually. Everything was done through System Preferences.

I will report back after reordering. I'll first check whether the order is the same in System Preferences and whether reordering it there works.

@kaovilai
Contributor Author

Before: Image
After: Image
I was able to change the order via System Preferences, which, based on the notice in the file, I think is the route the OS expects rather than editing resolv.conf directly.

@kaovilai
Contributor Author

With

❯ cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
search attlocal.net
nameserver 192.168.1.214
nameserver 2600:1700:e72:8520::33

I am still getting

INFO Check internal and public DNS query...       
WARN Failed public DNS query from the cluster: ssh command error:
command : curl --head quay.io
err     : Process exited with status 6
 :  

@kaovilai
Contributor Author

Removing the IPv6 DNS server works.
Image
So the order in resolv.conf does not matter.

INFO Check internal and public DNS query...       
DEBU Running SSH command: curl --head quay.io     
DEBU SSH command results: err: <nil>, output: HTTP/1.1 301 Moved Permanently
Server: awselb/2.0
Date: Fri, 15 Nov 2024 17:04:53 GMT
Content-Type: text/html
Content-Length: 134
Connection: keep-alive
Location: https://quay.io:443/
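The change in this thread was made through System Preferences. As an alternative sketch only, the same IPv4-only setup could be expressed with macOS's `networksetup -setdnsservers`. The function below builds that command from resolv.conf-style input and prints it without running it. The function name `ipv4_dns_command` and the service name "Wi-Fi" are assumptions, and setting servers this way replaces the DHCP-provided list for that service.

```shell
# Read resolv.conf-style text on stdin and print a networksetup command that
# keeps only the IPv4 nameservers for the given network service.
# The command is printed, not executed, so it can be reviewed first.
# ipv4_dns_command is a hypothetical helper name.
ipv4_dns_command() {
    service=$1
    # Keep "nameserver <addr>" entries whose address has no colon (IPv4)
    servers=$(awk '$1 == "nameserver" && $2 !~ /:/ { printf " %s", $2 }')
    if [ -z "$servers" ]; then
        echo "no IPv4 nameservers found" >&2
        return 1
    fi
    echo "networksetup -setdnsservers \"$service\"$servers"
}

# Usage on the macOS host ("Wi-Fi" is an assumed service name;
# list the real ones with: networksetup -listallnetworkservices):
#   ipv4_dns_command "Wi-Fi" < /etc/resolv.conf
```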

@kaovilai
Contributor Author

I will stop troubleshooting now and keep the IPv6 DNS server removed from this adapter for the time being.

@kaovilai kaovilai changed the title [BUG] 2.43.0 broke darwin public DNS query [BUG] 2.43.0 broke darwin public DNS query when host does not exclusively use IPv4 DNS servers. Nov 18, 2024
@evidolob evidolob self-assigned this Nov 20, 2024
@evidolob
Contributor

This is fixed by PR containers/gvisor-tap-vsock#426.

@github-project-automation github-project-automation bot moved this from Todo to Done in Project planning: crc Nov 28, 2024
@kaovilai
Contributor Author

kaovilai commented Dec 2, 2024

@evidolob which CRC version will have the fix?

@kaovilai
Contributor Author

kaovilai commented Dec 2, 2024

Opened #4496


3 participants