Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 2024.10.0 appears to have broken xterm.js (e.g. Proxmox WebUI) when served over CF tunnel #1337

Closed
itsthejb opened this issue Oct 15, 2024 · 18 comments · May be fixed by #1340
Closed

🐛 2024.10.0 appears to have broken xterm.js (e.g. Proxmox WebUI) when served over CF tunnel #1337

itsthejb opened this issue Oct 15, 2024 · 18 comments · May be fixed by #1340
Labels
Priority: Normal Minor issue impacting one or more users Type: Bug Something isn't working

Comments

@itsthejb
Copy link

itsthejb commented Oct 15, 2024

Describe the bug

Prior to 2024.10.0 it was possible to serve Proxmox web UI over a CF tunnel. However since 2024.10.0 the default xterm.js CT/VM web console is failing with Connection failed (Error 501: ). The noVNC implementation continues to work without issues.

To Reproduce
Steps to reproduce the behavior:

  • Create tunnel pointing to Proxmox web UI at port 8006
  • Serve tunnel using client version 2024.10.0
  • Attempt to connect to a LXC/VM console using the >_ Shell button
  • Fails w/ Connection failed (Error 501: )

Expected behavior

  • Possible to connect to the console as prior to 2024.10.0

Environment and versions

  • OS: Proxmox 8.2 (latest) / Debian Bookworm
  • Architecture: x86_64
  • Version: 2024.10.0

Logs and errors

  • I don't currently see any clear error log entry from journalctl for cloudflared when the failure happens

Additional context

  • Would like to test a downgrade to 2024.9.1 to verify the problem exactly, but not sure how this is possible Verified that the issue was introduced after 2024.9.1 by restoring to a backup w/ 2024.9.1, where everything works as before
  • In my case, cloudflared installed w/ apt in a Debian Bookworm LXC
@itsthejb itsthejb added Priority: Normal Minor issue impacting one or more users Type: Bug Something isn't working labels Oct 15, 2024
@OdoMarTus
Copy link

OdoMarTus commented Oct 15, 2024

I confirm.
I have the same problem.
However, it appeared for me about two or three weeks ago.
There are also reports that the problem also appeared at the turn of May and June:
https://forum.proxmox.com/threads/error-connection-error-501-not-implemented.11767/post-95568

On my OpenWRT router I have Cloudflare version: 2024.4.1-2 and it causes the above problem
The agent on Linux version 2024.9.1 is correct, but in version: 2024.10.0 is not correct.

@itsthejb
Copy link
Author

itsthejb commented Oct 15, 2024

Confirmed that, at least in my case, the issue only began after 2024.9.1. I reverted to a Proxmox CT backup that had [email protected] and I'm able to work around the issue introduced in 2024.10.0. This forces me to hold the package, however...

@q0Jop
Copy link

q0Jop commented Oct 15, 2024

I can also confirm. I did not run into this issue until upgrading to 2024.10.0 recently.

@kmendell
Copy link

I confirm this issue as well.

@maggie44
Copy link

maggie44 commented Oct 17, 2024

I think it is probably this commit: e2064c8#diff-6ff9dc179d3cf0539f514218792ce1d93001e2b4835a695bde442ad45edf01deR516 @chungthuang

It looks like it used to be that if it is not a websocket (false), then req.Body is left as it is (not set to http.NoBody).

Now it is checking if RequestBodyHintEmpty.String() exists and if it does returns true, which sets req.Body to http.NoBody, which could be Websocket incompatible. Websockets would have empty bodies, so if the metadata is being applied, then this would now be configuring a NoBody, when before it didn't (I assume websockets were excluded from the logic for a reason).

I suspect this has wider implications on websockets more broadly beyond Proxmox.

I think it would be worth adding a websocket to the tests too. At the moment the tests in that commit only covers HTTP so things like this could get missed in the current test coverage.

@maggie44
Copy link

maggie44 commented Oct 18, 2024

Could someone try this PR out to confirm if it fixes the issue?

#1340

Clone it, run make (assuming you have Golang and make installed) and it will build a cloudflared binary. Run that built binary instead of the original and see if it works.

@itsthejb
Copy link
Author

@maggie44 I'll try this out later

@itsthejb
Copy link
Author

I bisected the issue (#1340 (comment)) and found the breaking commit to be e2064c8

@aidendotgg
Copy link

ISSUE #1337 🔥 🤑

@itsthejb
Copy link
Author

Haha yes I also noticed this 🤣

Note there is now a fix for this ready to land hopefully soon

@GoncaloGarcia
Copy link

Hey everyone! Thank you for letting us know of this problem and for the effort in identifying the cause of the issue. We decided to fully revert the change and will take some time to evaluate a better approach. Please let us know if you still see this issue after upgrading to 2024.10.1.

@itsthejb
Copy link
Author

Can confirm things are working as expected in 2024.10.1

@itsthejb
Copy link
Author

itsthejb commented Oct 26, 2024

@GoncaloGarcia In spite of reverting the change, installing the new version it really feels like the interface is less reliable than previous. While the console loads, it frequently disconnects and I have to refresh the page. Do you think this could still be related to other changes in 2024.10.0?

Reopening the issue for the time being for visibility

@itsthejb itsthejb reopened this Oct 26, 2024
@OdoMarTus
Copy link

Fixed in the latest version 2024.10.1

@joliveirinha
Copy link
Contributor

@itsthejb what do you mean by interface? Proxmox interface?

The code in 2024.10.0 had very minimal changes and the only one that could affect traffic was the one we reverted. so it should not be related.
Please open a new issue with more details if you can reproduce it.

@itsthejb
Copy link
Author

@joliveirinha Yes, it's not the original issue, which is definitely fixed. I just noticed the web console stalling and losing the connection but differently. I could indeed be some different, so I just wanted to ask in case you thought there might be something else.

Sure, I'll leave the issue closed. I suspect it is something else, just wanted to ask why the topic is still warm.

Cheers

@itsthejb
Copy link
Author

FYI the disconnection error is:

Screenshot 2024-10-29 at 11 38 08

@clemone210
Copy link

can confirm that with 2024.10.1 the issue is not present anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Normal Minor issue impacting one or more users Type: Bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants