win-sshproxy.tid created before thread id is available #433
Looking at the winquit code, calling NotifyOnQuit is supposed to guarantee that GetCurrentMessageLoopThreadId returns a non-zero value. However, the thread id is set when NotifyOnQuit calls messageLoop(), and this call is made in a goroutine, so NotifyOnQuit can return before the goroutine runs and initializes the thread id.
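The ordering problem described above can be reproduced without any Windows API. The following is a minimal sketch, not winquit's actual code: `loopThreadID`, `startLoopRacy`, `startLoopReady`, and the id `4242` are all hypothetical stand-ins. It contrasts a start function that returns before its goroutine has stored the id with one that waits on a channel until the id is set.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// loopThreadID is a hypothetical stand-in for the thread id that a message
// loop records once it is running. Zero means "not initialized yet".
var loopThreadID uint32

// startLoopRacy mimics the pattern from the review comment: the id is stored
// inside a goroutine, so this function can return while the id is still 0.
func startLoopRacy() {
	go func() {
		atomic.StoreUint32(&loopThreadID, 4242) // assumed id, for illustration only
	}()
}

// startLoopReady blocks until the goroutine has stored the id, so callers
// can rely on a non-zero value once it returns.
func startLoopReady() uint32 {
	ready := make(chan struct{})
	go func() {
		atomic.StoreUint32(&loopThreadID, 4242)
		close(ready)
		// A real message loop would keep running here.
	}()
	<-ready
	return atomic.LoadUint32(&loopThreadID)
}

func main() {
	startLoopRacy()
	fmt.Println("after racy start (may still be 0):", atomic.LoadUint32(&loopThreadID))
	fmt.Println("after ready start:", startLoopReady())
}
```

Signalling readiness from inside the goroutine removes the need for callers to poll, at the cost of changing the library rather than its caller.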
Some comments/suggestions, but I'm fine with the PR as is if you prefer to keep it this way.
cmd/win-sshproxy/main.go
Outdated
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

for {
maybe you could reuse this helper
gvisor-tap-vsock/pkg/sshclient/ssh_forwarder.go
Lines 189 to 216 in ec2ed7d
func retry[T comparable](ctx context.Context, retryFunc func() (T, error), retryMsg string) (T, error) {
	var (
		returnVal T
		err       error
	)
	backoff := initialBackoff
loop:
	for i := 0; i < maxRetries; i++ {
		select {
		case <-ctx.Done():
			break loop
		default:
			// proceed
		}
		returnVal, err = retryFunc()
		if err == nil {
			return returnVal, nil
		}
		logrus.Debugf("%s (%s)", retryMsg, backoff)
		sleep(ctx, backoff)
		backoff = backOff(backoff)
	}
	return returnVal, fmt.Errorf("timeout: %w", err)
}
Updated. I moved the Retry func into a utils package so it can be reused.
@@ -173,11 +174,34 @@ func saveThreadId() (uint32, error) {
 		return 0, err
 	}
 	defer file.Close()
-	tid := winquit.GetCurrentMessageLoopThreadId()
+	tid, err := getThreadId()
This will add a slight delay during win-ssh-proxy startup. Do you expect this delay to be problematic in typical use?
IMO no, but I have limited knowledge of how it is used. Locally, and for the things I do, I didn't even notice it.
Maybe it would be noticeable on a low-resource machine, but it's better to slow startup a bit and be sure everything works fine, no?
I expect it won't be noticeable. However, if it were noticeable, it would affect the startup time of podman machine start, which can be problematic.
Since the thread id is only needed when one wants to stop the podman machine VM, an alternative would be to wait for and write the thread id in a goroutine to avoid blocking.
However, podman would need to be ready for that, and retry reading the file if it's missing, which is not the case at the moment.
With all that said, the current approach should be good enough for now.
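The alternative mentioned above needs a reader that tolerates a missing or not-yet-written file. Here is a rough sketch of what that reader side could look like. It is not podman's code: `readThreadID` and `demo` are hypothetical names, and the assumption that the file holds the id as decimal text is made only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"time"
)

// readThreadID polls path until it holds a non-zero thread id or ctx expires.
// Assumption: the id is stored as decimal text.
func readThreadID(ctx context.Context, path string) (uint32, error) {
	ticker := time.NewTicker(20 * time.Millisecond)
	defer ticker.Stop()
	for {
		data, err := os.ReadFile(path)
		if err == nil {
			id, perr := strconv.ParseUint(strings.TrimSpace(string(data)), 10, 32)
			if perr == nil && id != 0 {
				return uint32(id), nil
			}
			// Empty or partially written file: keep polling.
		} else if !errors.Is(err, os.ErrNotExist) {
			return 0, err // unexpected I/O error, give up immediately
		}
		select {
		case <-ctx.Done():
			return 0, fmt.Errorf("thread id not available in %s: %w", path, ctx.Err())
		case <-ticker.C:
		}
	}
}

// demo writes the id from a goroutine after a short delay, as the non-blocking
// writer would, and reads it back with readThreadID.
func demo() (uint32, error) {
	dir, err := os.MkdirTemp("", "tid-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "win-sshproxy.tid")

	go func() {
		time.Sleep(50 * time.Millisecond)
		_ = os.WriteFile(path, []byte("4242"), 0o600) // assumed id
	}()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	return readThreadID(ctx, path)
}

func main() {
	fmt.Println(demo())
}
```

The trade-off is the one raised in the comment: startup no longer blocks, but every consumer of the file has to handle the window in which it does not exist yet.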
@n1hility fwiw, a small race in win-ssh-proxy/winquit.
This commit fixes a potential race condition that prevented the tests from succeeding when running in a GitHub workflow. The thread id was not always available when it was written to the file, so a thread id of 0 could be written. When the tests later read the thread id back to send the WM_QUIT signal, they failed. This patch adds a check on the thread id before writing it to the file: if the thread id is 0, it keeps calling winquit to retrieve it, and if there is still no success after 10 seconds, it returns an error. Signed-off-by: lstocchi <[email protected]>
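The check described in the commit message amounts to polling a getter until it returns a non-zero value or a deadline passes. A minimal sketch of that idea follows. It is not the code from the PR: `waitForThreadID`, `currentThreadID`, `pollDemo`, the polling interval, and the id `4242` are assumptions, and `currentThreadID` only simulates winquit.GetCurrentMessageLoopThreadId so that the example runs on any platform.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// fakeThreadID simulates the value winquit would expose; 0 means "not set yet".
var fakeThreadID uint32

// currentThreadID is a hypothetical stand-in for
// winquit.GetCurrentMessageLoopThreadId.
func currentThreadID() uint32 { return atomic.LoadUint32(&fakeThreadID) }

// waitForThreadID calls get until it returns a non-zero id or ctx is done.
func waitForThreadID(ctx context.Context, get func() uint32, interval time.Duration) (uint32, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if tid := get(); tid != 0 {
			return tid, nil
		}
		select {
		case <-ctx.Done():
			return 0, errors.New("message loop thread id was still 0 when the timeout expired")
		case <-ticker.C:
		}
	}
}

// pollDemo sets the id from a goroutine after a short delay and waits for it
// with the 10 second budget mentioned in the commit message.
func pollDemo() (uint32, error) {
	go func() {
		time.Sleep(30 * time.Millisecond)
		atomic.StoreUint32(&fakeThreadID, 4242) // assumed id
	}()
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	return waitForThreadID(ctx, currentThreadID, 10*time.Millisecond)
}

func main() {
	fmt.Println(pollDemo())
}
```

In the common case the id is available on the first or second call, so the added startup delay stays close to one polling interval.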
I've created containers/winquit#2 for the underlying |
/lgtm
/approve
Thank you so much for making CI green!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cfergeau, lstocchi
it resolves #432