Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: prevent Tasks in terminal state from leaking PVCs #93

Merged
merged 5 commits into from
Oct 30, 2024

Conversation

adrian-vaca-humanes-wt
Copy link
Contributor

Running in EKS and GKE with otherwise default storage configuration, we have observed leaked PVCs. This change ought to prevent that from happening by eagerly deleting them.

@adrian-vaca-humanes-wt adrian-vaca-humanes-wt requested a review from a team as a code owner September 27, 2024 11:51
Copy link
Member

@strieflin strieflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great, if you could add a test case that verifies that PVC are actually deleted for failed tasks.

}
found := &v1.PersistentVolumeClaim{}
err := r.Get(ctx, name, found)
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more idiomatic Go, if you woul remove the nesting and check for a non-nil error after each operation, i.e., Get and Delete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is more to do to become idiomatic, i.e.,

err := r.Get(ctx, name, found)
if err == nil {
...
}
return fmt.Errorf("persistent volume claim deletion failed for task %v: %w", key, err)

should be

err := r.Get(ctx, name, found)
if err != nil {
  return fmt.Errorf("persistent volume claim deletion failed for task %v: %w", key, err)
}
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, in rebasing I realized I had a local commit that I might have forgotten to push before re-requesting the review; so obviously it had not changed for you!

Copy link
Member

@strieflin strieflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also a strange error (invalid golang version, see wokflow output) in the CI pipeline. But that seems to be a regression. Will investigate that.

Copy link
Member

@sbckr sbckr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @adrian-vaca-humanes-wt

Thank you for submitting your first PR to Carbyne Stack!

Please add yourself to the NOTICE.md file and have a look at the additional comments provided.

@adrian-vaca-humanes-wt
Copy link
Contributor Author

It would be great, if you could add a test case that verifies that PVC are actually deleted for failed tasks.

I will try to get to it during the week, thanks for the reviews!

@strieflin
Copy link
Member

It would be great, if you could add a test case that verifies that PVC are actually deleted for failed tasks.

I will try to get to it during the week, thanks for the reviews!

Thanks. Please let us know when we should have another look by re-requesting our review.

@adrian-vaca-humanes-wt
Copy link
Contributor Author

I didn't get the error that go.mod triggers for the tests in the CI in my machine when running make test.

@strieflin
Copy link
Member

There is also a strange error (invalid golang version, see wokflow output) in the CI pipeline. But that seems to be a regression. Will investigate that.

This was a nasty one. See #94 for the solution. My takeaway: Always pin down all dependencies; "latest" is evil 😈. @adrian-vaca-humanes-wt can you please rebase to master.

}
found := &v1.PersistentVolumeClaim{}
err := r.Get(ctx, name, found)
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is more to do to become idiomatic, i.e.,

err := r.Get(ctx, name, found)
if err == nil {
...
}
return fmt.Errorf("persistent volume claim deletion failed for task %v: %w", key, err)

should be

err := r.Get(ctx, name, found)
if err != nil {
  return fmt.Errorf("persistent volume claim deletion failed for task %v: %w", key, err)
}
...

Copy link
Member

@strieflin strieflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small final thing then we are through it from my point of view.

found := &v1.PersistentVolumeClaim{}
err := r.Get(ctx, name, found)
if err != nil {
return fmt.Errorf("persistent volume claim deletion failed for task %v: %w", key, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the message here is a bit misleading. I would rather suggest: "to be deleted persistent volume claim not found for task %v: %w"

if err != nil {
return fmt.Errorf("persistent volume claim deletion failed for task %v: %w", key, err)
}
logger.V(logging.DEBUG).Info("Persistent Volume Claim already exists")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logging statement can be deleted. The log entry of line 392 is enough to cover the happy path.

return fmt.Errorf("persistent volume claim deletion failed for task %v: %w", key, err)
}

logger.V(logging.DEBUG).Info("Deleted Persistent Volume Claim for task %v", key)
Copy link
Member

@strieflin strieflin Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test fails (see https://github.com/carbynestack/klyshko/actions/runs/11569275970/job/32208185307) here as it should be:

logger.V(logging.DEBUG).Info("Deleted Persistent Volume Claim for task")

The task identifier is injected by the logger already (see line 375).

strieflin
strieflin previously approved these changes Oct 29, 2024
Copy link
Member

@strieflin strieflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@sbckr sbckr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM but @adrian-vaca-humanes-wt you should still add yourself to the NOTICE.md file 😃.
(Grouped by company; alphabetical order)

@adrian-vaca-humanes-wt
Copy link
Contributor Author

Code LGTM but @adrian-vaca-humanes-wt you should still add yourself to the NOTICE.md file 😃. (Grouped by company; alphabetical order)

Done!

P.S. Had to force-push to fix the signatures for the DCO check.

Copy link
Member

@sbckr sbckr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@strieflin
Copy link
Member

Hi @adrian-vaca-humanes-wt, there is still one commit that is not signed, i.e., 2850294. This commit is from me and is there on master already, where it is signed and signed off. How did you bring that commit into your branch? Seems not by means of a rebase. Could you please fix that?

@adrian-vaca-humanes-wt
Copy link
Contributor Author

Hi @adrian-vaca-humanes-wt, there is still one commit that is not signed, i.e., 2850294. This commit is from me and is there on master already, where it is signed and signed off. How did you bring that commit into your branch? Seems not by means of a rebase. Could you please fix that?

I did a rebase on master after checking it from upstream. If it's possible to retain your signature, I really don't know how.

@strieflin
Copy link
Member

Hm. After the rebase my commit should not be part of your PR commit history at all, right? You could try to drop that commit and then do the rebase again. Be sure that it is really a rebase and not a merge commit. If that doesn't work, a workaround would be to branch of from master again, redo/replay your changes, and opening a new PR.

@adrian-vaca-humanes-wt
Copy link
Contributor Author

Hm. After the rebase my commit should not be part of your PR commit history at all, right? You could try to drop that commit and then do the rebase again. Be sure that it is really a rebase and not a merge commit. If that doesn't work, a workaround would be to branch of from master again, redo/replay your changes, and opening a new PR.

Hopefully it's good now!

And you are right, I might have pulled before pushing making the rebase effectively a merge then.

Copy link
Member

@sbckr sbckr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@strieflin strieflin merged commit aeaa058 into carbynestack:master Oct 30, 2024
8 checks passed
strieflin pushed a commit that referenced this pull request Oct 31, 2024
prevent tasks in terminal state from leaking PVCs

Signed-off-by: Adrian Vaca Humanes <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants