Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: leverage retry-action in build step as well #506

Closed
wants to merge 4 commits into from

Conversation

dylanmtaylor
Copy link
Contributor

@dylanmtaylor dylanmtaylor commented Feb 19, 2024

This is a different variant of the other PR (#503) but with the build step having a retry on it as well.

@bsherman - I think this might be the better version of the other PR, as it had a sporadic build failure just from the PR's CI running.

…major steps

This is a different variant of the other PR but with the build step having a retry on it as well.
Copy link
Contributor

@bsherman bsherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things:

  1. that "@master" version came back :-)
  2. I really dislike the yaml spaces workaround, though I confess I struggle to articulate a strong technical argument, only that it seems a like a hack.
  3. Despite the apparent win of always retrying on build image step failures... my concern is we'll be retrying and not fixing underlying issues. Also, I think there's another solution to the specific issue of failures pulling the "FROM" image which is less brute force.

Generally, I'm happy to see iterative improvements. So, I'd be happy to see #503 pushed as is since we agree on those bits. Then, we can work on another iteration specifically targeting the build_image step.

However, if you want I can push a commit to that PR to demonstrate my idea.

@dylanmtaylor
Copy link
Contributor Author

dylanmtaylor commented Feb 20, 2024

A few things:

1. that "@master" version came back :-)

2. I really dislike the yaml spaces workaround, though I confess I struggle to articulate a strong technical argument, only that it seems a like a hack.

3. Despite the apparent win of always retrying on build image step failures... my concern is we'll be retrying and not fixing underlying issues. Also, I think there's another solution to the specific issue of failures pulling the "FROM" image which is less brute force.

Generally, I'm happy to see iterative improvements. So, I'd be happy to see #503 pushed as is since we agree on those bits. Then, we can work on another iteration specifically targeting the build_image step.

However, if you want I can push a commit to that PR to demonstrate my idea.

I thought I fixed the @master version. That one is on me.
I agree on #2 and really, really want to figure out a cleaner way to do that.
For #3, if a simple retry or two does not result in a success, we will still fail the job. That would prompt us to fix underlying issues. I would much rather have a successful image get generated than worry about running the job a few times.

I added an output fix to #503 and I think it can go in now. I can then rebase this PR on that for future discussion.

@dylanmtaylor dylanmtaylor changed the title feat: leverage retry-action to increase reliability of builds on all major steps feat: leverage retry-action in build step as well Feb 20, 2024
@bsherman
Copy link
Contributor

  1. Despite the apparent win of always retrying on build image step failures... my concern is we'll be retrying and not fixing underlying issues. Also, I think there's another solution to the specific issue of failures pulling the "FROM" image which is less brute force.

For #3, if a simple retry or two does not result in a success, we will still fail the job. That would prompt us to fix underlying issues. I would much rather have a successful image get generated than worry about running the job a few times.

Having had more time to use workflows with the retry action, I am more confident that I do not want retry action wrapping the build step for our jobs.

The retry action is very useful where there are known spurious failures, so it's great for our pull/push steps as being used now. However, it complicates debugging as well as any inputs/outputs to/from the wrapped step. The build step is exactly where we need maximum visibility and easy of use with regard to inputs/outputs. Also, now that we are pulling in sourced images before building, spurious failures in build step have nearly ceased to occur.

Closing this as we won't be adding retry action to build step.

@bsherman bsherman closed this Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants