feat: Add support for retrying copa patch on failure/timeout #49

Open
SaptarshiSarkar12 opened this issue Sep 7, 2024 · 6 comments · May be fixed by #50

@SaptarshiSarkar12

Problem

Because of network issues, copa patch often fails with a timeout. Manually re-running GitHub Actions jobs after each failure is tedious.

Solution proposed

We can add an optional GitHub Actions input, max_attempts, which sets the maximum number of times copa patch is run if it keeps failing. In addition, at the end the workflow can print the number of attempts it took to succeed.
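
The proposed behaviour can be sketched roughly as follows. This is only an illustration of the retry idea, not how the copa action implements or would implement it: `run_with_retries` and `make_command_runner` are hypothetical helper names, and only `max_attempts` comes from the proposal above.

```python
import subprocess
from typing import Callable, Sequence


def run_with_retries(run_once: Callable[[], bool], max_attempts: int = 3) -> int:
    """Call run_once until it returns True or max_attempts is exhausted.

    Returns the number of attempts used on success and raises RuntimeError
    if every attempt fails. (Hypothetical helper, not part of copa.)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if run_once():
            # Report how many attempts were needed, as the proposal suggests.
            print(f"Succeeded after {attempt} attempt(s)")
            return attempt
        print(f"Attempt {attempt} of {max_attempts} failed")
    raise RuntimeError(f"Command failed after {max_attempts} attempts")


def make_command_runner(command: Sequence[str]) -> Callable[[], bool]:
    """Wrap a command line so that each call runs it once and reports success."""
    def run_once() -> bool:
        # A zero exit code is treated as success; anything else triggers a retry.
        return subprocess.run(list(command)).returncode == 0
    return run_once
```

A caller would pass the real copa patch command line as the `command` list and the value of the `max_attempts` input as the second argument. Keeping the retry loop separate from the command runner makes the loop easy to exercise with a fake step.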

Additional Information

I would like to work on this issue if the maintainers approve it.

@ashnamehrotra
Contributor

@SaptarshiSarkar12 thanks for the issue! I think a better solution to the timeout error would be to change the timeout arg in copa, which we already support through the copa action. If the timeout is left at the default (5 min) and copa patch is failing because of the network, running it multiple times will still result in the same error.
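
The reasoning above can be illustrated with a toy calculation, assuming the patch consistently needs more time than the timeout allows. The 7-minute duration is a made-up number, and `attempts_needed` is a hypothetical helper, not part of copa.

```python
from typing import Optional


def attempts_needed(required_minutes: float, timeout_minutes: float,
                    max_attempts: int) -> Optional[int]:
    """Return the attempt number that succeeds, or None if all attempts time out.

    Assumes every attempt needs the same amount of time, i.e. the slowness
    is persistent rather than intermittent.
    """
    for attempt in range(1, max_attempts + 1):
        if required_minutes <= timeout_minutes:
            return attempt
    return None


# A patch that needs 7 minutes never fits in a 5-minute timeout,
# however often it is retried.
print(attempts_needed(7, timeout_minutes=5, max_attempts=3))   # None
# With a 10-minute timeout the first attempt already succeeds.
print(attempts_needed(7, timeout_minutes=10, max_attempts=3))  # 1
```

Under this assumption retries add nothing; they can only help when the time needed varies from one attempt to the next.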

@SaptarshiSarkar12
Author

@ashnamehrotra I have changed the timeout to 10 minutes, but it still reports a timeout error. I am currently using retry-action, which handles the issue by re-running the step when it fails. However, that is a third-party solution, and I was looking for an official one from Copa.
You can check these workflow runs, where the retry action automatically re-ran the failed step 👇

and many more.
I hope these workflow runs are enough to demonstrate the need for this feature. Please let me know what you think.

@ashnamehrotra
Contributor

@SaptarshiSarkar12 that's interesting. Out of curiosity, if you use the default and remove docker/[email protected], does it still result in the same timeout error?

@SaptarshiSarkar12
Author

@ashnamehrotra They work fine, but the overall workflow fails because the Docker Buildx version installed by default does not support cache export.
Also, the timeout occurs in the dev-docker-build workflow but not in the docker-publish workflow, because the oraclelinux patch starts at a different time in each, roughly 2 minutes apart.

@ashnamehrotra
Contributor

@SaptarshiSarkar12 got it, I will assign this issue to you. Thanks!

@SaptarshiSarkar12
Author

Thank you @ashnamehrotra for assigning the issue to me 😄.

SaptarshiSarkar12 linked a pull request on Sep 21, 2024 that will close this issue