Is it possible to retry failing specs on another worker? #157
We use rspec-retry to deal with flaky browser specs. Unfortunately, we're currently running into an issue where firefox / selenium / geckodriver crashes and is unrecoverable (we believe it's related to Docker memory / too many file handles). All the retry attempts of the spec fail, and any further specs that use the browser also fail.
We're working towards solving the underlying issue, but is there a mechanism in knapsack to retry failed specs on another worker?
We've had a few other issues that have caused an entire worker to be inoperable (like postgres crashing and specs not waiting for it to recover), so I think this is applicable to more than just our current issue.

Comments
No. Even if there were such a mechanism, there would be edge cases. When could we consider that all parallel jobs completed their work properly? Let's say you have a problematic CI job where the firefox process fails and you can't run tests there, and the tests are put back in the Queue in knapsack_pro Queue Mode so that other parallel CI jobs can consume them. There are probably ways to solve this edge case, for example forcing CI jobs to wait for some time until all test files are acknowledged by CI jobs, so that we know all jobs executed their tests. There would also be an edge case where tests are never acknowledged by any CI job, and we would have to handle that as well, maybe with some timeout. There are probably more edge cases to consider; these are just the ones off the top of my head.
Most likely there is no simple solution right away. We would have to collect more feedback from other users on whether they would find it useful to auto-assign tests to other jobs when a CI job can't run tests, and then try to find as simple a solution as possible to avoid edge cases. The simplest action for you to take for now could be:
What is your organization ID or email? You can send it to [email protected] and I can review your account.
Thanks @ArturT - sorry for the delay, this slipped off my radar. This all makes sense to me, thanks for detailing everything 👍. For our specific case:
I'll send in that email and try to find a few examples of runs exhibiting the crashes, thanks!
I'm pasting here an idea that might be useful for others looking at this issue: you could collect the test file paths of failing tests from all parallel nodes, generate a file with that list, and then use this list of test files to run only the failed tests.
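A rough sketch of that idea, assuming RSpec's JSON formatter and `jq` are available; the file names, the node-index variable, and the final rerun commands are illustrative, not taken from this thread:

```bash
# Hypothetical helper, not from the thread: collect failed spec files per node,
# merge them, and rerun only those files.

# 1. On each parallel CI node, write RSpec results as JSON and extract failed files.
bundle exec rspec --format progress --format json --out tmp/rspec_results.json || true
jq -r '.examples[] | select(.status == "failed") | .file_path' tmp/rspec_results.json \
  | sort -u > "tmp/failed_specs_node_${CIRCLE_NODE_INDEX}.txt"

# 2. After all nodes finish, gather the per-node lists (e.g. via CI artifacts)
#    and merge them into a single file.
cat tmp/failed_specs_node_*.txt | sort -u > tmp/failed_specs.txt

# 3. Rerun only those files, e.g. with plain RSpec...
xargs bundle exec rspec < tmp/failed_specs.txt

# ...or hand the list to knapsack_pro so the rerun is still split across nodes
# (KNAPSACK_PRO_TEST_FILE_LIST_SOURCE_FILE is the gem's option for running a fixed list of test files).
KNAPSACK_PRO_TEST_FILE_LIST_SOURCE_FILE=tmp/failed_specs.txt \
  bundle exec rake knapsack_pro:queue:rspec
```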
The idea of running failed tests on another worker/CI node is part of the idea of improving the Queue API.
We're not running into the original problem I posted about anymore (feel free to close this issue), but as an FYI: CircleCI has started experimental support for "rerun failed tests only". We're trying this out, and here's how we combine CircleCI failure retries with knapsack:
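One way such a setup can be wired is sketched below, assuming CircleCI's `circleci tests glob` / `circleci tests run` commands and knapsack_pro Queue Mode; the wrapper script name `bin/ci_rspec` and the branching logic are illustrative, not the commenter's actual setup:

```bash
#!/usr/bin/env bash
# bin/ci_rspec -- hypothetical wrapper passed to `circleci tests run --command`.
# CircleCI pipes the test files to run on stdin: the full globbed list on a normal
# run, or only the previously failed files on a "rerun failed tests only" run.
set -euo pipefail

FILES="$(cat)"                                          # spec files piped in by CircleCI
ALL_FILES="$(circleci tests glob "spec/**/*_spec.rb")"

if [ "$(echo "$FILES" | wc -l)" -eq "$(echo "$ALL_FILES" | wc -l)" ]; then
  # Normal run: all spec files were passed, so let knapsack_pro Queue Mode split the work.
  bundle exec rake knapsack_pro:queue:rspec
else
  # Failed-tests-only rerun: run exactly the files CircleCI handed us.
  echo "$FILES" | xargs bundle exec rspec
fi

# In .circleci/config.yml the step would then look roughly like:
#   circleci tests glob "spec/**/*_spec.rb" | circleci tests run --command="bin/ci_rspec" --verbose
```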
@MarkyMarkMcDonald Thanks for sharing the example.
SOLUTION: Here is an example of how to rerun only failed tests on CircleCI: