-
I am facing the same situation (1 dropped out of 5) for …
-
Also having the same problem here, with the Actions Runner Controller. I suspect the API is having issues (the managed runners are behaving similarly poorly) but the status page is not really reflecting reality.
-
I am also having exactly the same problem. Can someone please advise?
-
Any updates? This issue is blocking a lot of things for us.
-
Experiencing the same thing. Restarting the runner service doesn't change anything; recreating the runner fixes the issue, but it's a hassle if the runners keep going idle and have to be recreated often. Still don't know the cause.
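For reference, "recreating" the runner here means unregistering it and registering it fresh. A minimal sketch, assuming an install at ~/actions-runner running as a service; <org> and <REG_TOKEN> are illustrative placeholders for your organization and a fresh registration token from the org's runner settings:

```bash
# Minimal sketch of recreating a self-hosted runner.
# Assumes ~/actions-runner installed as a service; <org> and <REG_TOKEN>
# are placeholders for your organization and a fresh registration token.
cd ~/actions-runner
sudo ./svc.sh stop
sudo ./svc.sh uninstall
./config.sh remove --token <REG_TOKEN>

# Re-register with the same URL, then reinstall and start the service.
./config.sh --url https://github.com/<org> --token <REG_TOKEN>
sudo ./svc.sh install
sudo ./svc.sh start
```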
-
I think it was working fine until version 2.304; there were no issues with that version. GitHub forces the version update even when it has not been tested properly.
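If the auto-update is the suspect, config.sh has a --disableupdate flag that registers the runner with self-updates turned off; note that GitHub still expects runners to stay within a supported version window, so this is a stopgap rather than a fix. A sketch, assuming a fresh registration with illustrative placeholders:

```bash
# Register a runner with automatic self-updates disabled.
# <org> and <REG_TOKEN> are placeholders.
./config.sh --url https://github.com/<org> \
            --token <REG_TOKEN> \
            --disableupdate
```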
This comment was marked as off-topic.
-
Recreating a runner makes the other runners pick up jobs automatically. This is very difficult to understand.
-
2 runners started to pick up jobs now. So instead of 4 out of 8 not working, it improved to 2 out of 8 not working, but I didn't change anything; GitHub is probably working on a fix. The runners are updated to …
-
Same error here, after (auto-)updating from 2.315 to 2.316. Reinstalling the runners (Current runner version: '2.316.0') works, and now they are processing jobs again.
-
@jcahigal But it doesn't last long; the runner will fail to pick up jobs again the next time you run a workflow.
-
Do you guys use these self-hosted runners most of the time? |
-
I have 105 self-hosted runners, most running continuously every day. I had two waves of runners going idle, probably related to when they updated. All seems to be in order after the second round of recreating the runners; all runners are currently working as expected.
-
Having this on 2.317.0: sometime around late June our runners just stopped picking up jobs. The repo is private, all config.sh checks pass, and removing and re-adding the runner did not help.
-
I feel that the confusion arises from the workflow trying to determine which runner to choose, as they all have the same labels and belong to the same group. So I suggest adding an additional label (like a tag) to distinguish between similar runners.
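One way to do that is to pass an extra, unique label at registration time and target it from the workflow. A sketch with an illustrative builder-01 label and placeholder org/token:

```bash
# Register a runner with an extra distinguishing label; a job can then
# target this specific machine with: runs-on: [self-hosted, builder-01]
./config.sh --url https://github.com/<org> \
            --token <REG_TOKEN> \
            --labels builder-01
```

(The default self-hosted label is still applied automatically, so jobs that only specify runs-on: self-hosted keep working.)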
-
Topic Area: Bug
Problem
I have 8 self-hosted runners that are configured identically. 4 of them stopped working yesterday (22 Apr) around 08:44 UTC.
When a workflow runs, 4 of the runners actively pick up jobs; the other 4 stay idle as if there were no jobs. Those 4 runners are not offline: they are marked as online, connected, and idle, yet they pick up nothing. The expected behaviour is that all 8 of them pick up jobs.
It is not related to any concurrency limit; it is consistently those specific 4 runners not picking up jobs, across different repos in the organization.
What it looks like
The runners page in the organization, 4 active, 4 idle (screenshot).
But at the same time, there are plenty of checks waiting for a runner to pick them up (screenshot).
About the runners
- All on version `2.315.0`
- All labeled `self-hosted`
- All in the same runner group, `Default`
I couldn't figure out why 4 of them work while the other 4 suddenly stopped.
What I have tried to fix it
- `sudo ./svc.sh stop` and then `sudo ./svc.sh start`. This marks the service as offline and then idle again, but it still doesn't pick up jobs.
- `sudo reboot`; similarly, the service goes offline and then comes back online, but still doesn't pick up jobs.
- `./run.sh --check --url <org_url> --pat <my_pat>`; all checks passed.
- Checking the runner logs in the `_diag` directory: only this error since it stopped working.
- Checking the last worker logs in the `_diag` directory: nothing special.
- Checking `journalctl`: no errors logged, as if no new job is ever requested.
- Checking their versions: `2.315.0`, all at the latest version.
- Docker is installed and active (`sudo systemctl is-active docker.service`).

Has anyone faced this before? Any clue what else I can check? Could this be a bug on GitHub's side? Thanks in advance!
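For anyone triaging the same symptom, a minimal log-checking sketch, assuming the runner lives in ~/actions-runner and was installed as a systemd service (svc.sh names the unit actions.runner.<org>.<runner-name>); the date below is illustrative:

```bash
# Scan the listener logs in _diag for errors around the incident.
cd ~/actions-runner
grep -ri "error" _diag/Runner_*.log | tail -n 20

# Check the service; journalctl accepts a glob for the unit name.
sudo ./svc.sh status
journalctl -u 'actions.runner.*' --since "2024-04-22" --no-pager | tail -n 50
```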