AIO-interface: add Unhealthy container state #5307
base: main
docjyJ commented Sep 21, 2024
- Add Unhealthy state
- Replace class with backed enum
Hey Jean-Yves, Simon asked me to have a look at your changes. I left some feedback regarding the code itself.
I really like that you refactored the container state logic from an interface to an enum. However, I think the two concerns could have been split into separate PRs: one to refactor the interface and another to implement the new health states. But the code is already there, so why not.
Looks good otherwise, but I did not test it.
Hi, thanks for the answer. As soon as I have time I'll look into it. I will try to split this into several PRs. Take care,
I'll keep this PR open for the unhealthy state.
1f3f231 to b9ca83a
Conflicts :/
Don't worry, I'll handle it.
b9ca83a to 6a3c340
Solved, up to date and ready (to be tested anyway...)
I have podman on my machine and I can't launch the container... I can't test it and I don't have time to debug...
Why, may I ask? Do you run into an issue here?
Thanks, resolved
Awesome! Looks good to me now!
Health checks may need to be adjusted, probably by reducing the interval during startup to speed up the startup of all containers.
See: https://docs.docker.com/reference/dockerfile/#healthcheck
This should allow for better dependency management.
Thanks for the idea! However, our health checks are currently built in a way that they never fail after a specific time. See for example https://github.com/nextcloud/all-in-one/blob/main/Containers/apache/healthcheck.sh. For the rest, the defaults are good enough imho.
The problem with doing this is that Docker considers the container ready... This should immediately take the container out of the starting state.
Blocking, see above
This is the only reliable way to check if the container is really up and running.
A fix would be to fail the container if nextcloud:9000 is not reachable and add a start period (e.g. 10 minutes):
#!/bin/bash
- nc -z "$NEXTCLOUD_HOST" 9000 || exit 0
+ nc -z "$NEXTCLOUD_HOST" 9000 || exit 1
nc -z 127.0.0.1 8000 || exit 1
nc -z 127.0.0.1 "$APACHE_PORT" || exit 1
if ! nc -z "$NC_DOMAIN" 443; then
    echo "Could not reach $NC_DOMAIN on port 443."
    exit 1
fi
- HEALTHCHECK CMD /healthcheck.sh
+ HEALTHCHECK --start-period=10m CMD /healthcheck.sh
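The effect of switching the first check from exit status 0 to 1 can be sketched as follows. This is an illustration only: `probe`, `check_lenient` and `check_strict` are hypothetical names that do not exist in the real healthcheck.sh, and the default host name is assumed from "nextcloud:9000" above. Docker treats a health check exit status of 0 as healthy and 1 as unhealthy.

```shell
#!/bin/bash
# Sketch only: probe() is a hypothetical wrapper around nc so that the
# checks can be exercised with a stub instead of a real network.
: "${NEXTCLOUD_HOST:=nextcloud}"   # assumed default, not taken from AIO

probe() {
    nc -z "$1" "$2"
}

# Current behaviour: an unreachable upstream ends the check with status 0,
# so Docker reports the container as healthy.
check_lenient() {
    probe "$NEXTCLOUD_HOST" 9000 || return 0
    probe 127.0.0.1 8000 || return 1
    return 0
}

# Proposed behaviour: an unreachable upstream ends the check with status 1,
# so Docker reports the container as unhealthy once the start period is over.
check_strict() {
    probe "$NEXTCLOUD_HOST" 9000 || return 1
    probe 127.0.0.1 8000 || return 1
    return 0
}
```

With a stub such as `probe() { return 1; }`, `check_lenient` returns 0 while `check_strict` returns 1, which is the difference under discussion.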
This is the problem: we cannot specify a time here, as it depends on the overall setup, such as installed apps and features, number of users, the given hardware and more, especially during upgrades.
Yes, I see... Would it be better to manage dependencies like docker compose does?
I fear this is going to have the same problem afaics, no?
@docjyJ LGTM now 😊 Can you please rebase the PR and squash the commits?
730f4f4 to 9482e43
Signed-off-by: Jean-Yves <[email protected]>
9482e43 to 4798489
Done
public function isStarting(): bool {
    return $this == self::Starting;
}

public function isRestarting(): bool {
    return $this == self::Restarting;
}

public function isHealthy(): bool {
    return $this == self::Healthy;
}

public function isUnhealthy(): bool {
    return $this == self::Unhealthy;
}

public function isRunning(): bool {
    return $this->isHealthy() || $this->isUnhealthy() || $this->isStarting() || $this->isRestarting();
}
The Running state corresponds to the old GetRunningContainerState state.
Maybe there is another clearer state. Healthy means that the container is running without any problem detected by the AIO.
@@ -28,7 +30,7 @@ private function PerformRecursiveContainerStart(string $id, bool $pullImage = tr

// Don't start if container is already running
// This is expected to happen if a container is defined in depends_on of multiple containers
- if ($container->GetRunningState() === ContainerState::Running) {
+ if ($container->GetContainerState()->isRunning()) {
container->GetRunningState does not include Starting and Restarting
To keep it easy to understand, I went through the isSomething functions.
if ($responseBody['State']['Running'] === true) {
    return ContainerState::Running;
} else {
    return ContainerState::Stopped;
}
This is true regardless of the Healthy or Starting or Running state.
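As a rough illustration of that point, the sketch below combines the two values that Docker's inspect output exposes as `State.Running` and `State.Health.Status`. The function name, the printed state names and the fallback to "Running" when no health check is defined are assumptions made for this example, not AIO's actual mapping.

```shell
#!/bin/bash
# Sketch only: classify a container from two inspect values.
#   $1: State.Running       ("true" or "false")
#   $2: State.Health.Status ("starting", "healthy", "unhealthy", "none", or empty)
# classify_state and its output names are illustrative, not part of AIO.
classify_state() {
    local running="$1" health="$2"

    # A container that is not running is stopped, whatever its last health status was.
    if [ "$running" != "true" ]; then
        echo "Stopped"
        return 0
    fi

    # State.Running is true for all of the cases below; only the health
    # status tells them apart.
    case "$health" in
        starting)  echo "Starting" ;;
        healthy)   echo "Healthy" ;;
        unhealthy) echo "Unhealthy" ;;
        *)         echo "Running" ;;   # assumed fallback: no health check defined
    esac
}
```

For example, `classify_state true unhealthy` prints `Unhealthy`, even though `State.Running` alone would only say that the container is running.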
@docjyJ I discussed the latest changes with @st3iny and we agree that the latest changes do not make much sense, or at least are not easy enough to merge. I am very sorry for that.
I would like to ask you if you could restore the changes from b20e2a8 by running git checkout <hash> and then push the old changes to a new branch like so: git checkout -b ench/noid/heathcheck-restored. From that commit hash, I think only a check like https://github.com/nextcloud/all-in-one/pull/5307/files#diff-502ad3fbc5a2714763795f78cf314d4a76ada0cb3746530f2fb86fa471dd897bR91-R105 is missing (best added in a new commit).
Can you please do the above? I would do it myself, but unfortunately I don't have your changes locally available anymore.
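For reference, the restore workflow described above can be tried out in a throwaway repository. This is only a sketch: the commits are placeholders, `old_hash` stands in for b20e2a8, the branch name is copied verbatim from the comment, and the remote name `origin` in the final (commented-out) push is an assumption.

```shell
#!/bin/bash
# Sketch only: demonstrates the restore workflow in a temporary repository.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

# Placeholder history: one "old" commit followed by a later one.
echo "old" > file.txt
git add file.txt
git commit -q -m "old changes"
old_hash="$(git rev-parse HEAD)"   # stands in for b20e2a8

echo "new" > file.txt
git commit -q -am "later changes"

# The two steps from the comment above:
git checkout -q "$old_hash"                       # detached HEAD at the old commit
git checkout -q -b ench/noid/heathcheck-restored  # new branch starting at that commit

# Publishing the branch is left commented out because this sketch has no remote:
# git push origin ench/noid/heathcheck-restored
```

After these steps the new branch points at the old commit, so the missing check can be added on top of it as a separate commit.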