More robust process management #12

Open
wants to merge 3 commits into
base: main
from


pcrock-thmdo
Contributor

Currently, when the container receives a signal, we manually forward it to child processes with a bit of bash trap logic.

(Only SIGINT and SIGTERM are forwarded; other signals are ignored.)

This is a bit hacky, and really requires that we know what we're doing.
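For readers who haven't seen the pattern, here is a minimal sketch of what trap-based forwarding typically looks like. It is not the script this PR removes: the function and variable names are hypothetical, and it glosses over edge cases (for example, a signal arriving just as the child exits), which is part of why the approach is fragile.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of manual signal forwarding; all names are made up.

# Run a command in the background, forward SIGTERM/SIGINT to it,
# and return its exit status.
run_with_forwarding() {
  "$@" &
  child_pid=$!
  forwarded=0

  # Only these two signals are forwarded; nothing else is handled.
  # Caveat: without job control, bash starts background commands with SIGINT
  # ignored, so the forwarded INT only works if the child installs its own handler.
  trap 'forwarded=1; kill -TERM "$child_pid" 2>/dev/null || true' TERM
  trap 'forwarded=1; kill -INT "$child_pid" 2>/dev/null || true' INT

  # A trapped signal makes `wait` return early with a status above 128,
  # so keep waiting until a wait completes without a signal being forwarded.
  while :; do
    forwarded=0
    status=0
    wait "$child_pid" || status=$?
    [ "$forwarded" -eq 1 ] || break
  done

  trap - TERM INT
  return "$status"
}
```

A wrapper script would call it as something like `run_with_forwarding python3 -m http.server`. Every additional child process or signal needs more of this bookkeeping.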

This change replaces that bash hackery with a proper 🧐 process supervisor, tini.

Full explanation in doc/process-management.md -- that's probably the best place to start looking at this PR.

testing

it's recommended that we do a full test with a couple of our containers in heroku before merging this. to test locally:

in one terminal, build and start your container:

make python
docker run --rm -it \
  --env PORT=8080 \
  --env APP_PORT=8000 \
  --env SIGSCI_ACCESSKEYID="$SIGSCI_ACCESSKEYID" \
  --env SIGSCI_SECRETACCESSKEY="$SIGSCI_SECRETACCESSKEY" \
  localhost/thermondo-sigsci:python-3.12 \
  python3 -m http.server

in another terminal, attach to the running container and start sending signals to various processes to see what happens:

docker exec -it "$(docker container ls --latest --quiet)" /bin/bash

# install `ps`, `kill`, etc.
apt update && apt install --yes procps

# see what processes are running
ps -e

# send signals to those processes, e.g. SIGTERM (the default) to PID 1
kill 1

if you try to kill the bash process, nothing will happen. but if you kill python3, tini, or sigsci-agent, then the container should exit.

alternatively, instead of docker exec, you can try stopping the container:

docker stop "$(docker container ls --latest --quiet)" --time 999

if the above command stops the container immediately, then you know it works. if it stops only after 999 seconds, then there's a problem.
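if you'd rather not eyeball the timing, a small helper can make the check explicit. this is only a sketch: `finishes_within` is a hypothetical name, not something in this repo, and it measures whole seconds only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given command exits 0
# in fewer than <limit> whole seconds.
finishes_within() {
  local limit=$1
  shift
  local start=$SECONDS rc=0

  # Run the command, remembering a non-zero exit status instead of aborting.
  "$@" || rc=$?

  local elapsed=$((SECONDS - start))
  echo "exit status ${rc} after ${elapsed}s (limit ${limit}s)" >&2
  [ "$rc" -eq 0 ] && [ "$elapsed" -lt "$limit" ]
}
```

for example, `finishes_within 10 docker stop "$(docker container ls --latest --quiet)" --time 999` should succeed if signals reach the child processes, and fail (after a long wait) if they don't.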

@pcrock-thmdo pcrock-thmdo self-assigned this Mar 22, 2024
@pcrock-thmdo pcrock-thmdo changed the title Simplify process management More robust process management Mar 22, 2024