Use initializer to ensure that all workers run worker_setup() #1919
Issue
Closes #1918
Description
When creating the multiprocessing pool, we want each worker to run the worker_setup() function, which ensures that the worker is started and has the operation modules loaded.
For low core counts, calling pool.map() usually did the job, but it does not guarantee that each task goes to a different worker. On machines with a large number of CPUs it is possible for some workers to run the task multiple times and for others not to run it at all.
Instead, use the initializer, which runs a task on each worker:
"If initializer is not None then each worker process will call initializer(*initargs) when it starts." -- https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool
We no longer need to pass a dummy argument to setup_worker().
Testing & Acceptance Criteria
I can't reliably trigger this issue, even on the large 60-core IDAaaS workspaces.
The best we can do is check that it does not make things worse, i.e. check that MI starts up and that an operation such as Rotate Stack works.
Documentation
Release notes