Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support raise_signal_exceptions #37

Open
DanielStevenLewis opened this issue Jan 22, 2024 · 5 comments
Open

Support raise_signal_exceptions #37

DanielStevenLewis opened this issue Jan 22, 2024 · 5 comments

Comments

@DanielStevenLewis
Copy link

https://github.com/Betterment/delayed#migrating-from-delayedjob states "that some configurations, like queue_attributes, exit_on_complete, backend, and raise_signal_exceptions have been removed entirely." I think the lack of raise_signal_exceptions (and the reliance on the behaviour described in https://github.com/Betterment/delayed#running-a-worker-process) could prevent me from suggesting switching over from delayed_job to delayed. Would it be difficult to support raise_signal_exceptions and are there any concerns with the idea of supporting it?

@jmileham
Copy link
Member

Can you say more about what your concerns are with the delayed behavior? Delayed's behavior prioritizes finishing jobs that have begun to the extent possible before worker shutdown in an attempt not to waste work and minimize job latency. It also leans into the assumption that not every job payload will have been implemented with ideal semantic idempotency. In our view having a more opinionated and curated worker drain/deployment process is an advantage, but would love to learn more about your context.

@DanielStevenLewis
Copy link
Author

We currently use Delayed::Worker.raise_signal_exceptions = :term with delayed_job. I'm hoping that we can switch over to delayed with minimal work/changes needed, and thereby benefit from the performance enhancements it has, as a quick win.
We restart the job servers whenever we deploy (every few days), and we have jobs that take many hours to run. I'm concerned that without this configuration option, after a deployment we'd have jobs that would take a very long time before they can retry.

Thanks for asking @jmileham . Is there more information I should try provide to better speak to your question?

@jmileham
Copy link
Member

So you're looking to switch to delayed but would need to extend the job timeout, and aren't looking to implement a long-lived draining period in your infra coordination right away? Makes sense. I'll tag out now because @smudge will have smarter thoughts about where to go from here.

@DanielStevenLewis
Copy link
Author

Right! Thanks

@smudge
Copy link
Member

smudge commented Jan 29, 2024

I started looking into this on Friday, but I'll note that it's a little more complicated than simply adding the feature back. We removed it because it was incompatible with delayed's multithreading (where a single worker can claim & work off multiple jobs at once, configured via the max_claims option). Supporting raise_singal_exceptions in a way that would allow the individual job threads to rescue would require some extra signal-passing across threads that I haven't had a chance to explore yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants