Support raise_signal_exceptions #37

DanielStevenLewis · 2024-01-22T19:55:15Z

https://github.com/Betterment/delayed#migrating-from-delayedjob states "that some configurations, like queue_attributes, exit_on_complete, backend, and raise_signal_exceptions have been removed entirely." I think the lack of raise_signal_exceptions (and the reliance on the behaviour described in https://github.com/Betterment/delayed#running-a-worker-process) could prevent me from suggesting switching over from delayed_job to delayed. Would it be difficult to support raise_signal_exceptions and are there any concerns with the idea of supporting it?

jmileham · 2024-01-22T20:25:03Z

Can you say more about what your concerns are with the delayed behavior? Delayed's behavior prioritizes finishing jobs that have begun to the extent possible before worker shutdown in an attempt not to waste work and minimize job latency. It also leans into the assumption that not every job payload will have been implemented with ideal semantic idempotency. In our view having a more opinionated and curated worker drain/deployment process is an advantage, but would love to learn more about your context.

DanielStevenLewis · 2024-01-22T21:42:38Z

We currently use Delayed::Worker.raise_signal_exceptions = :term with delayed_job. I'm hoping that we can switch over to delayed with minimal work/changes needed, and thereby benefit from the performance enhancements it has, as a quick win.
We restart the job servers whenever we deploy (every few days), and we have jobs that take many hours to run. I'm concerned that without this configuration option, after a deployment we'd have jobs that would take a very long time before they can retry.

Thanks for asking, @jmileham. Is there more information I should try to provide to better speak to your question?

jmileham · 2024-01-22T21:47:42Z

So you're looking to switch to delayed but would need to extend the job timeout, and aren't looking to implement a long-lived draining period in your infra coordination right away? Makes sense. I'll tag out now because @smudge will have smarter thoughts about where to go from here.

DanielStevenLewis · 2024-01-22T21:49:52Z

Right! Thanks

smudge · 2024-01-29T17:04:34Z

I started looking into this on Friday, but I'll note that it's a little more complicated than simply adding the feature back. We removed it because it was incompatible with delayed's multithreading (where a single worker can claim & work off multiple jobs at once, configured via the max_claims option). Supporting raise_signal_exceptions in a way that would allow the individual job threads to rescue would require some extra signal-passing across threads that I haven't had a chance to explore yet.
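
To make the cross-thread problem concrete, here is a rough sketch in plain Ruby of what forwarding a signal to job threads could look like. It is not delayed's code or API: the method name `run_jobs_with_forwarded_signal` and the queues are hypothetical, the `sleep` stands in for a long-running job, and the forwarding step is called directly rather than from a real `trap('TERM')` handler. The only thing it relies on is that `Thread#raise` can deliver a `SignalException` into another thread, where the job code can rescue it.

```ruby
# Hypothetical sketch, not part of the delayed gem.
# Starts job_count "job" threads, then forwards a TERM-style exception to
# each of them, the way a worker's main thread might after trapping SIGTERM.
def run_jobs_with_forwarded_signal(job_count)
  ready = Queue.new    # each job thread reports here once it is inside its rescue scope
  results = Queue.new  # each job thread reports how it ended

  threads = Array.new(job_count) do |i|
    Thread.new do
      begin
        ready << i
        sleep # stands in for hours of work; only returns if woken
        results << [i, :finished]
      rescue SignalException
        # This is where a job could clean up and let itself be retried.
        results << [i, :interrupted]
      end
    end
  end

  # Wait until every job thread is inside its begin/rescue block.
  job_count.times { ready.pop }

  # The extra signal-passing step: the exception has to be raised in every
  # job thread explicitly, because a trapped signal only reaches the main thread.
  threads.each { |t| t.raise(SignalException, "TERM") }
  threads.each(&:join)

  Array.new(job_count) { results.pop }.sort
end
```

In a real worker the forwarding would be triggered by a signal handler and would need to cope with threads that are not in a safe place to be interrupted, which is the part that still needs exploring.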
