Small poll interval #486

Closed
hillac opened this issue Aug 22, 2024 · 5 comments
Labels
question Further information is requested

Comments

hillac commented Aug 22, 2024

I want to use this mostly to schedule future jobs that need to run on the second. If I have many workers and a short pollInterval, that will be hammering the db, right? So at a pollInterval of 0.5s, would I have 2n queries per second for n workers? Would it make more sense to use pg_cron to run the query, so it only runs once per interval, and send out a NOTIFY when there's a result? Or is the interval query already debounced across multiple workers?

Alternatively, do you think it would make sense to set the job's run time to 5 seconds before it actually needs to run, put the actual time t in the payload, and have the executor start with an async timeout of t minus now before the actual job? That way the poll interval could be 5s, and I would get very accurate job timing.

Thanks!

@benjie benjie added the question Further information is requested label Aug 22, 2024
Member

benjie commented Aug 22, 2024

With current versions you'll get 2 * number_of_worker_instances * concurrency queries per second at that poll interval... Not ideal! With the @canary release (which has a timing bug that we've not managed to track down yet), it will only be 2 * number_of_worker_instances queries per second, except if your queue is empty (or there are no jobs to execute), in which case it may spike higher.
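To make the arithmetic above concrete, here is a minimal TypeScript sketch of the two estimates. The function and field names are illustrative and not part of Graphile Worker's API; the formulas simply restate the figures quoted above and ignore the empty-queue spike mentioned for @canary.

```typescript
// Hypothetical description of a deployment; names are illustrative only.
interface PollLoad {
  workerInstances: number; // number of worker processes
  concurrency: number;     // concurrent job slots per process
  pollIntervalMs: number;  // poll interval in milliseconds
}

// Current releases, per the figures above: every concurrent slot polls on its own.
function pollQueriesPerSecondCurrent(load: PollLoad): number {
  return (1000 / load.pollIntervalMs) * load.workerInstances * load.concurrency;
}

// @canary, per the figures above: one poll per worker instance.
function pollQueriesPerSecondCanary(load: PollLoad): number {
  return (1000 / load.pollIntervalMs) * load.workerInstances;
}

// Example: 4 instances, concurrency 5, polling every 0.5s.
const example: PollLoad = { workerInstances: 4, concurrency: 5, pollIntervalMs: 500 };
console.log(pollQueriesPerSecondCurrent(example)); // 40
console.log(pollQueriesPerSecondCanary(example));  // 8
```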

Your alternative approach sounds much more sensible, depending on your load and throughput requirements: simply execute the jobs ahead of time and wait until the target time to run the main logic. Make sure you have sufficient concurrency for this.
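A rough sketch of that "schedule early, then wait" idea in TypeScript follows. It is deliberately library-agnostic: the payload shape, the 5-second lead time, and every function name here are assumptions made for illustration, not Graphile Worker APIs.

```typescript
// Hypothetical payload shape: the real target time travels with the job.
interface TimedPayload {
  targetTime: string; // ISO 8601 timestamp at which the main logic should start
}

// Assumption: queue the job this far ahead of its target time.
const LEAD_TIME_MS = 5_000;

// When to schedule the job so that a worker picks it up before the target time.
function earlyRunAt(targetTime: Date, leadTimeMs: number = LEAD_TIME_MS): Date {
  return new Date(targetTime.getTime() - leadTimeMs);
}

// Milliseconds left to wait; 0 if the target has already passed (e.g. on a retry).
function remainingDelayMs(targetTime: Date, now: Date = new Date()): number {
  return Math.max(0, targetTime.getTime() - now.getTime());
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Task body: wait out the remainder, then run the main logic.
async function runAtTargetTime(
  payload: TimedPayload,
  mainLogic: () => Promise<void>,
): Promise<void> {
  await sleep(remainingDelayMs(new Date(payload.targetTime)));
  await mainLogic();
}
```

When queuing, the job would be given `earlyRunAt(target)` as its scheduled run time (I believe `addJob` accepts a `runAt` option for this, but check the docs for your version) and `{ targetTime: target.toISOString() }` as its payload. Each waiting job occupies a concurrency slot for up to the lead time, which is why sufficient concurrency matters here.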

Graphile Worker is not currently designed for sub-second accuracy for delayed execution, and I'm not sure it ever will be. The point of a background job queue is to be able to execute jobs in the background (and reattempt them as necessary); queued jobs should only be those that can happily execute seconds to minutes later without much problem, for example when there are more jobs than workers available to execute them.

@benjie benjie closed this as completed Aug 22, 2024
Author

hillac commented Sep 4, 2024

Thanks for the reply. Is there an open issue for the canary timing bug somewhere? I couldn't see anything in issues.

Member

benjie commented Sep 4, 2024

No; it was reported via the Discord.

Author

hillac commented Oct 13, 2024

Any update on that timing bug? Mind linking the Discord comment so I can see?

Member

benjie commented Oct 13, 2024

Turns out it was reported on the PR: #474 (comment)
