-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
create datasetBlockages collection + block datasets #2933
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks all good to me, though I'm concerned that it might slow down the next_waiting_job query if there are blocked datasets with thousands of their jobs in the queue (it will have to iterate over them all to filter them and get back a job from an unblocked dataset).
Anyway if it's rhe case it can be fixed by defining a target_start_time
that is equal to created_at
by default and that can be modified to later values (e.g. now + 1h) if a dataset is blocked
yes, I thought the same: I just changed to remove the short jobs (they should not affect too much anyway) + store as int, not float |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I would suggest creating or adding some tests with memory limits to ensure that it won't affect too much the next_waiting_job function.
5c0cea7
to
b0cb05f
Compare
…elow JOB_DURATION_MIN_SECONDS
b0cb05f
to
0506abf
Compare
Hmmm I think it will have nearly no impact, since the collection of blocked datasets shouldn't have more than <10 entries, if everything works well. The only "expensive" operation we add is when we compute the sum of the durations (that I optimized a bit with #2933 (comment)), but it's run on job termination, not on job start. |
The CI error is due to issues on the Hub. Merging |
We apply rate limiting on the jobs, based on the total duration in a window (see #2279 (comment)).
Follows #2931