Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run queue tasks continuously #6062

Open
davidbarratt opened this issue Jul 14, 2024 · 24 comments
Open

Run queue tasks continuously #6062

davidbarratt opened this issue Jul 14, 2024 · 24 comments

Comments

@davidbarratt
Copy link

davidbarratt commented Jul 14, 2024

Is your feature request related to a problem? Please describe.
We use Drupal's Queue feature to handle background tasks. However, many of these tasks are time sensitive and need to be executed in the background as soon as possible.

Drush doesn't provide a way to run a specific queue as a daemon, it can only be invoked on a schedule.

Describe the solution you'd like
I would really like a way to run Drupal queue as a long running task in the foreground. Then I can use systemd, docker, or k8s to keep it running.

Maybe a queue:watch command that is like:

public function run(string $name, $options = ['time-limit' => self::REQ, 'items-limit' => self::REQ, 'lease-time' => self::REQ]): void
{
$time_limit = (int) $options['time-limit'];
$items_limit = (int) $options['items-limit'];
$start = microtime(true);
$worker = $this->getWorkerManager()->createInstance($name);
$info = $this->getWorkerManager()->getDefinition($name);
$end = time() + $time_limit;
$queue = $this->getQueue($name);
$count = 0;
$remaining = $time_limit;
$lease_time = $options['lease-time'] ?? $info['cron']['time'] ?? 30;
if ($queue instanceof QueueGarbageCollectionInterface) {
$queue->garbageCollection();
}
while ((!$time_limit || $remaining > 0) && (!$items_limit || $count < $items_limit) && ($item = $queue->claimItem($lease_time))) {
try {
// @phpstan-ignore-next-line
$this->logger()->info(dt('Processing item @id from @name queue.', ['@name' => $name, '@id' => $item->item_id ?? $item->qid]));
// @phpstan-ignore-next-line
$worker->processItem($item->data);
$queue->deleteItem($item);
$count++;
} catch (RequeueException) {
// The worker requested the task to be immediately requeued.
$queue->releaseItem($item);
} catch (SuspendQueueException $e) {
// If the worker indicates there is a problem with the whole queue,
// release the item.
$queue->releaseItem($item);
throw new \Exception($e->getMessage(), $e->getCode(), $e);
} catch (DelayedRequeueException $e) {
// The worker requested the task not be immediately re-queued.
// - If the queue doesn't support ::delayItem(), we should leave the
// item's current expiry time alone.
// - If the queue does support ::delayItem(), we should allow the
// queue to update the item's expiry using the requested delay.
if ($queue instanceof DelayableQueueInterface) {
// This queue can handle a custom delay; use the duration provided
// by the exception.
$queue->delayItem($item, $e->getDelay());
}
} catch (\Exception $e) {
// In case of any other kind of exception, log it and leave the
// item in the queue to be processed again later.
$this->logger()->error($e->getMessage());
}
$remaining = $end - time();
}
$elapsed = microtime(true) - $start;
$this->logger()->success(dt('Processed @count items from the @name queue in @elapsed sec.', ['@count' => $count, '@name' => $name, '@elapsed' => round($elapsed, 2)]));
}

but that is running in an infinite loop (with a usleep()?) until the time limit or item limit is reached? Basically, it should keep running even if there are zero items in the queue right now.

It might be nice to also limit the number of iterations by the amount of memory (i.e. memory_get_usage()) which would allow the script to be kept alive for as long as possible.

Describe alternatives you've considered

  • Running in a cron tab, but the most frequent you can set that to is once a minute.
  • Using a loop in a shell script that runs drush queue:run, but then we are doing the work to bootstrap Drupal and connect to the database on every iteration.

Related:

Additional context
N/A

@davidbarratt
Copy link
Author

davidbarratt commented Jul 15, 2024

I think this might be blocked on https://www.drupal.org/project/drupal/issues/2218651

The work-around is to bootstrap Drupal with each queue run, which isn't the worst, but I can put drush queue:run in a shell script with an infinite loop and a sleep and I think that's probably the best option for now.

@Chi-teck
Copy link
Collaborator

I would really like a way to run Drupal cron or a specific queue as a long running task in the foreground.

Cron as long running task sounds weird. Cron is meant for scheduled tasks. If you have a need to constantly process some large data, i.e. import products from external sources it's better to create a worker which could be an ordinary Drush command. Furthermore, even for scheduled custom tasks I would not mess with hook_cron but create a separate Drush command and put it to crontab like follows.

1 * * * * ./vendor/bin/drush example >> ./var/log/cron/exampe.log 2>&1

@davidbarratt
Copy link
Author

Ah that's right, I forget that modules can run tasks directly on cron run so maybe just having a long-running queue runner is actually what I want.

@davidbarratt davidbarratt changed the title Run cron / queue tasks continuously Run queue tasks continuously Jul 15, 2024
@davidbarratt
Copy link
Author

@Chi-teck I updated the description, I hope this is helpful. it's honestly probably just an option on the existing command tbh

@jonpugh
Copy link
Contributor

jonpugh commented Sep 12, 2024

I was tinkering with using (Advanced) Queue API for running server side commands. I thought there was a daemon!

FWIW... This is how Aegir does it, with a Drush 8 command "hosting-queued": https://git.drupalcode.org/project/hosting/-/blob/7.x-3.x/queued/hosting_queued.drush.inc?ref_type=heads#L52-214

Would this be useful? https://github.com/mac-cain13/daemonizable-command

Gonna look into this...

@webflo
Copy link
Contributor

webflo commented Oct 11, 2024

I implemented something similar as a command some time ago. However, I don't currently use it. https://github.com/ueberbit/drush_queue_daemon

@claudiu-cristea
Copy link
Member

Not sure if this module I've built some time ago could help https://www.drupal.org/project/recurring_task

@pierreabreup
Copy link

I implemented something similar as a command some time ago. However, I don't currently use it. https://github.com/ueberbit/drush_queue_daemon

@webflo I liked your implementation, and I've tested it on my machine. If you don't mind explaining, why have you stopped to use that solution?

@webflo
Copy link
Contributor

webflo commented Oct 20, 2024

@pierreabreup I no longer work for the client. But I think the code is still in use. I had no problems with it. I would use it again.

@DieterHolvoet
Copy link
Contributor

I also built a lightweight (doesn't bootstrap Drupal unless necessary) queue runner that's currently running on multiple production sites: https://gist.github.com/DieterHolvoet/2ea792445f3498278e82221ab198288e

@Chi-teck
Copy link
Collaborator

@DieterHolvoet I suspect this eats 100% of CPU.

while ($item = $queue->claimItem())

It needs some sleep on each iteration.

@DieterHolvoet
Copy link
Contributor

It needs some sleep on each iteration.

Why's that?

@Chi-teck
Copy link
Collaborator

Never mind. It'll claims items unless the queue is empty.

However, I think it'll not much better than drush queue:run in terms of performance Because @bootstrap database is also quite expensive procedure.

@Chi-teck
Copy link
Collaborator

Also restarting a process once a second could eat some resources.

@Chi-teck
Copy link
Collaborator

@DieterHolvoet why not just put sleep(1) into the foreach loop and make it true long running process?

@DieterHolvoet
Copy link
Contributor

To prevent memory leaks, or just memory piling up in general through e.g. static (entity) caches. Drupal so far isn't really optimized to be kept running for a long time, right? I'm not sure what is more performant, I'm also not saying my solution is the best one. All I can say is that it hasn't blown up any servers so far :) Would be interesting to compare the impact of different kinds of solutions.

@Chi-teck
Copy link
Collaborator

Drupal so far isn't really optimized to be kept running for a long time, right?

I would say it's pretty good with it. Typically, memory leaks because developers do not care about it. For instance, for static entity cache the '@entity.memory_cache' service can help to delete unused cache items from the cache. And that's actually needs to be done in queue plugin that populated that cache not in the Drush command. Inside the Drush command you may implement sort of memory guard. It's not a big deal to check with memory_get_usage() if memory exceeded limit and exit from the endless loop allowing systemd to restart the worker. I vaguely remember that Drush itself had used such approach some time ago.

@Chi-teck
Copy link
Collaborator

To summarize PHP does not leak at all, at least in latest versions. Leaks mostly caused by custom code. However, small leaks like a few kB per day I would not even bother to debug. For the reference, a simple Drush command that does nothing consumes about 70 MB itself.

Note that systemd is apparently does not have a way to gracefully restart a service when it exceed the configured memory limit. However, there are other process managers that do have have such an option. For instance, RoadRunner and Supervisord (via plugin).

@Chi-teck
Copy link
Collaborator

Chi-teck commented Nov 22, 2024

To prevent memory leaks, or just memory piling up in general

@DieterHolvoet that may happen with your current implementation as well. Memory does not leak when a process is idle. It will likely leak when it does something (i.e. $queueWorker->processItem($item->data)). That's why Drush queue:run command has --items-limit option. In your case if the queue is large enough OOM killer will visit the worker process...

@DieterHolvoet
Copy link
Contributor

@DieterHolvoet I suspect this eats 100% of CPU.

Turns out you're right, this has been what's slowing down my server. Thanks for pointing that out. I'm going to look into the alternatives.

@DieterHolvoet
Copy link
Contributor

I have a suggestion for a way to solve this problem in a generic way, that could be added to Drush core: add a queue:run-all command that does the same as queue:run, but for all available queues instead of a specific one. It would take the same options, but without the argument. The limit options would count across all queues instead of a specific queue.

I also want to suggest to add a --memory-limit option that takes either a number that ends with g/m/k (see drush_memory_limit()) or a percentage. The command will run until that amount of memory is consumed. This option can be added to both queue:run and queue:run-all.

I also want to suggest to add a --daemon option that keeps the command running until items-limit, lease-limit or memory-limit is reached - if defined. Else, it keeps going indefinitely. This option can be added to both queue:run and queue:run-all.

@weitzman what do you think, is this something you would agree to add to Drush core? I think it's a good idea, except for the --daemon option it doesn't really introduce any new concepts, it just complements the commands that are already in Drush.

@weitzman
Copy link
Member

weitzman commented Jan 5, 2025

I don't see much value in a command that runs different queues. That's a bit different from what we have been discussing here.

As for a daemon, see the command that webflo linked. Would that work? I'm not too inclined to put that in core drush but if we had a queue maintainer I would consider it. We should look into the revolt lib that Drupal core adopted https://www.drupal.org/project/drupal/issues/3394423

@DieterHolvoet
Copy link
Contributor

I don't see much value in a command that runs different queues. That's a bit different from what we have been discussing here.

Ah, you're right, I didn't notice the issue is talking about processing a single queue. I guess that makes a queue:run-all command a separate discussion. I still think it's valuable though, I personally don't want to have to set up a separate worker per queue. I feel like it's only a matter of time before one will be forgotten. A queue:run-all command seems useful, just like the cron command processes all available cron jobs and queues.

As for a daemon, see the command that webflo linked. Would that work?

That one restarts every X seconds. With my proposal, there would be more control on how often the process is restarted.To keep it running until a certain level of memory usage is reached seems ideal to me. Also, it introduces a dependency on ReactPHP and I don't really see the added value of that here. My proposed implementation would be a lot simpler.

What do you think about the proposed --memory-limit option?

We should look into the revolt lib that Drupal core adopted

I personally have very little experience and even understanding of this kind of async PHP. I feel like we're still far away from being able to use this in Drupal, so maybe for now we could focus on something simpler that also gets the job done? I do think it's something interesting to take a look at in the future though.

@DieterHolvoet
Copy link
Contributor

I'm not too inclined to put that in core drush but if we had a queue maintainer I would consider it.

I would be interested in that. I work a lot with queues when doing project work for my agency, so I'm invested in improving those commands + it's not like the field commands take up that much of my time :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants