This repository has been archived by the owner on May 28, 2024. It is now read-only.
Thank you for the great package. I'm interested in hosting an LLM on GKE.
For our existing ML applications, we usually implement a queue-worker system (e.g. redis-queue or Celery with a Redis broker) to handle long-running background tasks. Does ray-llm have a similar feature implemented under the hood, or do I need to set it up myself?
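For context, here is a rough sketch of the kind of queue-worker setup we use today with RQ and Redis. The task function, queue name, and `/generate` endpoint are just placeholders for our own model server, not anything from ray-llm:

```python
# tasks.py -- hypothetical worker task; the endpoint URL is a placeholder.
import requests

def run_llm_inference(prompt: str) -> str:
    # Long-running call to whatever model server sits behind the queue.
    resp = requests.post(
        "http://model-server:8000/generate",
        json={"prompt": prompt},
        timeout=600,
    )
    resp.raise_for_status()
    return resp.json()["text"]
```

```python
# enqueue.py -- the API process only enqueues; a separate
# `rq worker llm-jobs` process executes jobs in the background.
from redis import Redis
from rq import Queue

from tasks import run_llm_inference

queue = Queue("llm-jobs", connection=Redis(host="redis", port=6379))
job = queue.enqueue(run_llm_inference, "Summarize this document.", job_timeout=900)
print(job.id)  # Clients poll job.get_status() / job.result later.
```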
@sihanwang41 Thank you for your reply. I saw there is an RFC related to integrating a queuing system into Ray Serve: ray-project/ray#32292. So I was wondering whether that is something Ray-LLM would consider supporting, especially since LLM inference usually takes a long time to run.
In the meantime, we can set up the queuing system ourselves.
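As a minimal sketch of what we would set up ourselves, something like Celery with a Redis broker in front of the serving endpoint; the broker settings, task name, and `/generate` URL below are assumptions on our side, not part of ray-llm:

```python
# llm_tasks.py -- hypothetical Celery task calling the model's HTTP endpoint.
import requests
from celery import Celery

app = Celery(
    "llm_tasks",
    broker="redis://redis:6379/0",
    backend="redis://redis:6379/1",
)

@app.task(bind=True, max_retries=2)
def generate(self, prompt: str) -> str:
    try:
        # Long-running inference request against the serving endpoint.
        resp = requests.post(
            "http://model-server:8000/generate",
            json={"prompt": prompt},
            timeout=600,
        )
        resp.raise_for_status()
        return resp.json()["text"]
    except requests.RequestException as exc:
        # Retry transient failures (e.g. a replica restarting).
        raise self.retry(exc=exc, countdown=10)

# Callers submit work asynchronously, e.g. generate.delay("Write a haiku"),
# and poll the returned AsyncResult for completion.
```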