Support for caching #1142
fernandocamargoai asked this question in Ideas · Unanswered
Hello, guys.

First, I'd like to congratulate you on the 0.9.0 release. I really liked the discard() mechanism, and now I'm able to validate inputs with Pydantic easily.

I'd like to propose a new feature for this new API: the ability to return a value early. Right now, I have a cache implemented with Redis. But since there is no way to return a cached value early, when I'm processing a micro-batch and some of its requests have cached responses, I can avoid recomputing those requests, but their responses still have to wait until the whole micro-batch has been processed.

So my idea is that, similar to the current task.discard(), we would have a task.return() that takes a value. That value would be sent back to the caller immediately, instead of waiting for the whole micro-batch to be processed.
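To make this concrete, here is a rough sketch of how I imagine it would look with the 0.9 batch API. The Redis wiring, the cache_key helper, and run_inference are just for illustration, and since return is a reserved word in Python, the sketch spells the proposed method early_return():

```python
from typing import List

import redis

from bentoml import BentoService, api
from bentoml.adapters import JsonInput
from bentoml.types import InferenceTask, JsonSerializable

cache = redis.Redis()  # illustrative: a shared Redis instance


def cache_key(payload: JsonSerializable) -> str:
    # Naive key for illustration only; a real key should be a stable hash.
    return "predict:" + repr(payload)


class CachedService(BentoService):
    @api(input=JsonInput(), batch=True)
    def predict(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]):
        results = []
        for payload, task in zip(parsed_json_list, tasks):
            hit = cache.get(cache_key(payload))
            if hit is not None:
                # Proposed API (hypothetical name): respond to this request
                # right away with the cached value, the same way task.discard()
                # removes a request from the micro-batch today.
                task.early_return(hit.decode())
            else:
                result = self.run_inference(payload)
                cache.set(cache_key(payload), result)
                results.append(result)
        # Today, cache hits would also have to be appended to `results` and
        # wait here until the slowest request in the micro-batch is done.
        return results

    def run_inference(self, payload: JsonSerializable) -> str:
        # Placeholder for the actual model call.
        return "prediction"
```

With something like this, a request whose response is already cached would get its reply as soon as the cache lookup succeeds, while the rest of the micro-batch continues through the model as usual.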
What do you guys think?

Replies: 1 comment

Hi @fernandocamargoti - I think that is a great idea! It is a bit challenging to implement this with the current architecture, since every batch's input and output are transferred between the frontend batching layer and the model backend in one HTTP request. But @bojiang and I have been discussing moving this to a […]