Support for caching #1142
fernandocamargoai asked this question in Ideas · Unanswered
Hello, guys.

First, I'd like to congratulate you on the 0.9.0 release. I really liked the discard() mechanism, and now I'm able to validate inputs with Pydantic easily.

I'd like to propose a new feature for this new API: the ability to return a value early. Right now, I have a cache implemented with Redis. But since there is no way to return a cached value early, when I'm processing a micro-batch and some of its requests have cached responses, I can avoid recomputing those requests, but their responses still have to wait until the whole micro-batch has been processed.

So my idea is that, similar to the current task.discard(), we would have a task.return() that takes a value. That value would be sent back to the caller immediately, instead of waiting for the whole micro-batch to be processed.
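To make this concrete, here is a rough sketch of how I imagine it would look with the 0.9 batch API. The Redis wiring, the cache_key helper, and run_inference are just for illustration, and since return is a reserved word in Python, the sketch spells the proposed method early_return():

```python
from typing import List

import redis

from bentoml import BentoService, api
from bentoml.adapters import JsonInput
from bentoml.types import InferenceTask, JsonSerializable

cache = redis.Redis()  # illustrative: a shared Redis instance


def cache_key(payload: JsonSerializable) -> str:
    # Naive key for illustration only; a real key should be a stable hash.
    return "predict:" + repr(payload)


class CachedService(BentoService):
    @api(input=JsonInput(), batch=True)
    def predict(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]):
        results = []
        for payload, task in zip(parsed_json_list, tasks):
            hit = cache.get(cache_key(payload))
            if hit is not None:
                # Proposed API (hypothetical name): respond to this request
                # right away with the cached value, the same way task.discard()
                # removes a request from the micro-batch today.
                task.early_return(hit.decode())
            else:
                result = self.run_inference(payload)
                cache.set(cache_key(payload), result)
                results.append(result)
        # Today, cache hits would also have to be appended to `results` and
        # wait here until the slowest request in the micro-batch is done.
        return results

    def run_inference(self, payload: JsonSerializable) -> str:
        # Placeholder for the actual model call.
        return "prediction"
```

With something like this, a request whose response is already cached would get its reply as soon as the cache lookup succeeds, while the rest of the micro-batch continues through the model as usual.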
What do you guys think?

Replies: 1 comment

Hi @fernandocamargoti - I think that is a great idea! It is a bit challenging to implement this with the current architecture, since every batch's input and output are transferred between the frontend batching layer and the model backend in one HTTP request. But @bojiang and I have been discussing moving this to a […]