redis fsspec cache #1498
-
That's great, thank you for sharing. You would be welcome to propose this for inclusion in fsspec or as a separate package in the fsspec org (or not), whatever you feel is right. There has been a long-standing intent to make file/block caching in fsspec better and more flexible, and this shows a nice example of the possibilities. It is a pity, for example, that you needed to write both a filesystem interface and the cacher class too (although I suppose it's possible to use only the latter). Actually, a general-purpose redisFS would be pretty useful too. I could not immediately tell: is the implementation here supposed to allow for async or batch requests?
Isn't this surprising? While the work here may be very useful for lots of people, I would expect xpublish, which is all REST API, to work well for async IO.
-
I would like this to work for async tasks, but I have not tested it as such. The cache methods are definitely sync, and I couldn't tell whether there is a way to make the cache implementation methods async. Is that possible? Certainly the Redis I/O should be non-blocking if possible.
I think I can explain this better... Xpublish itself supports building async routes without issue, but Xarray does not support async methods except when a distributed dask array is being used. I am not using dask distributed, so all the xarray functionality behind xpublish is sync. This means that all xarray functionality is blocking, so this cache allows for having a whole bunch of blocking workers with a shared cache instead of only a few non-blocking async workers. Does that help explain a bit?
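For anyone wondering what calling sync code from an async route could look like, here is a minimal sketch using only the standard library. It does not make xarray or the cache async; it only hands the blocking call to a worker thread so the event loop stays free. `blocking_load` and `handle_request` are hypothetical names for illustration, not part of xpublish, xarray, or fsspec.

```python
import asyncio
import time


def blocking_load(chunk_id: int) -> str:
    """Hypothetical stand-in for a synchronous xarray/fsspec read."""
    time.sleep(0.05)  # simulate blocking I/O
    return f"chunk-{chunk_id}"


async def handle_request(chunk_id: int) -> str:
    # asyncio.to_thread (Python 3.9+) runs the sync function in a worker
    # thread and returns an awaitable, so the event loop is not blocked.
    return await asyncio.to_thread(blocking_load, chunk_id)


async def main() -> list:
    # Three requests are served concurrently even though the loader is sync.
    return await asyncio.gather(*(handle_request(i) for i in range(3)))


results = asyncio.run(main())
print(results)
```

Whether this helps in practice depends on how much of the work releases the GIL, which is one reason adding more blocking workers with a shared cache can still be the simpler route.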
-
Hello! I built a simple prototype fs cache using redis as a proof of concept: https://github.com/mpiannucci/redis-fsspec-cache
The reason for building this is that we are running xpublish, which does not support async for IO-bound tasks. This means that to scale out, we have to add more workers horizontally. Using Redis as our fsspec cache allows us to scale up while sharing recently loaded chunks or blocks close to the compute to minimize latency.
I am not sure whether there are implementation details I am missing to make this perfect, but it is working well for me, so I wanted to share!
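To illustrate the idea of several workers sharing recently loaded blocks, here is a rough sketch. It is not the code from the linked repository and does not use fsspec's own cache classes: `SharedBlockCache`, `InMemoryStore`, and `slow_fetch` are hypothetical names, and the in-memory store only stands in for a Redis client so the example runs without a server. The sketch assumes the store offers `get(name)` and `set(name, value)`, which is the shape redis-py's `Redis` client provides.

```python
import hashlib
from typing import Callable, Dict, List, Optional, Tuple


class InMemoryStore:
    """Stand-in for a Redis client; only get/set are used by the sketch."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, name: str) -> Optional[bytes]:
        return self._data.get(name)

    def set(self, name: str, value: bytes) -> bool:
        self._data[name] = value
        return True


class SharedBlockCache:
    """Hypothetical cache of fixed-size blocks keyed by path and block index."""

    def __init__(self, store, fetcher: Callable[[str, int, int], bytes],
                 blocksize: int = 4) -> None:
        self.store = store          # shared key-value store (e.g. Redis)
        self.fetcher = fetcher      # fetcher(path, start, end) -> bytes
        self.blocksize = blocksize

    def _key(self, path: str, block: int) -> str:
        # Hash the path so arbitrary paths give well-formed keys.
        digest = hashlib.sha256(path.encode()).hexdigest()
        return f"fsspec-block:{digest}:{self.blocksize}:{block}"

    def read_block(self, path: str, block: int) -> bytes:
        key = self._key(path, block)
        cached = self.store.get(key)
        if cached is not None:
            return cached           # hit: another worker already loaded it
        start = block * self.blocksize
        data = self.fetcher(path, start, start + self.blocksize)
        self.store.set(key, data)   # publish the block for other workers
        return data


REMOTE = {"bucket/data.bin": b"0123456789"}
calls: List[Tuple[str, int, int]] = []


def slow_fetch(path: str, start: int, end: int) -> bytes:
    """Pretend remote read that records how often it is called."""
    calls.append((path, start, end))
    return REMOTE[path][start:end]


store = InMemoryStore()
worker_a = SharedBlockCache(store, slow_fetch)
worker_b = SharedBlockCache(store, slow_fetch)

first = worker_a.read_block("bucket/data.bin", 1)   # miss, goes to "remote"
second = worker_b.read_block("bucket/data.bin", 1)  # hit via the shared store
print(first, second, len(calls))
```

With a real deployment the `store` would presumably be a Redis client shared by all workers, and things like expiry and eviction would need to be decided on the Redis side.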