Disable all caching #1206

banesullivan · 2023-06-14T05:14:51Z

There are use cases for local computation where caching isn't desired. Caching is excellent for stateless environments where multiple users may be requesting tiles/thumbnails of the same image repeatedly or even in local environments when dealing with non-pre-tiled datasets. However, caching introduces unexpected results and stale data in local, stateful environments (like me as a single user in a Jupyter Notebook).

For example, I want to experiment interactively with an algorithm that produces a new raster called "ndvi.tif". I want to toy with the algorithm: tinker with the computation, re-run it, overwrite the existing data, and visualize the result with large_image to see how my changes affected the result.

This scenario wouldn't be a problem if I set up some temporary file mechanism to save a new temp file for every computation, and in effect have large_image open a new file on each iteration. But I don't want to do this. I don't want to save tons of new files. I want one working file.

The trouble is that large_image caches the tile source, so this workflow isn't possible.

To simplify testing this, I have two versions of the same image, ndvi-09.tif and ndvi-11.tif. I save each in turn as ndvi.tif and attempt to reload it with large_image.

`ndvi-09.tif`	`ndvi-11.tif`

The solution given in #985 isn't sufficient, as there is still some caching going on that prevents large_image from re-opening the same path as a new tile source.

import large_image
large_image.config.setConfig("cache_tilesource_maximum", 1)
large_image.config.setConfig("cache_python_memory_portion", 1_000_000_000)

!rm ndvi.tif
!cp ndvi-09.tif ndvi.tif

large_image.open('ndvi.tif')

!rm ndvi.tif
!cp ndvi-11.tif ndvi.tif

large_image.open('ndvi.tif')

Uh oh! That thumbnail above isn't right! It's using the cached tile source even though I restricted the cache constraints.

banesullivan · 2023-06-14T05:59:47Z

I'm wondering if it would be easiest to implement a "dummy cache" that always misses, which users could opt in to.
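To make the "dummy cache" idea concrete, here is a minimal sketch of a mapping that accepts writes but never stores anything, so every lookup is a miss. The names `AlwaysMissCache` and `cached_call` are hypothetical and are not part of large_image; how such an object would be plugged into large_image's cache machinery is not shown and would need to be worked out.

```python
from collections.abc import MutableMapping


class AlwaysMissCache(MutableMapping):
    """Hypothetical cache stand-in: writes are discarded, reads always miss."""

    def __getitem__(self, key):
        # Nothing is ever stored, so every lookup is a miss.
        raise KeyError(key)

    def __setitem__(self, key, value):
        # Silently discard the value instead of storing it.
        pass

    def __delitem__(self, key):
        raise KeyError(key)

    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0


def cached_call(cache, key, compute):
    """Typical cache lookup pattern: return the cached value or compute it."""
    try:
        return cache[key]
    except KeyError:
        value = compute()
        cache[key] = value
        return value
```

With this stand-in, code written against the usual lookup pattern keeps working unchanged, but `compute` runs on every call because nothing is retained.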

manthey · 2023-06-14T12:54:07Z

Here is a hacky way to do it:
Immediately after importing large_image, do `large_image.cache_util.cache.LruCacheMetaclass.__call__ = lambda x, *a, **b: large_image.cache_util.cachesClear() or type.__call__(x, *a, **b)`
This disables the tile source cache AND clears the tile cache whenever a new tile source is created. We probably don't want to get rid of the tile cache entirely (but I could be wrong), as then asking for a thumbnail and then tiles would be less performant. But, since reopening a file that is now different still results in the same cache keys for tiles, we need to either have cache keys that are more dependent on file properties (or maybe use a uuid4 for the source) or flush the tile cache for that file. We don't expose any way to flush the tile cache on a per-file basis, so flushing the entire tile cache is a compromise.
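As an illustration of the two key strategies mentioned above, here is a rough sketch. It is not how large_image builds its cache keys; `file_cache_key` and `unique_source_key` are hypothetical helpers, and using modification time plus size as the "file properties" is an assumption.

```python
import os
import uuid


def file_cache_key(path):
    """Hypothetical key that changes when the file on disk changes.

    Combines the absolute path with the modification time and size, so
    overwriting the file in place yields a different key.
    """
    st = os.stat(path)
    return (os.path.abspath(path), st.st_mtime_ns, st.st_size)


def unique_source_key(path):
    """Hypothetical key that is unique per opened source (the uuid4 idea).

    Every call returns a new key, so tiles are never shared between two
    opens of the same path.
    """
    return (os.path.abspath(path), uuid.uuid4().hex)
```

The first variant still lets repeated opens of an unchanged file share cached tiles. The second never shares tiles across opens, which is simpler but gives up that reuse.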

manthey · 2023-06-14T13:14:10Z

A better hack is probably:
`large_image.cache_util.cache.LruCacheMetaclass.__setitem__ = lambda *a, **b: large_image.cache_util.cachesClear()`, as it doesn't break pickling sources.

manthey · 2023-06-14T13:17:36Z

> A better hack is probably: `large_image.cache_util.cache.LruCacheMetaclass.__setitem__ = lambda *a, **b: large_image.cache_util.cachesClear()` as it doesn't break pickling sources

Actually, this won't work right -- we'd have to override the __setitem__ of the cache, not the cache metaclass.

banesullivan · 2023-06-14T13:20:42Z

The first approach fixes the problem for me!

manthey · 2023-06-14T13:23:11Z

It will break pickling the tile source, so we still might want to define a config value to do this more correctly.

When opening a tile source, pass `noCache=True`. In this mode, the tile source can directly have its style modified (e.g., `source.style = <new value>`). This is also used when importing images into girder to avoid flushing the cache of tile sources that are in active use. This closes #1294. This closes #1145. There is a config value `cache_sources` that, if False, makes `noCache` default to True. This closes #1206.

manthey · 2023-09-13T18:50:59Z

When #1296 is merged, this can be accomplished with a config setting. This will NOT break pickling the tile source.
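A short sketch of how the workflow from the original report might look with these options, assuming a large_image version that includes #1296. The helper names `open_uncached` and `disable_source_caching` are hypothetical. The `noCache` argument and the `cache_sources` config value are the ones described in this thread, and large_image is imported lazily so the helpers can be defined even where it is not installed.

```python
def open_uncached(path):
    """Open `path` as a fresh tile source, bypassing the tile source cache."""
    import large_image  # assumes a release that includes #1296

    return large_image.open(path, noCache=True)


def disable_source_caching():
    """Make uncached opening the default for this session."""
    import large_image  # assumes a release that includes #1296

    large_image.config.setConfig('cache_sources', False)
```

In a notebook, calling `disable_source_caching()` once after import should let a plain `large_image.open('ndvi.tif')` pick up the overwritten file on each iteration. Passing `noCache=True` per call does the same for a single open without changing the global default.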

manthey mentioned this issue Sep 13, 2023

Don't add large_image files to the source cache on girder import #1294

Closed

manthey mentioned this issue Sep 13, 2023

Add an option to not cache sources. #1296

Merged

manthey closed this as completed in #1296 Sep 13, 2023
