Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client side caching may be overriding cache invalidation #67

Open
YPCrumble opened this issue Mar 7, 2022 · 5 comments
Open

Client side caching may be overriding cache invalidation #67

YPCrumble opened this issue Mar 7, 2022 · 5 comments

Comments

@YPCrumble
Copy link
Contributor

I haven't been able to fully research this as it's quite complex and I'm not an expert on caching headers, but plan to in the future and want to write it down to make sure I'm not on a completely wrong path.

What I'm experiencing is that browsers that are accessing pages that are cached and invalidated using django-fancy-cache are still pulling a resource (from disk cache) on chrome. Hard refresh of the browser solves the issue which makes me believe this is an issue with browser caching results to disk.

Question - should items cached and invalided by fancy-cache include a no-cache; no-store header? Otherwise is there a good way to make sure that the browser doesn't pull a stale response from its local cache?

I've looked into Etag and I suspect that might be a good alternative; as far as I can tell this would allow cache invalidation based on the hashed value of the response.

@YPCrumble
Copy link
Contributor Author

An idea I'm playing with is to adjust the cache_page decorator like so to include django's etag functionality:

def fancy_etag_func(request):
    """
    Helper function to set the value of the Etag header to the timestamp
    of fancy-cache's rememebered URL (if available).

    Otherwise sets the etag value to a random string
    """
    remembered_urls = cache.get(REMEMBERED_URLS_KEY, {})
    url = request.get_full_path()

    remembered_url = remembered_urls[url]

    if remembered_url is None:
        # URL has been re-cached or purged; return a new response.
        return str(hash(time.time()))

    # URL has not been modified or purged; return the hash of the timestamp
    # which (if it matches match the client's Etag header) will allow
    # the client to re-use its cached value.
    return str(hash(remembered_url[1]))


def cache_page(timeout: float, *args, **kwargs) -> typing.Callable:
    cache_alias = kwargs.pop("cache", None)
    dec = decorator_from_middleware_with_args(FancyCacheMiddleware)(
        timeout, *args, cache_alias=cache_alias, **kwargs
    )
    if settings.FANCY_INCLUDE_FANCY_ETAG:
        return etag(fancy_etag_func)(dec)
    return dec

The biggest problem here is that etag doesn't work for Django Rest Framework - so perhaps this would need another setting like FANCY_USE_REST_FRAMEWORK_ETAG from a repo like this one: https://github.com/jozo/django-rest-framework-condition

A second option would be to change return etag(fancy_etag_func)(dec) to return never_cache(dec) so that it simply prevents the client from caching the response.

@YPCrumble
Copy link
Contributor Author

@peterbe thoughts on the FANCY_USE_REST_FRAMEWORK_ETAG setting I mentioned above? My current solution is to use never_cache on all endpoints I'm using fancy-cache for...and I suppose incorporating etag would reduce the number of times that clients are asking for the cached response itself.

See https://docs.djangoproject.com/en/4.0/topics/conditional-view-processing/ for discussion of the Etag header.

@peterbe
Copy link
Owner

peterbe commented May 5, 2022

Question - should items cached and invalided by fancy-cache include a no-cache; no-store header? Otherwise is there a good way to make sure that the browser doesn't pull a stale response from its local cache?

No, I don't think this library should make such connections.

Client-side caching is very different from server-side caching. It's not inconceivable that you want the server to cache but the client to not cache it.

Pardon my ignorance, but aren't the cache-control related decorators in stock Django good enough to do things like:

from fancy_cache import cache_page
from django.views.decorators.cache import never_cache

@never_cache
@cache_page(60 * 60)
def myview(request):
    return render(request, 'page1.html')

That would force the client to keep coming back for the latest and freshest but at the same time the server would be able to process that request fast since it's cached.

@YPCrumble
Copy link
Contributor Author

@peterbe makes sense - and this is exactly what I'm currently doing everywhere I'm using fancy cache's cache_page decorator.

What I was considering was whether A/ many users would want this functionality built-in with a setting, which makes sure they don't forget to add the never_cache decorator, or B/ whether it would be possible to use an etag function that would further improve response times from the API, in that the cached response wouldn't need to be sent if the entry in the fancy_urls dict were found with the same etag value.

This seems like a small optimization though so I'm going to close this.

@YPCrumble
Copy link
Contributor Author

YPCrumble commented Jun 22, 2022

@peterbe reopening because I just realized that the latest version of our middleware, which follows the Django cache middleware, does not cache the response if the Cache-Control: private header is set.

the @never_cache decorator began adding this header in Django 3.2, so simply using the @never_cache decorator out of the box doesn't work. For now I'm just going to make my own version of the header to remove the private header, but I could still see a use case for adding a setting like FANCY_ADD_NEVER_CACHE_HEADERS that would skip this check, like so:

if settings.FANCY_ADD_NEVER_CACHE_HEADERS is False:
    # Don't cache a response with 'Cache-Control: private'
    if "private" in response.get("Cache-Control", ()):
        return response

...and this would also add the never_cache headers elsewhere in the middleware so I don't need to make my special version of never_cache.

A second alternative is I could add my modified version of @never_cache to this codebase as @fancy_never_cache and document what I'm doing. EDIT: Actually I would prefer to keep the private header as done in Django so this doesn't really make sense.

None of this is absolutely necessary but thought to put this here in case anyone else encounters the issue in the future.

@YPCrumble YPCrumble reopened this Jun 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants