-
Notifications
You must be signed in to change notification settings - Fork 152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Slow startup with many static files #263
Comments
Thanks, this is an interesting question. The short answer is: no, you're not using the tools incorrectly and you're not missing any config options; this is just a situation I didn't have in mind when I originally designed Whitenoise. In most deployments there's a relatively limited number of static files and they live alongside the rest of the application code which is usually fast to access. A lazy initialisation option wouldn't actually solve the fundamental problem with autorefresh mode which isn't the overhead it adds to serving static files (an extra stat isn't the end of the world here) but the overhead it adds to every other request to the application where it has to hit the disk to confirm that there isn't a corresponding static file. Because Whitenoise needs to be able to serve files from arbitrary URLs (e.g. Probably the best solution here would be to pre-generate and cache the |
Thanks for the thoughtful response @evansd. Do you think a build/deployment-time-regenerated and initialisation-time-loaded pickle file cache is something you'd want to support in the main project? We'll probably head down the path you suggest, but wouldn't bother with a full PR to the upstream project if that's not a direction you'd want to head. |
I wouldn't be totally opposed to adding support for it, but it would depend on how much complexity it adds and, specifically, how well-isolated that complexity is from everything else. This feels like one of those situations where adding basic support which works but with a few rough edges would be relatively easy, whereas getting something really solid which handles all the edge cases would be a lot more work. I'd certainly be happy to reference your use case in the documentation though and link to any code you have. I'd say, get something that works for you by subclassing the middleware and then we can decide from there whether it looks worth trying to include in the main project. |
I would love to piggyback on a build/deployment-time-regenerated and initialisation-time-loaded pickle file cache as a mode that WhiteNoise supports 👀 . |
I've put together an approach that seems to work in a Gist. I put the pickle generation logic in a new Storage class to piggy-back off The inheritance chain wasn't totally clean:
Hope you find it useful. Feel free to take what I wrote as a starting point. Feedback welcome, and let me know if there are things I can do to make it easier to incorporate into or at least work with the main project. |
Just wanted to mention that I am interested in finding a solution to this issue. I am the maintainer of Tutor, a wrapper distribution around Open edX; Open edX is a very complex Django project which relies on many static files. With Tutor we made the decision a couple releases ago to use Whitenoise for serving static assets, and we've been happy with it. The only issue is that it takes a few seconds before the gunicorn server is actually able to serve requests because of the file caching in Whitenoise -- not a deal-breaker, but we could definitely use a build-time cache generation step when building our Docker images. |
I am also hitting a similar issue as described by the OP using EFS with lots of static files |
@thenewguy We aren't (yet) using my suggested code in production, just test environments, though it is still our intention to use it in production. Our experience has been that the resource overhead of setting |
I looked at this again and had an idea that I think solves all the points with very little code written against the Django middleware:
-- edit -- I've started an implementation here: master...thenewguy:lazy Some simple static serving tests in my local project work. Will continue to tinker. I intend to use the storage manifest to determining the return value from @evansd Any thoughts on this approach? -- edit 2 -- I've started PR #275. Awaiting feedback to write tests and docs |
Just wanted to leave an update that I've been running with the working concept in PR #275 for a little bit now and it seems to work well |
If anyone wants to PR ServeStatic, I would support merging any PR that adds a management command to pre-populate the known files via the Django manifest. EDIT: Ended up PRing this myself Archmonger/ServeStatic#22 |
When multiple Django WSGI processes start in parallel accessing thousands of static files via network file system, it can take quite a while to seed the
WhiteNoise.files
static files cache, during which time the Django app can't yet service inbound requests. (For example, we have server instances with 6 Django processes running Whitenoise 4.1.3 accessing ~4,000 static files via EFS (AWS-flavored NFS) which takes roughly 45 seconds to finish scanning the staticfiles dirs and begin accepting requests.)I see the
WHITENOISE_AUTOREFRESH
setting to skip the cache initialization, but that's labeled not for production use, and clearly adds ongoing overhead, as it stats every file at request time.I wonder if there's another configuration option we're missing, if we're using the tools incorrectly, or if you've considered adding another mode, where the
WhiteNoise.files
cache is lazy loaded incrementally as each file is requested?The text was updated successfully, but these errors were encountered: