InflateState::new_boxed() uses large stack allocation #155
Comments
Yeah, as far as I know Rust still lacks a way to directly heap-allocate a struct that is guaranteed to avoid the stack even in debug mode without resorting to unsafe (but feel free to correct me if I'm wrong). I think it may be possible to allocate the array buffers separately and keep the static length information now that const generics are in the language, though I don't know if that's optimal performance-wise.

The reason for using default in that function was that it used to use the former box keyword internally, which did a direct heap allocation, but that was removed a long time ago. Back then the compiler also struggled to optimize out the allocation in release mode so using

If the compiler can optimize out the stack allocation on one platform, it very likely can on other platforms too though, as LLVM does much of the optimization on LLVM intermediate representation, which will be mostly the same between platforms (at least with portable code like this that doesn't interface with any system stuff), before it's finally converted down to machine code.

It should also be doable to reduce the size of the compressor and decompressor structs a tiny bit, as right now the Huffman tables in the structs used for the distances and Huffman length codes are way larger than they need to be for code simplicity, something inherited from the original miniz.
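For illustration, the idea of allocating the array buffers separately while keeping their static length could look roughly like the sketch below. The names `boxed_array`, `SplitState`, `DICT_LEN` and `TABLE_LEN` are hypothetical and the field layout is not the real `InflateState`; it only relies on the standard `TryFrom<Box<[T]>>` conversion to `Box<[T; N]>`.

```rust
use std::convert::TryFrom;

// Hypothetical buffer sizes, chosen only for this sketch.
const DICT_LEN: usize = 32 * 1024;
const TABLE_LEN: usize = 1024;

/// Build a `[T; N]` on the heap without unsafe code: the elements are
/// written into a Vec's heap buffer, then the boxed slice is converted
/// into a boxed array of statically known length.
fn boxed_array<T: Clone + Default, const N: usize>() -> Box<[T; N]> {
    let slice: Box<[T]> = vec![T::default(); N].into_boxed_slice();
    match Box::<[T; N]>::try_from(slice) {
        Ok(array) => array,
        // The Vec was created with exactly N elements, so this cannot fail.
        Err(_) => unreachable!("length mismatch"),
    }
}

/// The struct itself stays a few words in size; the large buffers live
/// behind separate heap allocations but keep their length in the type.
struct SplitState {
    dict: Box<[u8; DICT_LEN]>,
    table: Box<[i16; TABLE_LEN]>,
    pos: usize,
}

impl SplitState {
    fn new_boxed() -> Box<SplitState> {
        Box::new(SplitState {
            dict: boxed_array(),
            table: boxed_array(),
            pos: 0,
        })
    }
}
```

The trade-off is one extra allocation and one extra pointer indirection per buffer; whether that matters for performance here is not something this sketch measures.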
One could add |
I noticed a comment in the `InflateState::new()` function:

The size of the struct is about 43KB. Probably not a huge issue for Windows or Linux desktops where you have 1MB+ of stack space, but maybe an issue for embedded platforms.

The solution seems to be to use the `new_boxed()` function. However, this results in a temporary stack allocation of the entire size of the `InflateState` struct regardless. The function calls `Box::default()` to do the heap allocation, which has a definition of

This first constructs `InflateState` on the stack and then moves it to the heap. I was able to confirm this by using `rust-gdb` to examine the assembly code. I will note that a release build on x86_64 Linux appears to optimize this to first allocate the memory and then initialize directly on the heap, but a debug build does not. Other platforms also may not have this optimization.

The only solutions I can think of off the top of my head are unsafe code doing manual allocation or moving some of the larger fixed-size arrays to `Vec`. Neither is really ideal, since it seems to be a goal of this project to not use unsafe, and `Vec` would result in some additional overhead. Still, I figured I would raise this issue in case someone has a better idea.