Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Reduce allocations #20

Open
brancz opened this issue Sep 20, 2023 · 4 comments
Open

perf: Reduce allocations #20

brancz opened this issue Sep 20, 2023 · 4 comments

Comments

@brancz
Copy link

brancz commented Sep 20, 2023

We have some code that uses this library, and we noticed a fairly significant amount of allocations being done (~30% overall and up to 70% in C++ demangling). Would you be up for adding the ability to allow reusing objects and slices? I do think it would require some API changes, or at the very least some new APIs.

Here's profiling data with the stacks up until the usage of this library anonymized: https://pprof.me/892d0e0/

@ianlancetaylor
Copy link
Owner

The approach taken for C++ demangling is definitely allocation heavy. It converts the name into an AST, and then generates the demangled name using the AST. This is done using a separate allocation for each AST node. So, yes, it adds up. It's also not trivial to fix, because each AST node is different. What is your overall use case?

That said, sure, I don't object to reusing objects and slices. Did you have a specific approach in mind?

@brancz
Copy link
Author

brancz commented Sep 21, 2023

What is your overall use case?

We use this library during symbolization in the Parca continuous profiling project. Demangling happens at query time (the idea being that the minimum demangling is used by default but more complex type parameters etc. can be requested on demand). Being in the query path we try to optimize everything as much as possible so query times are low.

Did you have a specific approach in mind?

I haven't looked at it deeper yet other than glancing over the profiling data above. My first thought just went to pooling, but I'd be happy to try any ideas you might have!

@ianlancetaylor
Copy link
Owner

The problem with pooling with the current design is that the AST is composed of many different kinds of nodes, currently I think 62 nodes. So we would potentially need 62 pools. That said I'm sure some node types are more common than others. It might be interesting to gather some stats on that.

@Zxilly
Copy link

Zxilly commented Feb 28, 2024

An easier implement can be using string intern. For eaxmple, with go4.org/intern, after applying this the same string will only need to be allocated one time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants