perf: Reduce allocations #20

brancz · 2023-09-20T12:08:28Z

We have some code that uses this library, and we noticed that it performs a fairly significant number of allocations (~30% overall and up to 70% in C++ demangling). Would you be open to adding a way to reuse objects and slices? I think it would require some API changes, or at the very least some new APIs.

Here's profiling data with the stacks up until the usage of this library anonymized: https://pprof.me/892d0e0/

ianlancetaylor · 2023-09-20T19:27:11Z

The approach taken for C++ demangling is definitely allocation heavy. It converts the name into an AST, and then generates the demangled name using the AST. This is done using a separate allocation for each AST node. So, yes, it adds up. It's also not trivial to fix, because each AST node is different. What is your overall use case?

That said, sure, I don't object to reusing objects and slices. Did you have a specific approach in mind?

brancz · 2023-09-21T19:15:21Z

> What is your overall use case?

We use this library during symbolization in the Parca continuous profiling project. Demangling happens at query time (the idea being that the minimum demangling is used by default but more complex type parameters etc. can be requested on demand). Being in the query path we try to optimize everything as much as possible so query times are low.

> Did you have a specific approach in mind?

I haven't looked at it deeper yet other than glancing over the profiling data above. My first thought just went to pooling, but I'd be happy to try any ideas you might have!
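To make the pooling idea concrete, here is a minimal sketch using `sync.Pool` for a single node type. The `nameNode` type and the `newName`/`releaseName` helpers are hypothetical stand-ins for illustration; they are not part of this library's API, and a real change would have to decide when a node is safe to release.

```go
package main

import (
	"fmt"
	"sync"
)

// nameNode is a hypothetical AST node type used only for illustration.
type nameNode struct {
	name string
}

// namePool recycles nameNode values so repeated demangling calls
// do not need a fresh allocation for every node.
var namePool = sync.Pool{
	New: func() any { return new(nameNode) },
}

// newName takes a node from the pool (or allocates one) and initializes it.
func newName(s string) *nameNode {
	n := namePool.Get().(*nameNode)
	n.name = s
	return n
}

// releaseName clears the node and returns it to the pool.
// The caller must not use n after this call.
func releaseName(n *nameNode) {
	n.name = ""
	namePool.Put(n)
}

func main() {
	n := newName("std::string")
	fmt.Println(n.name)
	releaseName(n)
}
```

The caveat with this shape is ownership: a node can only go back into the pool once nothing else references it, which is easier to guarantee for callers that only want the final string than for callers that keep the AST.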

ianlancetaylor · 2023-09-21T21:35:58Z

The problem with pooling in the current design is that the AST is composed of many different kinds of nodes; I think there are currently 62. So we would potentially need 62 pools. That said, I'm sure some node types are more common than others. It might be interesting to gather some stats on that.
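One way to gather such stats is to walk each AST and count nodes by their concrete type. The sketch below does this on a toy tree; the `node` interface and the `name`, `qualified`, and `pointerTo` types are hypothetical and only stand in for the library's real node types.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// node is a hypothetical minimal AST interface for this sketch.
type node interface {
	children() []node
}

type name struct{ s string }
type qualified struct{ scope, member node }
type pointerTo struct{ base node }

func (name) children() []node        { return nil }
func (q qualified) children() []node { return []node{q.scope, q.member} }
func (p pointerTo) children() []node { return []node{p.base} }

// countKinds walks the tree rooted at root and increments a counter
// for the concrete type name of every node it visits.
func countKinds(root node, counts map[string]int) {
	if root == nil {
		return
	}
	counts[reflect.TypeOf(root).Name()]++
	for _, c := range root.children() {
		countKinds(c, counts)
	}
}

func main() {
	// Toy tree shaped like a pointer to a qualified name.
	tree := pointerTo{base: qualified{scope: name{"a"}, member: name{"b"}}}

	counts := make(map[string]int)
	countKinds(tree, counts)

	kinds := make([]string, 0, len(counts))
	for k := range counts {
		kinds = append(kinds, k)
	}
	sort.Strings(kinds)
	for _, k := range kinds {
		fmt.Println(k, counts[k])
	}
}
```

Run over a representative corpus of mangled names, a histogram like this would show whether a handful of node types dominate, in which case only those would need pools.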

Zxilly · 2024-02-28T04:50:32Z

An easier implementation could use string interning, for example with go4.org/intern. With interning, each distinct string only needs to be allocated once.
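As a rough illustration of the interning idea, here is a minimal map-based interner. It is a stand-in written with the standard library only, not the go4.org/intern API, and the `interner` type and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// interner keeps one canonical copy of each distinct string.
// It is a hypothetical helper, safe for concurrent use.
type interner struct {
	mu sync.Mutex
	m  map[string]string
}

// intern returns the canonical copy of s, storing s on first sight.
func (in *interner) intern(s string) string {
	in.mu.Lock()
	defer in.mu.Unlock()
	if v, ok := in.m[s]; ok {
		return v
	}
	if in.m == nil {
		in.m = make(map[string]string)
	}
	in.m[s] = s
	return s
}

// size reports how many distinct strings are stored.
func (in *interner) size() int {
	in.mu.Lock()
	defer in.mu.Unlock()
	return len(in.m)
}

func main() {
	var in interner
	a := in.intern("std::vector<int>")
	b := in.intern("std::vector<int>")
	fmt.Println(a == b, in.size())
}
```

Interning like this deduplicates the strings that are retained; it does not by itself remove the per-node AST allocations made while a name is being demangled. An unbounded map also grows with the number of distinct strings, so a long-running service would likely want some eviction policy.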
