Improve performance of Java agent #134
base: develop
public static String from(String signature) {
    return Murmur.from(signature);
Changing the hash algorithm is a big task that requires migrating DB data.
It's better to keep md5 as it is.
Good point, I forgot to take backwards compatibility into consideration.
How would you feel if I made it a command line option instead (retaining MD5 as the default)? I can extract this change into a separate PR.
Yes, it seems like a good idea to default to md5 and make the algorithm an option that the user can configure.
cc. @kojandy @sohyun-ku
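A minimal sketch of the idea discussed above: an opt-in hash algorithm with MD5 kept as the default. Every name here (`HashAlgorithm`, `fromConfig`, the `scavenger.hash-algorithm` property) is hypothetical and not part of the agent's actual API. SHA-256 is used only as a stand-in for an alternative algorithm, because the JDK ships no Murmur implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical names throughout; this is not the agent's real configuration API.
enum HashAlgorithm {
    MD5("MD5"),
    SHA_256("SHA-256"); // stand-in for any alternative algorithm (e.g. Murmur)

    private final String jcaName;

    HashAlgorithm(String jcaName) {
        this.jcaName = jcaName;
    }

    /** Resolves the configured value, falling back to MD5 when it is unset or blank. */
    static HashAlgorithm fromConfig(String value) {
        if (value == null || value.trim().isEmpty()) {
            return MD5;
        }
        return valueOf(value.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
    }

    /** Returns the lowercase hex digest of the UTF-8 encoded signature. */
    String hash(String signature) {
        try {
            MessageDigest digest = MessageDigest.getInstance(jcaName);
            byte[] bytes = digest.digest(signature.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class HashOptionDemo {
    public static void main(String[] args) {
        // "scavenger.hash-algorithm" is a made-up property name used for illustration.
        HashAlgorithm algorithm =
                HashAlgorithm.fromConfig(System.getProperty("scavenger.hash-algorithm"));
        System.out.println(algorithm + " " + algorithm.hash("com.example.Foo.bar()"));
    }
}
```

With no option set, existing installations would keep producing MD5 hashes, so no stored data would need migrating unless a user opts in.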
First, thank you for contributing such a great PR.
An agent is characterized by the fact that it uses the client's resources.
If the scavenger agent uses more of the user's resources (CPU, memory, etc.) than before, I think that is a bigger issue than the performance improvement.
Is it possible to measure this as well?
(It doesn't seem to be an issue from the looks of it).
cc. @sohyun-ku @kojandy
I'll see what I can do :)
Hi @sohyun-ku, @taeyeon-Kim and others 👋🏾
I've been working on some performance optimisations for the Java agent. I'd be keen to get your thoughts on these, and to hear whether you think they are worthwhile.
The optimisations in this PR consist of the following:
- MethodRegistry#extractSignature algorithm optimisation
- ConcurrentHashMap (rather than LinkedBlockingQueue) and remove worker thread
I've also made the following additional changes:
- InvocationTracker and Scheduler
- SchedulerTest unit tests to only test Scheduler (unblocked by 6.)
These changes can mostly be reviewed commit by commit, but I can also split them into multiple PRs if it would be easier to review.
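To illustrate the general idea of recording invocations in a ConcurrentHashMap-backed set instead of handing them to a worker thread through a LinkedBlockingQueue, here is a minimal sketch. The class `InvocationBufferSketch` and its methods are hypothetical and are not the implementation in this PR.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class; it illustrates the idea only, not the PR's actual code.
final class InvocationBufferSketch {
    // Concurrent set backed by a ConcurrentHashMap; duplicates are absorbed by the set.
    private volatile Set<String> buffer = ConcurrentHashMap.newKeySet();

    /** Called on the application thread: no queue hand-off and no worker thread. */
    void register(String hash) {
        buffer.add(hash);
    }

    /**
     * Called by the publisher: swaps in an empty buffer and returns the old contents.
     * Caveat: a register() call racing with this swap may still write to the old set
     * after it has been read, so a real implementation has to account for that.
     */
    Set<String> drain() {
        Set<String> drained = buffer;
        buffer = ConcurrentHashMap.newKeySet();
        return drained;
    }
}

final class InvocationBufferDemo {
    public static void main(String[] args) {
        InvocationBufferSketch sketch = new InvocationBufferSketch();
        sketch.register("a");
        sketch.register("a"); // already in the buffer, so the set is unchanged
        sketch.register("b");
        System.out.println(sketch.drain().size());
    }
}
```

In this sketch the cost of registering is paid directly on the calling thread, which is why the latency of the add itself matters.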
Whilst developing and concluding the implementation of these optimisations, I ran some JMH benchmarks on my local machine to verify and measure the performance improvements.
Whilst I have limited experience writing and running such benchmarks, I took some care to avoid the most common pitfalls. I collected these results using Java 21 and the compiler blackhole configuration. In saying that, there are some discrepancies that I wasn't able to completely explain. So whilst I'm not completely confident in the absolute numbers, I have a reasonable level of confidence that the ordering of the results is correct.
Method hashing - MethodRegistry#getHash (ConcurrentHashMap cache disabled)
Invocation registering - InvocationRegistry#register
In the following benchmarks, 'not in buffer' refers to the state of the invocations buffer immediately upon agent startup, whereas 'reset buffer' refers to the state immediately after invocation data is published and the buffer is cleared.
*This was my initial alternative implementation (44c2418). I ended up settling on a slightly different approach, which offers better 'reset buffer' performance and is arguably simpler.
Overall - InvocationTracker#onInvocation
In a real-world scenario, the vast majority of invocations fall into the 'hash cached, in buffer' scenario, where the latency is minimal irrespective of the optimisations. However, the latency spikes on application startup and momentarily after every publication. Notably, the worker thread is also eliminated, leading to further indirect performance gains. In complex web applications with many tracked methods, there can be 10,000+ tracked invocations per served request. This can add up to a delay on the order of milliseconds, so these optimisations should measurably reduce the worst-case overhead of the agent.
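As a rough worked example of how per-invocation latency adds up across a request, the sketch below multiplies an invocation count by an assumed per-invocation cost. The 100 ns figure is purely illustrative and is not one of the benchmark results; `LatencyBudgetSketch` is a hypothetical helper.

```java
// Hypothetical helper; the per-invocation cost passed in is an assumption, not a measurement.
final class LatencyBudgetSketch {
    /** Total added latency in milliseconds for a number of tracked invocations. */
    static double totalMillis(long invocations, double nanosPerInvocation) {
        return invocations * nanosPerInvocation / 1_000_000.0;
    }

    public static void main(String[] args) {
        // 10,000 tracked invocations at an assumed 100 ns each.
        System.out.println(totalMillis(10_000, 100.0)); // prints 1.0
    }
}
```

At that assumed cost, 10,000 tracked invocations add about one millisecond to a single request.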