Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: increase efficiency by using coroutines #322

Draft
wants to merge 7 commits into
base: dev
Choose a base branch
from

Conversation

Axelen123
Copy link
Member

This PR contains ideas I got while looking at discussions and code related to the perf/parallel-patch-execution branch.

Changes

  • Fingerprint matching and class searches are now done in parallel
  • IO operations are now performed on IO threads
  • Changed the Patch API to use coroutines

TODOs

  • Fix resource patch data races
  • Improve API
  • Cleanup

@LisoUseInAIKyrios
Copy link
Contributor

LisoUseInAIKyrios commented Nov 10, 2024

So everything is still in order and single threaded, except when each patch is resolving a fingerprint it's that single fingerprint resolving that runs in parallel?

This should prevent all concurrency issues and the patches code won't need any changes or need to care about writing proper multi-threaded code.

What is the resource patch data race issue?

@LisoUseInAIKyrios
Copy link
Contributor

I'm trying this locally, but it looks like the patches are running in parallel?

I'm seeing the patches are not finishing in alphabetical order. If the patching is done in parallel then patching errors can occur non-deterministically (ie: at random), because multiple patches can modify the same method in an interleaving fashion. Where one patch is getting an insert index, and then another patch adds code that changes that insertion index, and the first patch then uses the wrong index which is now pointing to the wrong instruction.

I'm only seeing a minor speedup with this change.

Without this change, it takes 70 seconds total patching time from start to finish (including installing the patched app).
With this change I'm seeing 65 seconds total time.

5 seconds of improvement is less than a 10% improvement and seems too little to notice. Slower devices would see a larger speedup, but it still would not be substantial.

@LisoUseInAIKyrios
Copy link
Contributor

An unrelated idea to speed up patching, is to use KMP for the opcode pattern matching.

KMP is ideal since it changes searching to O(N) instead of the current worse case O(N^2). But it really shines when the search target has a limited set of characters and many repeating patterns, such as the repeating and common opcode patterns generated by the compiler.

That alone would most likely give even better performance than trying to parallelize patching.

@Axelen123
Copy link
Member Author

Axelen123 commented Nov 17, 2024

The patcher has a coroutine-aware lock to prevent execution of other bytecode patches until the current one has finished. Resource patches are still being executed on multiple threads, but that will be changed later. On my PC, patch execution time goes from 9,5 seconds to about 5,6. Phones will probably benefit more from this, although I have not tested it yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants