-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SharedCache] Fix slide info parsing #6172
base: dev
Are you sure you want to change the base?
Conversation
The issue this commit fixes was causing `SharedCache::ParseAndApplySlideInfoForFile` to completely fail to work with v5 slide info, which had a lot of knock on effects, i.e. lots of Objective-C analysis was failing due to invalid pointers which hadn't been fixed up. `dyld_cache_slide_info5` has 4 bytes of padding before `value_add`. Whilst `value_add` is not actually being read from, `SharedCache::ParseAndApplySlideInfoForFile` will read at a location in the file based on the size of the structure `dyld_cache_slide_info5`. This being off by 4 bytes basically broke v5 slide info fixups. With this fix many more Objective-C functions have names and a lot more `msgSend` calls are fixed up.
Prior to this commit the function `DSCObjCProcessor::PostProcessObjCSections` never does anything because it doesn't use the correct names to get the Objective-C sections of the recently loaded library. In fact it never does anything because the DSC never has sections with the names its searching for. This commit passes the `baseName` (the name of the library that was loaded), which is what other Objective-C section processing code does. Combining the base name with the section names it will now find them and process them as intended. This was resulting in alot of Objective-C related stuff being missed. There is however still an issue of the fact that the way this DSC plugin works means it only analyzes Objective-C sections once. This catches alot of things but there are a number of cases where other libraries need to be loaded first due to information being referenced in another library. For instance errors like `Failed to determine base classname for category` can be caused by the class reference in the category being to a class outside of the loaded library. Once the library containing the class has been loaded, the section containing the category should be re-proccessed.
I forgot there was another issue with Objective-C processing that meant a bunch was being skipped. This is fixed by the 2nd commit. Something which I didn't fully touch on in the commit is the issue of processing Objective-C sections which may require data from another image that is not yet loaded. I think the naive but simplistic approach would be to go through all Objective-C sections after loading an image and re-processing all of them. I think this would solve the problem but it would be quite costly, particularly if you're loading alot of sections. In that case it would make sense to have the ability to batch load images and then defer Objective-C processing until the end. Additionally this may mess with changes that the user has made, although I think its unlikely the user would be messing about with Objective-C sections unless they're using a script, in which case they could re-run their script. What probably makes the most sense in the long run is to keep track of any bits of Objective-C data that failed analysis and then re-analyze those bits after loading of images. |
I'm not 100% sure this is enough/or correct, I still had issues with v5 slides at least. I've gone and reworked the slide parsing code and think I've got it fixed. The way "starts" were pulled out into a vector and then processed below in another loop seemed to be making things difficult and disjointed for no visible benefit. Instead I'm processing "starts" in the loops where they are read and largely mimicking exactly what |
There seem to be a number of issues with slide info parsing. I decided the simplest fix was to largely mimick what `dyld` is doing. Part of this process was to remove creating a vector of page starts and processing them in another loop below the one where they were read out. It wasn't clear to me the original design decision to separate it into 2 loops like that. I think this was part of the problem that was causing issues. By adding rewrites in the same loop where page starts are being read out, it was much easier to mimick the code in `dyld` which I assume has to be correct. So as long as my copying was correct then I believe this should work as intended. Not everything has been thoroughly tested but I'm pretty confident v3 and v5 are now working as intended. v2 should be but less testing of it has been done.
The latest commit re-writes some of the slide parsing stuff. More details in the commit message but I think at this point it might be working. Not everything has been thoroughly tested but I'm pretty confident v3 and v5 are now working as intended. v2 should be but less testing of it has been done. |
I did some testing against the macOS 15.1 dyld shared cache I was seeing problems with yesterday. I've confirmed the slide handling exactly matches what dyld is doing, so this change looks good. Sadly I'm still seeing bogus Obj-C selector names. I'm guessing that's a result of Obj-C data being processed that wasn't before and that there's a pre-existing bug in parsing Obj-C metadata that wasn't visible due to the data being skipped. I'll look into that separately. |
Does this happen to be a result of this comment on issue 6173? |
This reverts commit 1539f3d.
I've removed the fixes for the Objective-C code because it didn't make sense to be part of this pull request. I'll re-add it in another one which is focussed on some Objective-C changes |
The issue this commit fixes was causing
SharedCache::ParseAndApplySlideInfoForFile
to completely fail to work with v5 slide info, which had a lot of knock on effects, i.e. lots of Objective-C analysis was failing due to invalid pointers which hadn't been fixed up.dyld_cache_slide_info5
has 4 bytes of padding beforevalue_add
. Whilstvalue_add
is not actually being read from,SharedCache::ParseAndApplySlideInfoForFile
will read at a location in the file based on the size of the structuredyld_cache_slide_info5
. This being off by 4 bytes basically broke v5 slide info fixups.With this fix many more Objective-C functions have names and a lot more
msgSend
calls are fixed up.