Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to respect RPATHS of calling dlopen modules with dlinfo #549

Merged
merged 1 commit into from
Sep 9, 2024

Conversation

pablogsal
Copy link
Member

This commit is just for backup on our main strategy of using dlinfo
instead of dlopen.

Issue number of the reported bug or feature request: #

Describe your changes
A clear and concise description of the changes you have made.

Testing performed
Describe the testing you have performed to ensure that the bug has been addressed, or that the new feature works as planned.

Additional context
Add any other context about your contribution here.

@codecov-commenter
Copy link

codecov-commenter commented Feb 13, 2024

Codecov Report

Attention: Patch coverage is 92.30769% with 3 lines in your changes missing coverage. Please review.

Project coverage is 92.89%. Comparing base (5d9e04d) to head (7e8b4b1).

Files with missing lines Patch % Lines
src/memray/_memray/hooks.cpp 92.30% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #549      +/-   ##
==========================================
- Coverage   92.99%   92.89%   -0.10%     
==========================================
  Files          94       94              
  Lines       11445    11470      +25     
  Branches     2114     2114              
==========================================
+ Hits        10643    10655      +12     
- Misses        802      815      +13     
Flag Coverage Δ
cpp 92.89% <92.30%> (-0.10%) ⬇️
python_and_cython 92.89% <92.30%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pablogsal pablogsal force-pushed the dlopen-round-2 branch 2 times, most recently from 7fe623e to bd0a8bf Compare February 13, 2024 15:17
@godlygeek
Copy link
Contributor

We've decided to keep this option in our back pocket, but for now we're going to move forward with interposing dlsym instead of dlopen.

@godlygeek godlygeek closed this Jul 3, 2024
@pablogsal pablogsal reopened this Aug 23, 2024
@pablogsal pablogsal marked this pull request as ready for review September 2, 2024 09:29
@pablogsal pablogsal force-pushed the dlopen-round-2 branch 2 times, most recently from adeeb60 to a81955c Compare September 3, 2024 12:41
@pablogsal pablogsal requested a review from godlygeek September 3, 2024 15:11
src/memray/_memray/hooks.cpp Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
src/memray/_memray/hooks.cpp Outdated Show resolved Hide resolved
@godlygeek
Copy link
Contributor

One downside I just noticed for this approach is that we're screwing up the contract for dlerror(): we're making a bunch of dlopen and dlsym and dlclose calls that the application doesn't know about, which could result in the return value from dlerror being clobbered by one of our internal attempts, rather than the last attempt the user made.

I'm not sure that's worth trying to work around, but it's worth noting, at least...

@pablogsal
Copy link
Member Author

One downside I just noticed for this approach is that we're screwing up the contract for dlerror(): we're making a bunch of dlopen and dlsym and dlclose calls that the application doesn't know about, which could result in the return value from dlerror being clobbered by one of our internal attempts, rather than the last attempt the user made.

We can clear the possible error before calling the real dlopen if you want by calling dlerror() ourselves as this is documented to clean existing errors when called: https://github.com/bminor/glibc/blob/96d0bf98cafd0b63721f369ca21ec64590551d47/dlfcn/dlerror.c#L120

@pablogsal
Copy link
Member Author

pablogsal commented Sep 4, 2024

@godlygeek Btw, I made some experiments regarding the null values of paths->dls_serpath[i].dls_name. Looks like empty entries are already resolved to ".":

LD_LIBRARY_PATH=":a:b" python -m pytest tests/integration/test_extensions.py -v -s -k test_dlopen_with_rpath
...
...
Num paths 8
Checking .
Checking a
Checking b
Checking /tmp/pytest-of-root/pytest-23/test_dlopen_with_rpath0/sharedlibs/sharedlibs
Checking /lib/aarch64-linux-gnu
Checking /usr/lib/aarch64-linux-gnu
Checking /lib
Checking /usr/lib

same if you do -Wl,-rpath,:a:$ORIGIN/sharedlibs:

Num paths 7
Checking .
Checking a
Checking /tmp/pytest-of-root/pytest-24/test_dlopen_with_rpath0/sharedlibs/sharedlibs
Checking /lib/aarch64-linux-gnu
Checking /usr/lib/aarch64-linux-gnu
Checking /lib
Checking /usr/lib

maybe we should bail out like I did before on nullptr.

@godlygeek
Copy link
Contributor

godlygeek commented Sep 4, 2024

maybe we should bail out like I did before on nullptr.

I don't think it's possible to get paths->dls_serpath[i].dls_name == nullptr, so I don't think it matters too much what we do, beyond that we shouldn't crash if it somehow happens. I'd suggest we just continue and ignore that entry, but treating it the same as "" seems fine too.

Checking the glibc code, it looks like it never assigns a null pointer to dls_name, and the docs don't suggest that it ever could. In fact (in retrospect unsurprisingly, considering there's no call to free this structure and dlclose could be called asynchronously from another thread) every dls_name is assigned to some pointer into the paths buffer. That is, RTLD_DI_SERINFOSIZE sets dls_size to a size that's big enough not only for an array of dls_cnt entries of type dls_serpath, but also for dls_cnt paths following the dls_serpath array, all concatenated together with '\0' bytes between them, and paths->dls_serpath[i].dls_name is set to the i'th string in that buffer.

@pablogsal
Copy link
Member Author

I don't think it's possible to get paths->dls_serpath[i].dls_name == nullptr, so I don't think it matters too much what we do, beyond that we shouldn't crash if it somehow happens. I'd suggest we just continue and ignore that entry, but treating it the same as "" seems fine too.

I pushed a fixup for that

This commit is just for backup on our main strategy of using dlinfo
instead of dlopen.

Signed-off-by: Pablo Galindo <[email protected]>
Copy link
Contributor

@godlygeek godlygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I've squashed the fixup and slightly reworded the news entry.

@pablogsal pablogsal merged commit 8a95e51 into bloomberg:main Sep 9, 2024
18 checks passed
@pablogsal pablogsal deleted the dlopen-round-2 branch September 9, 2024 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants