Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force loading tensorflow shared libraries to prevent segfault #290

Merged
merged 4 commits into from
Feb 9, 2021

Conversation

diegoferigo
Copy link
Collaborator

@diegoferigo diegoferigo commented Feb 9, 2021

The segfault is a know problem and it is already documented in the FAQs:

https://github.com/robotology/gym-ignition/blob/321df6a2f93da829682ead2f428022ec1c4ef416/docs/sphinx/info/faq.rst#L4-L18

This PR is an attempt to prevent the occurrence, similar to apache/arrow#2210. Additional information can be found in #279 (comment).

Note that loading only the shared libraries is much faster than loading the entire tensorflow package.

Copy link
Contributor

@traversaro traversaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok for me. Do you know if there is any python way for permit users to opt out/disable this workaround?

@diegoferigo
Copy link
Collaborator Author

I think we can use an ad-hoc environment variable, such as SCENARIO_DISABLE_TENSORFLOW_PRELOAD. It's a good idea having it in place.

@diegoferigo
Copy link
Collaborator Author

diegoferigo commented Feb 9, 2021

Ouch, something's wrong in CI with idyntree and numpy:

RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/idyntree/__init__.py", line 1, in <module>
    from . import swig
  File "/usr/local/lib/python3.8/dist-packages/idyntree/swig.py", line 13, in <module>
    from . import _iDynTree
ImportError: numpy.core.multiarray failed to import

Edit: I have the feeling that iDyntree is compiled against a new version of numpy is its isolated environment (that is clean and temporary), but outside numpy is older. This can be caused by some of the dependencies of the packages installed in the active Python environment having a <= constraint of numpy.

Similar issues freqtrade/freqtrade#4281, https://github.com/cvxgrp/cvxpy/issues/1229.

I think that we can either change idyntree to have a <1.20, or gym-ignition to have >=1.20. I'd tend towards the latter.
The problem seems instead to be a mismatch of the numpy version when generating the idyntree wheel from its sdist. Before doing this, in CI the package python3-numpy (old version) was installed through apt. The wheel however is built against a recent version on numpy installed in the temporary isolated environment. Then, when the wheel is installed in the system, numpy is not pulled from PyPI because there's already the system version. Solved in 21becc5.

@diegoferigo diegoferigo force-pushed the fix/prevent_tensorflow_segfault branch from 3317133 to e3ea127 Compare February 9, 2021 14:57
@diegoferigo diegoferigo force-pushed the fix/prevent_tensorflow_segfault branch from e3ea127 to 21becc5 Compare February 9, 2021 15:33
@diegoferigo
Copy link
Collaborator Author

diegoferigo commented Feb 9, 2021

CI is still failing, this time in the IK test.

The CI is building the wheel from the devel branch of idyntree. The recent robotology/idyntree#822 (cc @GiulioRomualdi) introduced a new variant of KinDynComputations::setCurrentRobotConfiguration, and in this context SWIG removes the keyword arguments from the bindings.

cba57f6 should solve the problem.

@diegoferigo diegoferigo merged commit 33cb234 into devel Feb 9, 2021
@diegoferigo diegoferigo deleted the fix/prevent_tensorflow_segfault branch February 9, 2021 17:24
@GiulioRomualdi
Copy link

CI is still failing, this time in the IK test.

The CI is building the wheel from the devel branch of idyntree. The recent robotology/idyntree#822 (cc @GiulioRomualdi) introduced a new variant of KinDynComputations::setCurrentRobotConfiguration, and in this context SWIG removes the keyword arguments from the bindings.

cba57f6 should solve the problem.

I'm sorry 😔

@diegoferigo
Copy link
Collaborator Author

@GiulioRomualdi Using kwargs is not mandatory, but it's useful to provide more context that often would require referring to the documentation. The price to pay is a more tight alignment with upstream that occasionally triggers minor problems like this one. No problem at all ;) Actually the span-based methods could be quite useful also to Python, but I fear that for exploiting them the most, pybind11 is necessary (and here we're based on SWIG).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants