Models created with different scikit-learn versions break compatibility #1195

Open
mireianievas opened this issue Dec 15, 2023 · 8 comments
Labels
enhancement New feature or request invalid This doesn't seem right

Comments

@mireianievas
Collaborator

Can anyone cross-check whether a clean mamba create installs scikit-learn=1.3.2? In my case it does, even though the environment file in the main version pins it to 1.2. The result is a non-working lstchain environment.

Steps:

mamba create -c conda-forge -n lstchain-v0.10.5 python=3.11 lstchain=0.10.5

Errors I got with that version of sklearn when running lstchain_dl1_to_dl2:

/home/mireia.nievas/workspace/software/miniconda/envs/lstchain-v0.10.5/lib/python3.11/site-packages/sklearn/base.py:348: InconsistentVersionWarning: Trying to unpickle estimator DecisionTreeRegressor from version 1.2.2 when using version 1.3.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
  warnings.warn(
Traceback (most recent call last):
  File "/home/mireia.nievas/workspace/software/miniconda/envs/lstchain-v0.10.5/bin/lstchain_dl1_to_dl2", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/mireia.nievas/workspace/software/miniconda/envs/lstchain-v0.10.5/lib/python3.11/site-packages/lstchain/scripts/lstchain_dl1_to_dl2.py", line 270, in main
    models_dict = {'reg_energy': joblib.load(Path(args.path_models, 'reg_energy.sav'))}
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(  ...  )

ValueError: node array from the pickle has an incompatible dtype:
- expected: {'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity', 'n_node_samples', 'weighted_n_node_samples', 'missing_go_to_left'], 'formats': ['<i8', '<i8', '<i8', '<f8', '<f8', '<i8', '<f8', 'u1'], 'offsets': [0, 8, 16, 24, 32, 40, 48, 56], 'itemsize': 64}
- got     : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]
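
For illustration, the two field lists in the error can be compared directly. This is only a sketch that copies the names from the message above (it does not touch scikit-learn itself); it shows that the pickled tree nodes lack one field that the installed version expects.

```python
# Field names copied from the "expected" and "got" lines of the ValueError above.
expected_fields = [
    "left_child", "right_child", "feature", "threshold", "impurity",
    "n_node_samples", "weighted_n_node_samples", "missing_go_to_left",
]
pickled_fields = [
    "left_child", "right_child", "feature", "threshold", "impurity",
    "n_node_samples", "weighted_n_node_samples",
]

# Fields the installed scikit-learn expects but the pickled model does not carry.
missing_from_pickle = [name for name in expected_fields if name not in pickled_fields]
print(missing_from_pickle)
```

So the node layout itself differs between the version that wrote the model (1.2.2) and the one reading it, which is why the load fails outright instead of only emitting the warning.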
@vuillaut
Member

@mireianievas
Collaborator Author

Ok, strange.

@morcuended
Member

mamba create -c conda-forge -n lstchain-v0.10.5 python=3.11 lstchain=0.10.5

is actually giving me problems on both Linux and macOS:

Could not solve for environment specs
The following package could not be installed
└─ lstchain 0.10.5**  is uninstallable because it requires
   └─ python-blosc2  , which does not exist (perhaps a missing channel).

@maxnoe
Member

maxnoe commented Dec 18, 2023

The package definitely exists:
https://anaconda.org/conda-forge/python-blosc2/files?version=2.3.2

@vuillaut
Member

docker run -it mambaorg/micromamba
micromamba create -c conda-forge -n lstchain-v0.10.5 python=3.11 lstchain=0.10.5
micromamba activate lstchain-v0.10.5
python -c 'import lstchain; print(lstchain.__version__)'
0.10.5

works fine

@mireianievas
Collaborator Author

OK, let's close then.

@morcuended
Member

The package definitely exists:
https://anaconda.org/conda-forge/python-blosc2/files?version=2.3.2

Just a correction on this: it actually worked fine on Linux, but not on macOS (my bad!).

@morcuended morcuended reopened this Nov 20, 2024
@morcuended
Member

The problem is still there.

setup.py requires 'scikit-learn~=1.2', which means the latest available 1.x release (>=1.2, <2), not 1.2.x, while the env yml file installs scikit-learn=1.2 (1.2.2, more precisely). The CI uses the env yml to set up the environment. Later it runs pip install ., but since it finds 1.2.2 already installed, it does not do anything else.
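
To make the difference concrete, here is a minimal sketch of the compatible-release rule ('~=') for plain numeric versions. It is a simplified re-implementation for illustration, not the code pip uses, and the helper names parse_version and is_compatible_release are made up for this example.

```python
def parse_version(text):
    """Turn a plain numeric version such as '1.2.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_compatible_release(spec_version, candidate):
    """Simplified check for a compatible-release clause '~=spec_version'.

    '~=X.Y'   accepts any X.* release at or above X.Y.
    '~=X.Y.Z' accepts any X.Y.* release at or above X.Y.Z.
    """
    base = parse_version(spec_version)
    if len(base) < 2:
        raise ValueError("'~=' needs at least two version components")
    cand = parse_version(candidate)
    # Pad with zeros so that '1.2' and '1.2.0' compare as equal.
    width = max(len(base), len(cand))
    base_padded = base + (0,) * (width - len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    # Every component of the spec except the last must match exactly.
    prefix = base[:-1]
    return cand_padded >= base_padded and cand_padded[:len(prefix)] == prefix


for spec in ("1.2", "1.2.2"):
    for candidate in ("1.2.2", "1.3.2", "1.5.2"):
        accepted = is_compatible_release(spec, candidate)
        print(f"scikit-learn~={spec} accepts {candidate}: {accepted}")
```

With two components ('~=1.2') the versions 1.3.2 and 1.5.2 from the reports above are both accepted; with three components ('~=1.2.2') only 1.2.x releases are.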

Models created with scikit-learn 1.2.2 cannot be read with newer versions:

In the dl1_to_dl2 command:

/fefs/aswg/software/conda/envs/foa/lib/python3.9/site-packages/sklearn/base.py:376: InconsistentVersionWarning: Trying to unpickle estimator DecisionTreeRegressor from version 1.2.2 when using version 1.5.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations

...

ValueError: node array from the pickle has an incompatible dtype:
- expected: {'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity', 'n_node_samples', 'weighted_n_node_samples', 'missing_go_to_left'], 'formats': ['<i8', '<i8', '<i8', '<f8', '<f8', '<i8', '<f8', 'u1'], 'offsets': [0, 8, 16, 24, 32, 40, 48, 56], 'itemsize': 64}
- got     : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]

So I think we should pin the sklearn version in setup.py.
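
Independently of the pin, a small guard before loading the models could turn the dtype error into a readable message. This is only a sketch: check_sklearn_version is a hypothetical helper, not something lstchain provides, and the default "1.2.2" is taken from the warning above as the version the models were trained with.

```python
def check_sklearn_version(installed, trained_with="1.2.2"):
    """Raise a clear error when the major.minor versions differ.

    `installed` is the scikit-learn version found in the environment,
    `trained_with` the version that produced the pickled models
    (assumed here to be 1.2.2, as reported in the warning).
    """
    installed_mm = tuple(installed.split(".")[:2])
    trained_mm = tuple(trained_with.split(".")[:2])
    if installed_mm != trained_mm:
        raise RuntimeError(
            f"Models were trained with scikit-learn {trained_with}, "
            f"but scikit-learn {installed} is installed; "
            "loading them is likely to fail."
        )


# Versions that appear in this thread.
for version in ("1.2.2", "1.3.2", "1.5.2"):
    try:
        check_sklearn_version(version)
        print(f"{version}: ok")
    except RuntimeError as err:
        print(f"{version}: {err}")
```

In a real script the installed version could be obtained with importlib.metadata.version("scikit-learn") and the check run before the joblib.load calls.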

@morcuended morcuended added enhancement New feature or request invalid This doesn't seem right labels Nov 20, 2024
@morcuended morcuended changed the title 'mamba create' generates an environment with scikit-learn=1.3.2, which breaks compatibility Models created with different scikit-learn versions break compatibility Nov 20, 2024