ENH: Add conda subpackages corresponding to pip extras #52490
Comments
(I am a beginner, it's my first issue) I am not sure what to do exactly
Sorry, but I am -1 on this. I think we have way too many groups for this to be sustainable. I would rather conda actually add proper support for this (xref conda/conda#11053). I'll leave this issue open, though, in case other people want to give feedback.
Thanks for your comment @lithomas1, but I'm not sure that waiting for conda to implement some significant new functionality is a viable alternative when there is a well-functioning, widely adopted alternative of using conda subpackages. Besides, I don't see that proposal avoiding duplication, since you need to manually map PyPI packages to conda ones anyway. And there's no single source of truth for the minimum dependency versions anyway, since they're also duplicated in the docs: https://github.com/pandas-dev/pandas/blob/main/doc/source/getting_started/install.rst.

Like you, I also don't like the duplication between the pyproject file and the conda recipe, but much more than that I don't like having to go to the pandas docs every time I start a new project, work out the minimum versions of all of the dependencies I want, and then write this in my environment.yml file:

- pandas >=2.0.0
- bottleneck >=1.3.4
- numba >=0.55.2
- numexpr >=2.8.0
- pyarrow >=7.0.0
- matplotlib-base >=3.6.1 # Note this is not matplotlib which includes more optional dependencies

when I could just write this instead:

- pandas-performance >=2.0.0
- pandas-parquet >=2.0.0
- pandas-plot >=2.0.0

Nor is ignoring conda users a good idea. The pip extras were added in pandas 2.0 for a very good reason: it saves a lot of people time and makes it much more user-friendly. #39164. A more valuable package manager change would be to allow
Thanks for the feedback. I don't think my opinion has changed, though.
This doesn't solve the issue, but you can try looking through our CI env files (e.g. https://github.com/pandas-dev/pandas/blob/main/ci/deps/actions-38.yaml; they are all under the ci/deps folder). I believe all dependencies there specify a minimum version (sans a couple).
My main gripe is that this clutters up the conda-forge channel (xref conda-forge/conda-forge.github.io#1558).
I understand that this is annoying, but I don't think it's reasonable or sustainable to ask every project with conda packages and pip extras to hack around the issue like this. This is not a pandas-specific problem, but a conda problem, and I would like it fixed in the right place.
FWIW, I still think that this is a worthwhile problem to solve and that conda subpackages is the best solution. But I'm not sure there's ever going to be a solution to the duplication of information problem. As I understand it, subpackages is the conda version of pip extras, rather than just being a workaround. i.e. it's no use wishing that you could write |
No, you're right, conda has no concept of extras, only separate outputs. There are efforts to change this, but even if this were to land tomorrow, we wouldn't be able to rely on this for a while yet. In practice though, matching the extras in the conda-forge feedstock is not any more work (after the initial setup) than keeping the requirements in sync with whatever's specified in |
The way I look at it, conda subpackages are the same concept as pip extras -- or at least you can implement the same concept as pip extras using conda subpackages. Just because they have different names and superficial syntax doesn't mean they aren't the same concept. That's just my opinion, but I think it matches @xhochy's: conda/ceps#55 (comment).
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
It would be good to be able to install extras along with pandas using conda as well as pip, since 2.0.0. For example:
should be equivalent to
This matches lots of other packages, such as matplotlib, seaborn, dvc, black, etc. e.g. https://github.com/conda-forge/matplotlib-feedstock/blob/main/recipe/meta.yaml, https://dvc.org/doc/install/linux#install-with-conda, https://github.com/conda-forge/black-feedstock/blob/main/recipe/meta.yaml
Feature Description
Use subpackages in https://github.com/conda-forge/pandas-feedstock/
Alternative Solutions
Current situation:
At the start of every project using conda (or when updating the requirements), the user must find the right part of the pandas docs, read it to work out the correct minimum version of optional dependencies they need, map the pypi package names to conda ones and then add those explicitly to their environment.yml file.
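
To make the manual step concrete, here is a minimal sketch of the mapping a user currently has to do by hand. It is only an illustration: the `EXTRAS_TO_CONDA` table and both function names are hypothetical, and the version pins are the ones quoted in the comments on this issue, not an authoritative list.

```python
# Hypothetical mapping from pandas pip extras to conda package specs.
# The pins are copied from the environment.yml example in the comments;
# check the pandas install docs for the current minimum versions.
EXTRAS_TO_CONDA = {
    "performance": ["bottleneck >=1.3.4", "numba >=0.55.2", "numexpr >=2.8.0"],
    "parquet": ["pyarrow >=7.0.0"],
    # conda name is matplotlib-base, not matplotlib
    "plot": ["matplotlib-base >=3.6.1"],
}


def conda_dependencies(pandas_spec, extras):
    """Expand a pandas spec plus a list of extras into explicit conda specs."""
    deps = [pandas_spec]
    for extra in extras:
        for spec in EXTRAS_TO_CONDA[extra]:
            if spec not in deps:  # keep each spec only once
                deps.append(spec)
    return deps


def render_environment(name, deps):
    """Render a minimal environment.yml as text."""
    lines = [f"name: {name}", "dependencies:"]
    lines += [f"  - {dep}" for dep in deps]
    return "\n".join(lines)


if __name__ == "__main__":
    deps = conda_dependencies("pandas >=2.0.0", ["performance", "parquet", "plot"])
    print(render_environment("demo", deps))
```

With subpackages, this table would live once in the feedstock recipe instead of being rebuilt by every user in every project.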
Additional Context
Suggest defining both pandas-base (or -core) to match pandas exactly, then pandas that just depends on pandas-base but could be expanded with recommended but not mandatory dependencies, plus all of the non-development and non-complete extras from pyproject.toml.

I can work on this PR for the pandas feedstock if it's welcome.
update: Added alternative solution to describe what currently happens.