Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmake_with_sdist equivalent in scikit-build-core #392

Closed
MuellerSeb opened this issue Jun 21, 2023 · 20 comments
Closed

cmake_with_sdist equivalent in scikit-build-core #392

MuellerSeb opened this issue Jun 21, 2023 · 20 comments

Comments

@MuellerSeb
Copy link

Is there something similar to the legacy cmake_with_sdist in scikit-build-core (beside the setuptools backend)?

An if so (or if planed), is there a CMake variable to indicate we are currently building sdist?

Cheers,
Sebastian

@henryiii
Copy link
Collaborator

See #319 (which is about the backward compat option, but I'd expect a pyproject.toml equivalent to be available about the same time), also #249, which discusses the need. I think the main issue is coming up with a good design. Would love as many examples of usages of this as possible, I only have two so far (in #319).

I don't have custom variables to indicate what state scikit-build-core is in, but that's a good idea. You could also detect wheel vs. metadata preparation vs. editable install with a variable like that.

@MuellerSeb
Copy link
Author

Cool to hear. I think variables like ${SKBUILD_STATE_SDIST} would be cool to do stuff like this.
For me this (see #249) is the only blocker for switching to scikit-build-core ATM.

@henryiii
Copy link
Collaborator

henryiii commented Jun 21, 2023

Do you have a pre-switch example? I'd like to have examples of what this looks like in practice to have a good idea of what to design.

I'd probably go with ${SKBUILD_STATE} which would be sdist (when that's added), metadata_wheel, wheel, "metadata_editable", or editable.

@MuellerSeb
Copy link
Author

Here:

This is a custom sdist command with scikit-build. Would like to do that downloading in CMake.

PS: editable could then be used to infer the correct install location, right?

@henryiii
Copy link
Collaborator

henryiii commented Jun 21, 2023

FWIU, that's not actually running CMake when making the SDist, but downloading a tarball?

Ahh, yes, "Would like to do that downloading in CMake". FWIW, you can do this with git submodules today with scikit-build-core.

Okay, thanks, good to have a data point.

henryiii added a commit that referenced this issue Jun 21, 2023
@MuellerSeb
Copy link
Author

MuellerSeb commented Jun 22, 2023

git submodules is of course an option, but we are relying on CPM.cmake, so I think knowing the skbuild-state is sdist to copy the downloaded folder could be convenient.

@LecrisUT
Copy link
Collaborator

Is the issue that skbuild does not run cmake before sdist creation? Otherwise it should be possible with a combination of setting the cmake source location and sdist.include option right?

@henryiii
Copy link
Collaborator

Currently CMake does not run (and isn't even required) when making an SDist. It's only run for the wheel step. The tricky part here is figuring out what you'd do with CMake during the SDist step. You don't want the config output build directory in the SDist, but you do want the downloaded packages (which are in the build directory as output...)

Examples I see of this currently with scikit-build (classic) generally have a lot of python dedicated to copying in an overloaded sdist command.

@henryiii
Copy link
Collaborator

For CPM.cmake, maybe we could integrate with them, and their CMake files could be even distributed in a package. Maybe we should target solving that specific case first, then a more general solution might be more obvious.

@LecrisUT
Copy link
Collaborator

The download location is configurable via a cache variable FETCHCONTENT_BASE_DIR, and the download is done at configure time. Similar case is for cpm. So as long as a hook is added to run configure, it should be good to go right? Although having a sdist.artifact would also be beneficial in this case

@henryiii
Copy link
Collaborator

Hmm, we could save {build_dir}/deps in the SDist then populate the build directory with it when making the wheel. As long as it's fine to delete all of the other files & still be able to build without internet (which would be the reason to put it into the SDist). I believe that's exactly what it does if there's existing content there.

I'd also want to avoid saving the config output of the fetch content, as the environment when making the SDist is not the environment when making the wheel.

@LecrisUT
Copy link
Collaborator

Why not have the user setup dedicated folder in the source directory? That way it does not have to be relocated, and if the setting is in pyproject.toml, then it is reproducible at the wheel stage without internet connection. The folder structure might end up a bit weird, but that can be corrected by setting SOURCE_DIR in the individual package's configuration. So I think it's enough to simply implement the trigger to configure and a sdist.artifact. It shouldn't otherwise be a default to include the dependencies because of distribution licensing issues.

@henryiii
Copy link
Collaborator

It won't ever be a default since we won't ever default to running cmake when making an SDist. :) Making it too automatic, even when enabling CMake during the SDist step, might be something to keep in mind as possibly undesirable, though.

We may want to implement hatchling-style force-include first, that likely would be handy for this (as long as there was some way to point at the build directory). Actually, I wonder if force-include + running cmake in the SDist step would be enough? I guess it would need to go both directions... Okay, maybe sdist.artifact should be a dict - mapping an sdist path (as a key) to a build path (it still would share a lot of implementation with force-include).

@LecrisUT
Copy link
Collaborator

👍 on that approach, except for

Okay, maybe sdist.artifact should be a dict - mapping an sdist path (as a key) to a build path

The cmake author can and should setup the SOURCE_DIR in FetchContent_Declare and equivalent CPM_SOURCE_CACHE+CPM_USE_NAMED_CACHE_DIRECTORIES on cpm side. This way there is less potential of contamination with build files. (Although I forgot where cpm stores its build directory). This ensures that the configuration is buildable even without the explicit configurations in the python project.

Mapping it to and from the build directory is a bit non-standard and confusing as to which option is relative to build, which one is relative to source. And it avoids having to do the reverse of copying from the sdist to the build directory.

@henryiii
Copy link
Collaborator

It can't be done "without explicit configuration", since anything gitiginored will also be excluded by the SDist unless you explicitly include it in your configuration. But that sounds like a good plan, and is the simplest to implement on our side.

@MuellerSeb
Copy link
Author

The cmake author can and should setup the SOURCE_DIR in FetchContent_Declare and equivalent CPM_SOURCE_CACHE+CPM_USE_NAMED_CACHE_DIRECTORIES on cpm side. This way there is less potential of contamination with build files. (Although I forgot where cpm stores its build directory). This ensures that the configuration is buildable even without the explicit configurations in the python project.

For CPM I wouldn't hijack CPM_SOURCE_CACHE+CPM_USE_NAMED_CACHE_DIRECTORIES, since this would affect other packages as well (up/downstream).
I would use the DOWNLOAD_ONLY option in CPMAddPackage if ${SKBUILD_STATE} is sdist and then use the information of <dep>_SOURCE_DIR to copy the dependency-source to where I want it with cmake.

So for CPM this should be solved with the information ${SKBUILD_STATE} is sdist and the option to run cmake when creating the sdist.

For then using this local folder with CPM, see here: cpm-cmake/CPM.cmake#483

@LecrisUT
Copy link
Collaborator

LecrisUT commented Jun 27, 2023

For CPM I wouldn't hijack CPM_SOURCE_CACHE+CPM_USE_NAMED_CACHE_DIRECTORIES, since this would affect other packages as well (up/downstream).

Could you elaborate on that? Is your package such that the main cmake files (that would be used downstream) is from sdist instead of the git repo/archive? If not, then I don't think you need to worry about that. Simply define the variables only within scikit-build-core, e.g.:

[tool.scikit-build.cmake.define]
CPM_SOURCE_CACHE = "third_party"

Sidenote: downstream should not be using upstream's cpm imo because it does not the same overwriteability that FetchContent has, i.e. only the parent project dictates the dependencies (versions). But that's a problem of one-camp-over-another.

@MuellerSeb
Copy link
Author

@ LecrisUT Maybe you are right.

But I guess I would set CPM_SOURCE_CACHE from cmake depending on ${SKBUILD_STATE}.

So for me this would be solved by being able to run the cmake configure stage when building sdist.

@henryiii henryiii mentioned this issue Aug 9, 2023
1 task
henryiii added a commit that referenced this issue Aug 10, 2023
Working on #392.

- [x] Add to setuptools build_meta, too

For later PR:

```diff
+    libpl = sysconfig.get_config_var("LIBPL")
+    library = sysconfig.get_config_var("LIBRARY")
+
+    if libpl and library and Path(libpl).joinpath(library).is_file():
+        assert isinstance(libpl, str)
+        assert isinstance(library, str)
+        return Path(libpl) / library
```

Signed-off-by: Henry Schreiner <[email protected]>
@henryiii
Copy link
Collaborator

Done in #454.

@MuellerSeb
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants