-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic standard library support. #7216
Conversation
☔ The latest upstream changes (presumably #7215) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is some amazing progress @ehuss, can't thank you enough again for leading the charge here on this!
For future PRs
These are some amazingly detailed and thorough thougts. I'd want to make sure they're not lost! Are you figuring that these would become issues on the wg-std-aware Cargo repository after this PR is merged?
compiler_builtins may need to be built without debug assertions, see this. Could maybe use profile overrides.
This is also true of panic_unwind
and panic_abort
sorta. What we really need to do here is to delete src/bootstrap/bin/rustc.rs
in the Rust repository as much as possible. We should be far more aggressively leading on RUSTFLAGS
and for things like compiler-builtins/panic-abort we should be leaning more on profile overrides.
I think this is fine for now in general to be clear, but I'd like to see more config codified in rust-lang/rust's Cargo.toml
that we can reuse here in Cargo.
(reading your comments I think you and I are already thinking the same thing...)
It could maybe look at the profile to determine which panic implementation to use?
This matches my thinking as well. I think, however, we should probably "stabilize" an interface between Cargo/libstd about this. For example if libstd is built in unwinding mode it should activate everything necessary to get the panic_unwind
runtime working. If built in abort mode it should skip all that. Ideally I think we can basically get by without explicitly hardcoding anything about the panic_*
crates but just with libstd features in the long run.
Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future.
FWIW I think your refactoring of removing Resolve
from the backend was great here and it feels pretty clean to me. This seems like a reasonable-ish way to handle this going forward, even possibly through to stabilization.
Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.).
I'm curious to hear more about this, what else is needed for these platforms? I think like musl/wasi will have difficulty in the sense that they assume a C standard library is already built, but are you thinking of other issues? (I'm not sure what the difficulty is with windows-gnu for example)
Profile overrides cause weird linker errors.
Mind sending me a gist of what this looks like?
I went with the strategy of using --extern to link the standard lib.
I wrote this in the comments inline below as well, but I think that --extern
is probably the right solution, even in the long-term. I do think, however, that we may want to emulate --sysroot
by basically passing --extern
to all sysroot crates to all compilations which I think should solve some of the issues about hardcoding too much in Cargo perhaps?
How should this be tested?
This is a really good question!
I think I'd ideally like to see two forms of testing. One is that it actually uses the rust-lang/rust components and such and runs full integration tests. I think these tests can be pretty few and far between though.
Otherwise I think it'd be good if we can create a mock "libstd dependency set" for Cargo itself which we use in tests. Sort of like how we stub out the registry. Ideally these mock crates would just reexport the actual core/std crates in Rust itself (somehow? Maybe some magical test-only feature that makes them visible?) and would be empty crates otherwise to compile very quickly.
How to test that it is actually linking the correct standard library?
With the idea of a "mock root set up by Cargo" this would be easy enough to test perhaps?
I did not add __CARGO_DEFAULT_LIB_METADATA to the hash. I had a hard time coming up with a test case where it would matter.
This is fine I think to cover later, this feature is purely a "let's just smash things until it works" feature so if it works otherwise it's not needed.
Very dumb question: Why exactly does libstd include a dylib?
Otherwise it's impossible to produce dynamic programs at all in Rust (e.g. rustc wouldn't exist). It's a bad reason.
Rust, oh-so-long ago, only supported the dylib
crate type. Later rlibs were added and they were chosen as the default for crates. The compiler has always been dynamically linked, with librustc_driver.so
being shared between the rustc.exe
and rustdoc.exe
executables. Additionally librustc_driver.so
is used by rls.exe
nowadays too.
For almost all use cases other than the compiler itself having a dylib is basically extraneous and not helpful.
It's an extremely longstanding bug in Cargo that you can't configure crate types at build time. For std-aware cargo there's basically zero reason for Cargo to produce a dynamic library of the standard library.
/// Parse the `-Zbuild-std` flag. | ||
pub fn parse_unstable_flag(value: Option<&str>) -> Vec<String> { | ||
// This is a temporary hack until there is a more principled way to | ||
// declare dependencies in Cargo.toml. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely is fine for a first pass, it gets anything working! In the long run though I think we're going to want to rely on stable crate names like std
and core
(and maybe compiler_builtins
?) and then work backwards from them in the crate graph to figure out what else to link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely! Especially if we automatically pass every sysroot crate in, it won't be necessary.
spec_pkgs.push("test".to_string()); | ||
let spec = Packages::Packages(spec_pkgs); | ||
let specs = spec.to_package_id_specs(&std_ws)?; | ||
let features = vec!["panic-unwind".to_string(), "backtrace".to_string()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The panic-unwind
here is particularly tricky, but I think that long-term we're definitely going to want to select this based on the profile we're going to be compiling with. For example if we compile libstd with panic=abort we shouldn't use this crate, but if we do we should use the crate. Now knowing whether we're compiling with panic=abort is actually pretty tricky because it's defaulted to abort on a bunch of targets (like wasm).
Thanks for taking a look!
Yes. I wanted to hold off in case any of them became non-issues through the course of updating this PR.
Yep. My goal with this was to try to avoid needing to make any changes to rust-lang/rust at first. We can then start rolling out any changes there that will make this easier later.
Hm, you just made me think of something. When running
I'm not sure, I didn't even try. I'm just thinking of things like this and this and this. I'm not sure if that is actually needed, but I suspect there will be some rough edges for some platforms.
Here: https://gist.github.com/ehuss/6973e2fad5663f10592f71de08b3eff0 |
Oh dear, that's a good point! I think though we'd functionally want to do what libstd does today which is to compile everything with
Ah ok I see what you mean. I do think that those will cause issues and will be good to track! Of course not a blocker for this though
Hm ok, that does indeed look horrifyingly confusing. I'll try to help debug once this goes through as well |
matklad had an interesting idea for a related problem here: rust-lang/rust#63484 (comment) |
Hm so in theory optimization settings should affect symbol visibility to the point of causing linker errors. That's just theory though, and libstd is always a special child, so there's probably something wrong here or there. I would need to dig into it more locally though to figure it out. |
FWIW I'm personally feeling pretty good about landing this, so @ehuss I don't mind waiting if you've got blockers you'd prefer to handle first but I think it's otherwise fine to start opening issues and adding comments for what should be removed when and then r+ this |
Since its inception rustbuild has always worked in three stages: one for libstd, one for libtest, and one for rustc. These three stages were architected around crates.io dependencies, where rustc wants to depend on crates.io crates but said crates don't explicitly depend on libstd, requiring a sysroot assembly step in the middle. This same logic was applied for libtest where libtest wants to depend on crates.io crates (`getopts`) but `getopts` didn't say that it depended on std, so it needed `std` built ahead of time. Lots of time has passed since the inception of rustbuild, however, and we've since gotten to the point where even `std` itself is depending on crates.io crates (albeit with some wonky configuration). This commit applies the same logic to the two dependencies that the `test` crate pulls in from crates.io, `getopts` and `unicode-width`. Over the many years since rustbuild's inception `unicode-width` was the only dependency picked up by the `test` crate, so the extra configuration necessary to get crates building in this crate graph is unlikely to be too much of a burden on developers. After this patch it means that there are now only two build phasese of rustbuild, one for libstd and one for rustc. The libtest/libproc_macro build phase is all lumped into one now with `std`. This was originally motivated by rust-lang/cargo#7216 where Cargo was having to deal with synthesizing dependency edges but this commit makes them explicit in this repository.
…=Mark-Simulacrum bootstrap: Merge the libtest build step with libstd Since its inception rustbuild has always worked in three stages: one for libstd, one for libtest, and one for rustc. These three stages were architected around crates.io dependencies, where rustc wants to depend on crates.io crates but said crates don't explicitly depend on libstd, requiring a sysroot assembly step in the middle. This same logic was applied for libtest where libtest wants to depend on crates.io crates (`getopts`) but `getopts` didn't say that it depended on std, so it needed `std` built ahead of time. Lots of time has passed since the inception of rustbuild, however, and we've since gotten to the point where even `std` itself is depending on crates.io crates (albeit with some wonky configuration). This commit applies the same logic to the two dependencies that the `test` crate pulls in from crates.io, `getopts` and `unicode-width`. Over the many years since rustbuild's inception `unicode-width` was the only dependency picked up by the `test` crate, so the extra configuration necessary to get crates building in this crate graph is unlikely to be too much of a burden on developers. After this patch it means that there are now only two build phasese of rustbuild, one for libstd and one for rustc. The libtest/libproc_macro build phase is all lumped into one now with `std`. This was originally motivated by rust-lang/cargo#7216 where Cargo was having to deal with synthesizing dependency edges but this commit makes them explicit in this repository.
Since its inception rustbuild has always worked in three stages: one for libstd, one for libtest, and one for rustc. These three stages were architected around crates.io dependencies, where rustc wants to depend on crates.io crates but said crates don't explicitly depend on libstd, requiring a sysroot assembly step in the middle. This same logic was applied for libtest where libtest wants to depend on crates.io crates (`getopts`) but `getopts` didn't say that it depended on std, so it needed `std` built ahead of time. Lots of time has passed since the inception of rustbuild, however, and we've since gotten to the point where even `std` itself is depending on crates.io crates (albeit with some wonky configuration). This commit applies the same logic to the two dependencies that the `test` crate pulls in from crates.io, `getopts` and `unicode-width`. Over the many years since rustbuild's inception `unicode-width` was the only dependency picked up by the `test` crate, so the extra configuration necessary to get crates building in this crate graph is unlikely to be too much of a burden on developers. After this patch it means that there are now only two build phasese of rustbuild, one for libstd and one for rustc. The libtest/libproc_macro build phase is all lumped into one now with `std`. This was originally motivated by rust-lang/cargo#7216 where Cargo was having to deal with synthesizing dependency edges but this commit makes them explicit in this repository.
Basic standard library support. This is not intended to be useful to anyone. If people want to try it, that's great, but do not rely on this. This is only for experimenting and setting up for future work. This adds a flag `-Zbuild-std` to build the standard library with a project. The flag can also take optional comma-separated crate names, like `-Zbuild-std=core`. Default is `std,core,panic_unwind,compiler_builtins`. Closes rust-lang/wg-cargo-std-aware#10. Note: I can probably break some of the refactoring into smaller PRs if necessary. ## Overview The general concept here is to use two resolvers, and to combine everything in the Unit graph. There are a number of changes to support this: - A synthetic workspace for the standard library is created to set up the patches and members correctly. - Decouple `unit_dependencies` from `Context` to make it easier to manage. - Add `features` to `Unit` to keep it unique and to remove the need to query a resolver. - Add a `UnitDep` struct which encodes the edges between `Unit`s. This removes the need to query a resolver for `extern_crate_name` and `public`. - Remove `Resolver` from `BuildContext` to avoid any confusion and to keep the complexity focused in `unit_dependencies`. - Remove `Links` from `Context` since it used the resolver. Adjusted so that instead of checking links at runtime, they are all checked at once in the beginning. Note that it does not check links for the standard lib, but it should be safe? I think `compiler-rt` is the only `links`? I currently went with a strategy of linking the standard library dependencies using `--extern` (instead of `--sysroot` or `-L`). This has some benefits but some significant drawbacks. See below for some questions. ## For future PRs - Add Cargo.toml support. See rust-lang/wg-cargo-std-aware#5 - Source is not downloaded. It assumes you have run `rustup component add rust-src`. See rust-lang/wg-cargo-std-aware#11 - `cargo metadata` does not include any information about std. I don't know how this should work. - `cargo clean` is not std-aware. - `cargo fetch` does not fetch std dependencies. - `cargo vendor` does not vendor std dependencies. - `cargo pkgid` is not std-aware. - `--target` is required on the command-line. This should default to host-as-target. - `-p` is not std aware. - A synthetic `Cargo.toml` workspace is created which has to know about things like `rustc-std-workspace-core`. Perhaps rust-lang/rust should publish the source with this `Cargo.toml` already created? - `compiler_builtins` uses default features (pure Rust implementation, etc.). See rust-lang/wg-cargo-std-aware#15 - `compiler_builtins` may need to be built without debug assertions, see [this](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L210-L214). Could maybe use profile overrides. - Panic issues: - `panic_abort` is not yet supported, though it should probably be easy. It could maybe look at the profile to determine which panic implementation to use? This requires more hard-coding in Cargo to know about rustc implementation details. - [This](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L186-L201) should probably be handled where `panic` is set for `panic_abort` and `compiler_builtins`. I would like to get a test case for it. This can maybe be done with profile overrides? - Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future. See rust-lang/wg-cargo-std-aware#12 - Features are hard-coded. See rust-lang/wg-cargo-std-aware#13 - Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.). - Default `backtrace` is used with C compiler. See rust-lang/wg-cargo-std-aware#16 - Sanitizers are not built. See rust-lang/wg-cargo-std-aware#17 - proc_macro has some hacky code to synthesize its dependencies. See rust-lang/wg-cargo-std-aware#18. This may not be necessary if this uses `--sysroot` instead. - Profile overrides cause weird linker errors. That is: ```toml [profile.dev.overrides.std] opt-level = 2 ``` Using `[profile.dev.overrides."*"]` works. I tried fiddling with it, but couldn't figure it out. We may also want to consider altering the syntax for profile overrides. Having to repeat the same profile for `std` and `core` and `alloc` and everything else would not be ideal. - ~~`Context::unit_deps` does not handle build overrides, see #7215.~~ FIXED ## Questions for this PR - I went with the strategy of using `--extern` to link the standard lib. This seems to work, and I haven't found any problems, but it seems risky. It also forces Cargo to know about certain implicit dependencies like `compiler_builtins` and `panic_*`. The alternative is to create a sysroot and copy all the crates to that directory and pass `--sysroot`. However, this is complicated by pipelining, which would require special support to copy `.rmeta` files when they are generated. Let me know if you think I should use a different strategy. I'm on the fence here, and I think using `--sysroot` may be safer, but adds more complexity. - As an aside, if rustc ever tries to grab a crate from sysroot that was not passed in via `--extern`, then it results in duplicate lang items. For example, saying `extern crate proc_macro;` without specifying `proc_macro` as a dependency. We could prevent rustc from ever trying by passing `--sysroot=/nonexistent` to prevent it from trying. Or add an equivalent flag to rustc. - How should this be tested? I added a janky integration test, but it has some drawbacks. It requires internet access. It is slow. Since it is slow, it reuses the same target directory for multiple tests which makes it awkward to work with. - What interesting things are there to test? - We may want to disable the test before merging if it seems too annoying to make it the default. It requires rust-src to be downloaded, and takes several minutes to run, and are somewhat platform-dependent. - How to test that it is actually linking the correct standard library? I did tests locally with a modified libcore, but I can't think of a good way to do that in the test suite. - I did not add `__CARGO_DEFAULT_LIB_METADATA` to the hash. I had a hard time coming up with a test case where it would matter. - My only thought is that it is a problem because libstd includes a dylib, which prevents the hash from being added to the filename. It does cause recompiles when switching between compilers, for example, when it normally wouldn't. - Very dumb question: Why exactly does libstd include a dylib? This can cause issues (see rust-lang/rust#56443). - This should probably change, but I want to better understand it first. - The `bin_nostd` test needs to link libc on linux, and I'm not sure I understand why. I'm concerned there is something wrong there. libstd does not do that AFAIK.
☀️ Test successful - checks-azure |
Update cargo, books ## cargo 8 commits in 22f7dd0495cd72ce2082d318d5a9b4dccb9c5b8c..fe0e5a48b75da2b405c8ce1ba2674e174ae11d5d 2019-08-27 16:10:51 +0000 to 2019-09-04 00:51:27 +0000 - Rename `--all` to `--workspace` (rust-lang/cargo#7241) - Basic standard library support. (rust-lang/cargo#7216) - Allow using 'config.toml' instead of just 'config' files. (rust-lang/cargo#7295) - Retry on SSL Connect Error. (rust-lang/cargo#7318) - minimal-copy `deserialize` for `InternedString` (rust-lang/cargo#7310) - Fix typo in cargo vendor examples (rust-lang/cargo#7320) - Fixes around multiple `[patch]` per crate (rust-lang/cargo#7303) - Improve error messages on mkdir failure (rust-lang/cargo#7306) ## reference 7 commits in d191a0c..090c015 2019-08-15 08:42:23 +0200 to 2019-09-03 13:59:28 -0700 - Fix rust-lang/reference#664: Review Oxford comma usage. (rust-lang/reference#668) - Fix some links. (rust-lang/reference#667) - Remove trait object warning. (rust-lang/reference#666) - Specify pattern types in `let` statements and `for` expressions (rust-lang/reference#663) - Fix loop expression link. (rust-lang/reference#662) - async-await initial reference material (rust-lang/reference#635) - Correct errors in the reference of extern functions definitions and declarations (rust-lang/reference#652) ## rust-by-example 1 commits in 580839d90aacd537f0293697096fa8355bc4e673..e76be6b2dc84c6a992e186157efe29d625e29b94 2019-08-17 23:17:50 -0300 to 2019-09-03 07:42:26 -0300 - Change link to russian translation repository (rust-lang/rust-by-example#1245) ## embedded-book 1 commits in 432ca26686c11d396eed6a59499f93ce1bf2433c..5ca585c4a7552efb546e7681c3de0712f4ae4fdc 2019-08-09 23:20:22 +0000 to 2019-08-27 13:39:14 +0000 - Fixup book CI (rust-embedded/book#205)
Congrats and thank you!! |
Thanks a lot for implementing this! I don't know if this is the right place to share feedback, so please tell me if there is a more appropriate place! I tried
In summary, I would say that this MVP is already able to replace most uses of |
You should be able to set |
@ehuss Thanks for the suggestion. Setting |
…lexcrichton [-Zbuild-std] Only build libtest when libstd is built Currently `libtest` is always compiled when a compilation unit uses a test harness. This implicitly adds builds the standard library too because `libtest` depends on it. This breaks the use of custom test frameworks in `no_std` crates as reported in #7216 (comment). This pull request fixes the issue by only building `libtest` if `libstd` is built. This makes sense in my opinion because when the user explicitly specified `-Zbuild-std=core`, they probably don't want to build the full standard library and rather get a compilation error when they accidentally use `libtest`.
This is not intended to be useful to anyone. If people want to try it, that's great, but do not rely on this. This is only for experimenting and setting up for future work.
This adds a flag
-Zbuild-std
to build the standard library with a project. The flag can also take optional comma-separated crate names, like-Zbuild-std=core
. Default isstd,core,panic_unwind,compiler_builtins
.Closes rust-lang/wg-cargo-std-aware#10.
Note: I can probably break some of the refactoring into smaller PRs if necessary.
Overview
The general concept here is to use two resolvers, and to combine everything in the Unit graph. There are a number of changes to support this:
unit_dependencies
fromContext
to make it easier to manage.features
toUnit
to keep it unique and to remove the need to query a resolver.UnitDep
struct which encodes the edges betweenUnit
s. This removes the need to query a resolver forextern_crate_name
andpublic
.Resolver
fromBuildContext
to avoid any confusion and to keep the complexity focused inunit_dependencies
.Links
fromContext
since it used the resolver. Adjusted so that instead of checking links at runtime, they are all checked at once in the beginning. Note that it does not check links for the standard lib, but it should be safe? I thinkcompiler-rt
is the onlylinks
?I currently went with a strategy of linking the standard library dependencies using
--extern
(instead of--sysroot
or-L
). This has some benefits but some significant drawbacks. See below for some questions.For future PRs
rustup component add rust-src
. See Downloading source wg-cargo-std-aware#11cargo metadata
does not include any information about std. I don't know how this should work.cargo clean
is not std-aware.cargo fetch
does not fetch std dependencies.cargo vendor
does not vendor std dependencies.cargo pkgid
is not std-aware.--target
is required on the command-line. This should default to host-as-target.-p
is not std aware.Cargo.toml
workspace is created which has to know about things likerustc-std-workspace-core
. Perhaps rust-lang/rust should publish the source with thisCargo.toml
already created?compiler_builtins
uses default features (pure Rust implementation, etc.). See Deps: compiler_builtins wg-cargo-std-aware#15compiler_builtins
may need to be built without debug assertions, see this. Could maybe use profile overrides.panic_abort
is not yet supported, though it should probably be easy. It could maybe look at the profile to determine which panic implementation to use? This requires more hard-coding in Cargo to know about rustc implementation details.panic
is set forpanic_abort
andcompiler_builtins
. I would like to get a test case for it. This can maybe be done with profile overrides?Cargo.lock
and dependency resolution wg-cargo-std-aware#12backtrace
is used with C compiler. See Deps: backtrace wg-cargo-std-aware#16--sysroot
instead.That is:
[profile.dev.overrides."*"]
works. I tried fiddling with it, but couldn't figure it out.We may also want to consider altering the syntax for profile overrides. Having to repeat the same profile for
std
andcore
andalloc
and everything else would not be ideal.FIXEDContext::unit_deps
does not handle build overrides, see Clean up build script stuff and documentation. #7215.Questions for this PR
--extern
to link the standard lib. This seems to work, and I haven't found any problems, but it seems risky. It also forces Cargo to know about certain implicit dependencies likecompiler_builtins
andpanic_*
. The alternative is to create a sysroot and copy all the crates to that directory and pass--sysroot
. However, this is complicated by pipelining, which would require special support to copy.rmeta
files when they are generated. Let me know if you think I should use a different strategy. I'm on the fence here, and I think using--sysroot
may be safer, but adds more complexity.--extern
, then it results in duplicate lang items. For example, sayingextern crate proc_macro;
without specifyingproc_macro
as a dependency. We could prevent rustc from ever trying by passing--sysroot=/nonexistent
to prevent it from trying. Or add an equivalent flag to rustc.__CARGO_DEFAULT_LIB_METADATA
to the hash. I had a hard time coming up with a test case where it would matter.bin_nostd
test needs to link libc on linux, and I'm not sure I understand why. I'm concerned there is something wrong there. libstd does not do that AFAIK.