Rustup #3182

Merged
merged 54 commits into from
Nov 21, 2023

Conversation

RalfJung
Member

No description provided.

workingjubilee and others added 30 commits October 28, 2023 23:24
…signed and unsigned shift amounts in the same branch
new solver normalization improvements

cool beans

At the core of this PR is a `try_normalize_ty` which stops for rigid aliases by using `commit_if_ok`.

Reworks alias-relate to fully normalize both the lhs and rhs and then equate the resulting rigid (or inference) types. This fixes rust-lang/trait-system-refactor-initiative#68 by avoiding the exponential blowup. Also supersedes #116369 by only defining opaque types if the hidden type is rigid.

I removed the stability check in `EvalCtxt::evaluate_goal` due to rust-lang/trait-system-refactor-initiative#75. While I personally have opinions on how to fix it, that still requires further t-types/`@nikomatsakis` buy-in, so I removed that for now. Once we've decided on our approach there, we can revert this commit.

r? `@compiler-errors`
ignore implied bounds with placeholders

given the following code:
```rust
trait Trait {
    type Ty<'a> where Self: 'a;
}

impl<T> Trait for T {
    type Ty<'a> = () where Self: 'a;
}

struct Foo<T: Trait>(T)
where
    for<'x> T::Ty<'x>: Sized;
```

when computing the implied bounds from `Foo<X>` we incorrectly get the bound `X: !x` from the normalization of `for<'x> <X as Trait>::Ty::<'x>: Sized`. This is a known bug! We shouldn't use the constraints that arise from normalization as implied bounds. See #109628.

Ignore these bounds for now. This should prevent later ICEs.

Fixes #112250
Fixes #107409
feat: implement `DoubleEndedSearcher` for `CharArray[Ref]Searcher`

This PR implements `DoubleEndedSearcher` for both `CharArraySearcher` and `CharArrayRefSearcher`. I'm not sure whether this was just overlooked or if there is a reason for it, but since it behaves exactly like `CharSliceSearcher`, I think the implementations should be appropriate.
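
As a rough illustration of what this enables for users: `str` methods such as `trim_matches`, and reverse iteration over `split`, require the pattern's searcher to be double-ended. The helper names `trim_delims` and `split_rev` below are made up for this sketch, which assumes a toolchain that already contains this change.

```rust
/// Hypothetical helper: trims any leading/trailing `-` or `=` characters.
/// `trim_matches` requires the pattern's searcher to be a `DoubleEndedSearcher`.
fn trim_delims(s: &str) -> &str {
    s.trim_matches(['-', '='])
}

/// Hypothetical helper: splits on `,` or `;` and walks the pieces back to front.
/// `Split` is only a `DoubleEndedIterator` when the searcher is double-ended.
fn split_rev(s: &str) -> Vec<&str> {
    s.split([',', ';']).rev().collect()
}

fn main() {
    println!("{}", trim_delims("--==hello==--"));
    println!("{:?}", split_rev("a,b;c"));
}
```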
Remove asmjs

Fulfills [MCP 668](rust-lang/compiler-team#668).

`asmjs-unknown-emscripten` does not work as-specified, and lacks essential upstream support for generating asm.js, so it should not exist at all.
Emit smir

This adds the ability to pass `-Zunpretty=smir` and get smir output of a Rust file. This is obviously pretty basic compared to `mir` output, but I think we could iteratively improve it, and even in this state it is useful for us.

r? `@celinval`
When using existing fn as module, don't claim it doesn't exist

Tweak wording of module not found in resolve, when the name exists but belongs to a non-`mod` item.

Fix #81232.
clarify `fn discriminant` guarantees: only free lifetimes may get erased

cc https://github.com/rust-lang/rust/pull/104299/files#r1397082347

don't think this necessitates a backport by itself, but should imo be included if one were to exist.

r? types
Add stable mir members to triagebot config

I also added the two crates from the project to `[assign.owners]` so that changes to those crates are automatically assigned to a project member.
Rollup of 7 pull requests

Successful merges:

 - #117338 (Remove asmjs)
 - #117549 (Use `copied` instead of manual `map`)
 - #117745 (Emit smir)
 - #117964 (When using existing fn as module, don't claim it doesn't exist)
 - #118006 (clarify `fn discriminant` guarantees: only free lifetimes may get erased)
 - #118016 (Add stable mir members to triagebot config)
 - #118022 (Miri subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
Add T: ?Sized to `RwLockReadGuard` and `RwLockWriteGuard`'s Debug impls.

For context, `MutexGuard` has `+ ?Sized` on its `Debug` impl, and all three have `+ ?Sized` on their `Display` impls.

It looks like the `?Sized` was just missed when the impls were added (the impl for `MutexGuard` was added in the same PR (rust-lang/rust#38006) with support for `T: Debug + ?Sized`, and `RwLock*Guard`s did allow `T: ?Sized` types already); the `Display` impls were added later (rust-lang/rust#42822) with support for `T: Debug + ?Sized` types.

I think this needs a T-libs-api FCP? I'm not sure if this also needs an ACP. If so I can make one.

These are changes to (stable) trait impls on stable types so will be insta-stable.

`@rustbot` label +T-libs-api
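
A minimal sketch of the kind of code this affects, assuming a toolchain that includes the change. The function name `debug_unsized_guard` is invented for illustration; the point is that the guard protects an unsized `[i32]` and is formatted with `{:?}`.

```rust
use std::sync::RwLock;

/// Hypothetical helper: formats a read guard over an unsized `[i32]` with `{:?}`.
fn debug_unsized_guard() -> String {
    let sized = RwLock::new([1, 2, 3]);
    // Unsizing coercion: `&RwLock<[i32; 3]>` to `&RwLock<[i32]>`.
    let unsized_lock: &RwLock<[i32]> = &sized;
    let guard = unsized_lock.read().unwrap();
    // Needs `Debug for RwLockReadGuard<'_, T>` to accept `T: ?Sized`.
    format!("{:?}", guard)
}

fn main() {
    println!("{}", debug_unsized_guard());
}
```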
…into_warnings, r=compiler-errors

Add some additional warnings for duplicated diagnostic items

This commit adds warnings if a user supplies several diagnostic options where we can only apply one of them. We explicitly warn about ignored options here. In addition a small test for these warnings is added.

r? `@compiler-errors`

For now that's the last PR to improve the warnings generated by misused `#[diagnostic::on_unimplemented]` attributes. I'm not sure what needs to be done next to move this closer to stabilization.
Reenable effects in libcore

With #116670, #117531, and #117171, I think we would be comfortable with re-enabling the effects feature for more testing in libcore.

r? `@oli-obk`
cc `@fmease`
cc #110395
…=scottmcm

Expose tests for {f32,f64}.total_cmp in docs

Uncomment the helpful `assert_eq!` line. It is currently stripped out of the rendered docs entirely, which leaves the reader to mentally play through the algorithm, or go to the playground and add a `println!`, to see what the result will be.

(If these tests are known to fail on some platforms, is there some mechanism to conditionalize this or escape the test so the `assert_eq!` source will be visible on the web? I am a newbie, which is why I was reading docs ;)
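
For readers who want to see the ordering without playing through the algorithm, here is a small sketch using `total_cmp` as a sort key. The helper `sorted_total` is a made-up name; the example sticks to signed zeros and ordinary values, since the sign of `f32::NAN` is not something to rely on.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: sorts floats using the IEEE 754 total order.
fn sorted_total(mut v: Vec<f32>) -> Vec<f32> {
    v.sort_by(|a, b| a.total_cmp(b));
    v
}

fn main() {
    // Unlike `partial_cmp`, the total order distinguishes -0.0 from +0.0.
    assert_eq!((-0.0_f32).total_cmp(&0.0), Ordering::Less);
    assert_eq!((-0.0_f32).partial_cmp(&0.0), Some(Ordering::Equal));

    let sorted = sorted_total(vec![1.0, 0.0, -0.0, -1.0]);
    println!("{sorted:?}");
}
```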
When a local binding shadows a fn, point at fn def in call failure

When a local binding shadows a function that is then called, this local binding will cause an E0618 error. We now point not only at the binding definition, but also at the locally defined function of the same name.

```
error[E0618]: expected function, found `&str`
  --> $DIR/issue-22468.rs:3:13
   |
LL |     let foo = "bar";
   |         --- `foo` has type `&str`
LL |     let x = foo("baz");
   |             ^^^-------
   |             |
   |             call expression requires function
...
LL | fn foo(file: &str) -> bool {
   | -------------------------- this function of the same name is available here, but it shadowed by the local binding of the same name
```

Fix #53841
Remove option_payload_ptr; redundant to offset_of

The `option_payload_ptr` intrinsic is no longer required as `offset_of` supports traversing enums (#114208). This PR removes it in order to dogfood offset_of (as suggested at rust-lang/rust#106655 (comment)). However, it will not build until those changes reach beta (which I think is within the next 8 days?) so I've opened it as a draft.
Unify "input" and "no input" paths in `run_compiler`

A follow-up to #117649.

r? `@bjorn3`
…r=Mark-Simulacrum

deprecate `if-available` value of `download-ci-llvm`

This PR deprecates the use of the `if-available` value for `download-ci-llvm` since `if-unchanged` serves the same purpose when no changes are detected. In cases where changes are present, it is assumed that compiling LLVM is acceptable (otherwise, why make changes there?).

This was probably missing in the #110087 issue before.

cc `@RalfJung`
Set `CFG_OMIT_GIT_HASH=1` during builds when `omit-git-hash` is enabled

This environment variable will allow tools like Cargo to disable their own detection when `omit-git-hash` is set to `true`.

I created this PR because of rust-lang/cargo#12968. There is no dependency between the two PRs, so they can land in any order. They just won't do anything until both of them are merged.
Adjust frame IP in backtraces relative to image base for SGX target

This is a follow-up to rust-lang/backtrace-rs#566.

The backtraces printed by `panic!` or generated by `std::backtrace::Backtrace` on the SGX target are not usable. The frame addresses need to be relative to the image base address so they can be used for symbol resolution. Here's an example panic backtrace generated before this change:

```
$ cargo r --target x86_64-fortanix-unknown-sgx
...
stack backtrace:
   0:     0x7f8fe401d3a5 - <unknown>
   1:     0x7f8fe4034780 - <unknown>
   2:     0x7f8fe401c5a3 - <unknown>
   3:     0x7f8fe401d1f5 - <unknown>
   4:     0x7f8fe401e6f6 - <unknown>
```
Here's the same panic after this change:
```
$ cargo +stage1 r --target x86_64-fortanix-unknown-sgx
stack backtrace:
   0:            0x198bf - <unknown>
   1:            0x3d181 - <unknown>
   2:            0x26164 - <unknown>
   3:            0x19705 - <unknown>
   4:            0x1ef36 - <unknown>
```
cc `@jethrogb` and `@workingjubilee`
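
The adjustment itself is just a subtraction. The sketch below is not the code from this change: `relative_ip` is a hypothetical helper and the image base is a made-up value, so the result does not correspond to the "after" backtrace above.

```rust
/// Hypothetical helper: converts an absolute frame address into an offset
/// from the image base. Returns `None` if the address lies below the base.
fn relative_ip(absolute_ip: u64, image_base: u64) -> Option<u64> {
    absolute_ip.checked_sub(image_base)
}

fn main() {
    // Made-up image base, chosen only for illustration.
    let image_base = 0x7f8f_e400_0000_u64;
    // First frame address from the "before" backtrace.
    let frame_ip = 0x7f8f_e401_d3a5_u64;

    match relative_ip(frame_ip, image_base) {
        Some(offset) => println!("{offset:#x}"),
        None => println!("frame address is below the image base"),
    }
}
```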
…avidtwco

Remove legacy bitcode defaults from all Apple specs

Xcode 14 [deprecated bitcode with warnings](https://developer.apple.com/documentation/xcode-release-notes/xcode-14-release-notes#Deprecations) and now [Xcode 15 has dropped it completely](https://developer.apple.com/documentation/xcode-release-notes/xcode-15-release-notes#Deprecations). `rustc` should follow what the platform tooling is doing as well since it just increases binary sizes for no gain at this point.

`cc` made a [similar change last month](rust-lang/cc-rs#812).

Two things show this should have minimal impact:
- Apple has stopped accepting apps built with versions of Xcode (<14) that generate bitcode
- The app store has been stripping bitcode off IPA releases for over 2 years now.

I didn't nuke all the bitcode changes added in rust-lang/rust#71970 since maybe another target in the future could need mandatory bitcode embedding.

Staticlibs built for iOS still link correctly with Xcode 15 against a test app when using a compiler built from this branch.

cc `@thomcc` `@keith`
patterns: don't ice when encountering a raw str slice

Fixes rust-lang/rust#117806
…Gomez

rustdoc-search: optimize unifyFunctionTypes

Final profile output:
https://notriddle.com/rustdoc-html-demo-5/profile-4/index.html

This PR contains three commits that improve performance of this hot inner loop: reducing the number of allocations, adding a fast path for the 1-element basic query case, and rewriting the multi-element query case to use recursion instead of an explicit `backtracking` array. It also adds new test cases that I found while working on this.

r? `@GuillaumeGomez`
doc: add release notes to standalone doc bundle

Preview: http://notriddle.com/rustdoc-html-demo-5/release-notes/releases.html

This is a workaround for #101714 on top of being a useful addition in its own right. It is intended to change the "canonical URL" for viewing the release notes from GitHub, which is relatively slow, to a pre-rendered HTML file that loads from the same CDN as the standard library docs. It also means you get a copy of the release notes when installing the rust-docs with rustup.
Ensure sanity of all computed ABIs

This moves the ABI sanity assertions from the codegen backend to the ABI computation logic. Sadly, due to past mistakes, we [have to](rust-lang/rust#117351 (comment)) be able to compute a sane ABI for nonsensical function types like `extern "C" fn(str) -> str`. So to make the sanity check pass we first need to make all ABI adjustments deal with unsized types... and we have no shared infrastructure for those adjustments, so that's a bunch of copy-paste. At least we have assertions failing loudly when one accidentally sets a different mode for an unsized argument.

To achieve this, this re-lands the parts of rust-lang/rust#80594 that got reverted in rust-lang/rust#81388.  To avoid breaking wasm ABI again, that ABI now explicitly opts-in to the (wrong, broken) ABI that we currently keep for backwards compatibility. That's still better than having *every* ABI use the wrong broken default!

Cc `@bjorn3`
Fixes rust-lang/rust#115845
bors and others added 24 commits November 19, 2023 22:55
…ackh726

Begin to abstract `rustc_type_ir` for rust-analyzer

This adds the "nightly" feature, which is used by the compiler, and falls back to simpler implementations when that feature is not active.

r? `@lcnr` or `@jackh726`
Add arm64e-apple-ios & arm64e-apple-darwin targets

This introduces

*  `arm64e-apple-ios`
*  `arm64e-apple-darwin`

Rust targets to support the `arm64e` architecture on `iOS` and `Darwin`.

This is a first step toward integrating them into the Rust compiler.

## Tier 3 Target Policy

> * A tier 3 target must have a designated developer or developers (the "target
maintainers") on record to be CCed when issues arise regarding the target.
(The mechanism to track and CC such developers may evolve over time.)

I will be the target maintainer.

> * Targets must use naming consistent with any existing targets; for instance, a
target for the same CPU or OS as an existing Rust target should use the same
name for that CPU or OS. Targets should normally use the same names and
naming conventions as used elsewhere in the broader ecosystem beyond Rust
(such as in other toolchains), unless they have a very good reason to
diverge. Changing the name of a target can be highly disruptive, especially
once the target reaches a higher tier, so getting the name right is important
even for a tier 3 target.
Target names should not introduce undue confusion or ambiguity unless
absolutely necessary to maintain ecosystem compatibility. For example, if
the name of the target makes people extremely likely to form incorrect
beliefs about what it targets, the name should be changed or augmented to
disambiguate it.
If possible, use only letters, numbers, dashes and underscores for the name.
Periods (.) are known to cause issues in Cargo.

The target names `arm64e-apple-ios`, `arm64e-apple-darwin` were derived from `aarch64-apple-ios`, `aarch64-apple-darwin`.
In this [ticket](#73628), people discussed the most suitable names for these targets.

> In some cases, the arm64e arch might be "different". For example:
> * `thread_set_state` might fail with (os/kern) protection failure if we try to call it from arm64 process to arm64e process.
> * The returning value of dlsym is PAC signed on arm64e, while left untouched on arm64
> * Some function like pthread_create_from_mach_thread requires a PAC signed function pointer on arm64e, which is not required on arm64.

So, I have chosen them because there are similar triplets in LLVM. I think there are no more suitable names for these targets.

> * Tier 3 targets may have unusual requirements to build or use, but must not
create legal issues or impose onerous legal terms for the Rust project or for
Rust developers or users.
The target must not introduce license incompatibilities.
Anything added to the Rust repository must be under the standard Rust
license (MIT OR Apache-2.0).
The target must not cause the Rust tools or libraries built for any other
host (even when supporting cross-compilation to the target) to depend
on any new dependency less permissive than the Rust licensing policy. This
applies whether the dependency is a Rust crate that would require adding
new license exceptions (as specified by the tidy tool in the
rust-lang/rust repository), or whether the dependency is a native library
or binary. In other words, the introduction of the target must not cause a
user installing or running a version of Rust or the Rust tools to be
subject to any new license requirements.
Compiling, linking, and emitting functional binaries, libraries, or other
code for the target (whether hosted on the target itself or cross-compiling
from another target) must not depend on proprietary (non-FOSS) libraries.
Host tools built for the target itself may depend on the ordinary runtime
libraries supplied by the platform and commonly used by other applications
built for the target, but those libraries must not be required for code
generation for the target; cross-compilation to the target must not require
such libraries at all. For instance, rustc built for the target may
depend on a common proprietary C runtime library or console output library,
but must not depend on a proprietary code generation library or code
optimization library. Rust's license permits such combinations, but the
Rust project has no interest in maintaining such combinations within the
scope of Rust itself, even at tier 3.
"onerous" here is an intentionally subjective term. At a minimum, "onerous"
legal/licensing terms include but are not limited to: non-disclosure
requirements, non-compete requirements, contributor license agreements
(CLAs) or equivalent, "non-commercial"/"research-only"/etc terms,
requirements conditional on the employer or employment of any particular
Rust developers, revocable terms, any requirements that create liability
for the Rust project or its developers or users, or any requirements that
adversely affect the livelihood or prospects of the Rust project or its
developers or users.

No dependencies were added to Rust.

> * Neither this policy nor any decisions made regarding targets shall create any
binding agreement or estoppel by any party. If any member of an approving
Rust team serves as one of the maintainers of a target, or has any legal or
employment requirement (explicit or implicit) that might affect their
decisions regarding a target, they must recuse themselves from any approval
decisions regarding the target's tier status, though they may otherwise
participate in discussions.
>    * This requirement does not prevent part or all of this policy from being
cited in an explicit contract or work agreement (e.g. to implement or
maintain support for a target). This requirement exists to ensure that a
developer or team responsible for reviewing and approving a target does not
face any legal threats or obligations that would prevent them from freely
exercising their judgment in such approval, even if such judgment involves
subjective matters or goes beyond the letter of these requirements.

Understood.
I am not a member of a Rust team.

> * Tier 3 targets should attempt to implement as much of the standard libraries
as possible and appropriate (core for most targets, alloc for targets
that can support dynamic memory allocation, std for targets with an
operating system or equivalent layer of system-provided functionality), but
may leave some code unimplemented (either unavailable or stubbed out as
appropriate), whether because the target makes it impossible to implement or
challenging to implement. The authors of pull requests are not obligated to
avoid calling any portions of the standard library on the basis of a tier 3
target not implementing those portions.

Understood.
`std` is supported.

> * The target must provide documentation for the Rust community explaining how
to build for the target, using cross-compilation if possible. If the target
supports running binaries, or running tests (even if they do not pass), the
documentation must explain how to run such binaries or tests for the target,
using emulation if possible or dedicated hardware if necessary.

Building is described in the derived target doc.

> * Tier 3 targets must not impose burden on the authors of pull requests, or
other developers in the community, to maintain the target. In particular,
do not post comments (automated or manual) on a PR that derail or suggest a
block on the PR based on a tier 3 target. Do not send automated messages or
notifications (via any medium, including via `@`) to a PR author or others
involved with a PR regarding a tier 3 target, unless they have opted into
such messages.
>    * Backlinks such as those generated by the issue/PR tracker when linking to
an issue or PR are not considered a violation of this policy, within
reason. However, such messages (even on a separate repository) must not
generate notifications to anyone involved with a PR who has not requested
such notifications.

Understood.

> * Patches adding or updating tier 3 targets must not break any existing tier 2
or tier 1 target, and must not knowingly break another tier 3 target without
approval of either the compiler team or the maintainers of the other tier 3
target.
>     * In particular, this may come up when working on closely related targets,
such as variations of the same architecture with different features. Avoid
introducing unconditional uses of features that another variation of the
target may not have; use conditional compilation or runtime detection, as
appropriate, to let each target run code supported by that target.

These targets are not fully ABI compatible with arm64e code.

#73628
interpret: simplify handling of shifts by no longer trying to handle signed and unsigned shift amounts in the same branch

While we're at it, also update comments in codegen and MIR building related to shifts, and fix the overflow error printed by Miri on negative shift amounts.
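
To make the idea of separate branches concrete, here is a sketch that is not the interpreter's code. `ShiftAmount` and `normalize_shift` are hypothetical names: each branch decides on its own whether the shift overflows and what the masked shift amount is, assuming `bits` is a nonzero power of two.

```rust
/// Hypothetical stand-in for a shift amount whose type may be signed or unsigned.
enum ShiftAmount {
    Signed(i128),
    Unsigned(u128),
}

/// Returns the masked shift amount and whether the shift overflowed.
/// Assumes `bits` is a nonzero power of two (8, 16, 32, ...).
fn normalize_shift(amount: ShiftAmount, bits: u32) -> (u32, bool) {
    match amount {
        // Signed amounts overflow when negative or too large.
        ShiftAmount::Signed(n) => {
            let overflow = n < 0 || n >= i128::from(bits);
            (n.rem_euclid(i128::from(bits)) as u32, overflow)
        }
        // Unsigned amounts can only overflow by being too large.
        ShiftAmount::Unsigned(n) => {
            let overflow = n >= u128::from(bits);
            ((n % u128::from(bits)) as u32, overflow)
        }
    }
}

fn main() {
    let (shift, overflow) = normalize_shift(ShiftAmount::Unsigned(9), 8);
    // The masked amount agrees with what `wrapping_shl` does for `u8`.
    assert_eq!(1u8.wrapping_shl(9), 1u8 << shift);
    println!("shift = {shift}, overflow = {overflow}");
}
```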
Recover `dyn` and `impl` after `for<...>`

Recover `dyn` and `impl` after `for<...>` in types. Reuses the logic for parsing bare trait objects, so it doesn't fix cases like `for<'a> dyn Trait + dyn Trait` or anything, but that seems somewhat of a different issue.

Parsing recovery logic is a bit involved, but I couldn't find a way to simplify it.

Fixes #117882
if available use a Child's pidfd for kill/wait

This should get us closer to stabilization of pidfds since they now do something useful. And they're `CLOEXEC` now.

```
$ strace -ffe clone,sendmsg,recvmsg,execve,kill,pidfd_open,pidfd_send_signal,waitpid,waitid ./x test std --no-doc -- pidfd

[...]
running 1 tests
strace: Process 816007 attached
[pid 816007] pidfd_open(816006, 0)      = 3
[pid 816007] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0c6b787990) = 816008
strace: Process 816008 attached
[pid 816007] recvmsg(3,  <unfinished ...>
[pid 816008] pidfd_open(816008, 0)      = 3
[pid 816008] sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="", iov_len=0}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[3]}], msg_controllen=24, msg_flags=0}, 0) = 0
[pid 816007] <... recvmsg resumed>{msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="", iov_len=0}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[4]}], msg_controllen=24, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 0
[pid 816008] execve("/usr/bin/false", ["false"], 0x7ffcf2100048 /* 105 vars */) = 0
[pid 816007] waitid(P_PIDFD, 4,  <unfinished ...>
[pid 816008] +++ exited with 1 +++
[pid 816007] <... waitid resumed>{si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=816008, si_uid=1001, si_status=1, si_utime=0, si_stime=0}, WEXITED, NULL) = 0
[pid 816007] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=816008, si_uid=1001, si_status=1, si_utime=0, si_stime=0} ---
[pid 816007] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 816009 attached
, child_tidptr=0x7f0c6b787990) = 816009
[pid 816007] recvmsg(3,  <unfinished ...>
[pid 816009] pidfd_open(816009, 0)      = 3
[pid 816009] sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="", iov_len=0}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[3]}], msg_controllen=24, msg_flags=0}, 0) = 0
[pid 816007] <... recvmsg resumed>{msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="", iov_len=0}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[5]}], msg_controllen=24, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 0
[pid 816009] execve("/usr/bin/sleep", ["sleep", "1000"], 0x7ffcf2100048 /* 105 vars */) = 0
[pid 816007] waitid(P_PIDFD, 5, {}, WNOHANG|WEXITED, NULL) = 0
[pid 816007] pidfd_send_signal(5, SIGKILL, NULL, 0) = 0
[pid 816007] waitid(P_PIDFD, 5,  <unfinished ...>
[pid 816009] +++ killed by SIGKILL +++
[pid 816007] <... waitid resumed>{si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=816009, si_uid=1001, si_status=SIGKILL, si_utime=0, si_stime=0}, WEXITED, NULL) = 0
[pid 816007] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=816009, si_uid=1001, si_status=SIGKILL, si_utime=0, si_stime=0} ---
[pid 816007] +++ exited with 0 +++
```
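
From the user's side nothing changes in the API; the trace above is what `Child::kill` and `Child::wait` may do under the hood. A minimal sketch of that usage follows, assuming a Unix-like system with a `sleep` binary on `PATH`. The helper name `spawn_kill_wait` is made up.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Hypothetical helper: spawns a long-running child, kills it, then reaps it.
fn spawn_kill_wait() -> io::Result<ExitStatus> {
    // Assumes a `sleep` binary is available on PATH.
    let mut child = Command::new("sleep").arg("1000").spawn()?;
    child.kill()?;
    child.wait()
}

fn main() -> io::Result<()> {
    let status = spawn_kill_wait()?;
    // A killed child does not report success.
    println!("success: {}", status.success());
    Ok(())
}
```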
Handle attempts to have multiple `cfg`d tail expressions

When encountering code that seems like it might be trying to have multiple tail expressions depending on `cfg` information, suggest alternatives that will parse successfully.

```rust
fn foo() -> String {
    #[cfg(feature = "validation")]
    [1, 2, 3].iter().map(|c| c.to_string()).collect::<String>()
    #[cfg(not(feature = "validation"))]
    String::new()
}
```

```
error: expected `;`, found `#`
  --> $DIR/multiple-tail-expr-behind-cfg.rs:5:64
   |
LL |     #[cfg(feature = "validation")]
   |     ------------------------------ only `;` terminated statements or tail expressions are allowed after this attribute
LL |     [1, 2, 3].iter().map(|c| c.to_string()).collect::<String>()
   |                                                                ^ expected `;` here
LL |     #[cfg(not(feature = "validation"))]
   |     - unexpected token
   |
help: add `;` here
   |
LL |     [1, 2, 3].iter().map(|c| c.to_string()).collect::<String>();
   |                                                                +
help: alternatively, consider surrounding the expression with a block
   |
LL |     { [1, 2, 3].iter().map(|c| c.to_string()).collect::<String>() }
   |     +                                                             +
help: it seems like you are trying to provide different expressions depending on `cfg`, consider using `if cfg!(..)`
   |
LL ~     if cfg!(feature = "validation") {
LL ~         [1, 2, 3].iter().map(|c| c.to_string()).collect::<String>()
LL ~     } else if cfg!(not(feature = "validation")) {
LL ~         String::new()
LL +     }
   |
```

Fix #106020.

r? `@oli-obk`
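
For reference, the last suggestion written out as a compilable function. This is only a sketch: `validation` is the feature name from the example above, not a real feature, and a plain `else` is used instead of the second `cfg!` test. Note that, unlike `#[cfg]`, `cfg!` keeps both branches in the program, so both must type-check.

```rust
// `validation` is the illustrative feature name from the example above.
#[allow(unexpected_cfgs)]
fn foo() -> String {
    if cfg!(feature = "validation") {
        [1, 2, 3].iter().map(|c| c.to_string()).collect::<String>()
    } else {
        String::new()
    }
}

fn main() {
    // With the feature disabled this prints an empty string in quotes.
    println!("{:?}", foo());
}
```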
…ence, r=lcnr

Ignore but do not assume region obligations from unifying headers in negative coherence

Partly addresses a FIXME that was added in #112875. Just as we can throw away the nested trait/projection obligations from unifying two impl headers, we can also just throw away the region obligations too.

I removed part of the FIXME that was incorrect, namely:
> Given that the only region constraints we get are involving inference regions in the root, it shouldn't matter, but still sus.

This is not true when unifying `fn(A)` and `for<'b> fn(&'b B)` which ends up with placeholder region outlives from non-root universes. I'm pretty sure this is okay, though it would be nice if we were to use them as assumptions. See the `explicit` revision of the test I committed, which still fails.

Fixes #117986

r? lcnr, feel free to reassign tho.
…aliemjay

Make regionck care about placeholders in outlives components

Currently, we don't consider a placeholder type `!T` to be a type component when it comes to processing type-outlives obligations. This means that they are essentially treated like unit values with no sub-components, and always outlive any region. This is problematic for `non_lifetime_binders`, and even more problematic for `with_negative_coherence`, since negative coherence uses placeholders as universals.

This PR adds `Component::Placeholder` which acts much like `Component::Param`. This currently causes a regression in some non-lifetime-binders tests because `for<T> T: 'static` doesn't imply itself when processing outlives obligations, so code like this will fail:

```
fn foo() where for<T> T: 'static {
  foo() //~ fails
}
```

This is because the where clause doesn't imply itself. Fixing it requires making the `MatchAgainstHigherRankedOutlives` relation smarter when it comes to binders.

r? types
Rollup of 8 pull requests

Successful merges:

 - #117828 (Avoid iterating over hashmaps in astconv)
 - #117832 (interpret: simplify handling of shifts by no longer trying to handle signed and unsigned shift amounts in the same branch)
 - #117891 (Recover `dyn` and `impl` after `for<...>`)
 - #117957 (if available use a Child's pidfd for kill/wait)
 - #117988 (Handle attempts to have multiple `cfg`d tail expressions)
 - #117994 (Ignore but do not assume region obligations from unifying headers in negative coherence)
 - #118000 (Make regionck care about placeholders in outlives components)
 - #118068 (subtree update cg_gcc 2023/11/17)

r? `@ghost`
`@rustbot` modify labels: rollup
Add `$message_type` field to distinguish json diagnostic outputs

Currently the json-formatted outputs have no way to unambiguously determine which kind of message is being output. A consumer can look for specific fields in the json object (eg "message"), but there's no guarantee that in future some other kind of output will have a field of the same name.

This PR adds a `"$message_type"` field to all json outputs, which can be used to unambiguously determine which kind of output it is. The mapping is:

`diagnostic`: regular compiler diagnostics
`artifact`: artifact notifications
`future_incompat`: Future incompatibility report
`unused_extern`: Unused crate warnings/errors

This matches the "internally tagged" representation for serde enums.
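
As a rough sketch of how a consumer could dispatch on the tag using only the standard library: the function `message_type` below is a hypothetical, deliberately naive helper that assumes compact JSON with no whitespace around the colon. A real consumer would use a JSON parser instead.

```rust
/// Hypothetical, naive helper: pulls the `$message_type` tag out of one line
/// of compact JSON. Assumes no whitespace between the key, colon, and value.
fn message_type(line: &str) -> Option<&str> {
    const KEY: &str = "\"$message_type\":\"";
    let start = line.find(KEY)? + KEY.len();
    let len = line[start..].find('"')?;
    Some(&line[start..start + len])
}

fn main() {
    // Made-up sample lines, shaped like the outputs described above.
    let lines = [
        r#"{"$message_type":"diagnostic","message":"unused variable"}"#,
        r#"{"$message_type":"artifact","artifact":"libfoo.rlib"}"#,
    ];
    for line in lines {
        match message_type(line) {
            Some("diagnostic") => println!("compiler diagnostic"),
            Some("artifact") => println!("artifact notification"),
            Some(other) => println!("other message kind: {other}"),
            None => println!("untagged line"),
        }
    }
}
```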
Add `Duration::abs_diff`

This adds a `Duration::abs_diff` method analogous to the existing one on the primitive integers.

ACP: rust-lang/libs-team#291
Tracking Issue: rust-lang/rust#117618
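
The semantics can be sketched with a free function that runs on any toolchain. This is an illustrative equivalent under the name `abs_diff`, not the implementation added to the standard library.

```rust
use std::time::Duration;

/// Illustrative equivalent of an absolute difference between two durations.
/// Subtracting the smaller from the larger avoids the panic on underflow.
fn abs_diff(a: Duration, b: Duration) -> Duration {
    if a > b { a - b } else { b - a }
}

fn main() {
    let a = Duration::from_secs(5);
    let b = Duration::from_millis(7500);
    // The result is the same regardless of argument order.
    println!("{:?}", abs_diff(a, b));
    println!("{:?}", abs_diff(b, a));
}
```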
…, r=GuillaumeGomez

rustdoc-search: add support for traits and associated types

# Summary

Trait associated type queries work in rustdoc's type driven search. The data is included in the search-index.js file, and the queries are designed to "do what I mean" when users type them in, so, for example, `Iterator<Item=T> -> Option<T>` includes `Iterator::next` in the SERP[^SERP], and `Iterator<T> -> Option<T>` also includes `Iterator::next` in the SERP.

[^SERP]: search engine results page

## Sample searches

* [`iterator<Item=T>, fnmut -> T`][iterreduce]
* [`iterator<T>, fnmut -> T`][iterreduceterse]

[iterreduce]: http://notriddle.com/rustdoc-html-demo-5/associated-types/std/index.html?search=iterator%3CItem%3DT%3E%2C%20fnmut%20-%3E%20T&filter-crate=std
[iterreduceterse]: http://notriddle.com/rustdoc-html-demo-5/associated-types/std/index.html?search=iterator%3CT%3E%2C%20fnmut%20-%3E%20T&filter-crate=std

# Motivation

My primary motivation for working on search.js at all is to make it easier to use highly generic APIs, like the Iterator API. The type signature describes these functions pretty well, while the names are almost arbitrary.

Before this PR, type bindings were not consistently included in search-index.js at all (you couldn't find Iterator::next by typing in its function signature) and you couldn't explicitly search for them. This PR fixes both of these problems.

# Guide-level explanation

*Excerpt from [the Rustdoc book](http://notriddle.com/rustdoc-html-demo-5/associated-types/rustdoc/read-documentation/search.html), included in this PR.*

> Function signature searches can query generics, wrapped in angle brackets, and traits will be normalized like types in the search engine if no type parameters match them. For example, a function with the signature `fn my_function<I: Iterator<Item=u32>>(input: I) -> usize` can be matched with the following queries:
>
> * `Iterator<Item=u32> -> usize`
> * `Iterator<u32> -> usize` (you can leave out the `Item=` part)
> * `Iterator -> usize` (you can leave out iterator's generic entirely)
> * `T -> usize` (you can match with a generic parameter)
>
> Each of the above queries is progressively looser, except the last one would not match `dyn Iterator`, since that's not a type parameter.

# Reference-level explanation

Inside the angle brackets, you can choose whether to write a name before the parameter and the equal sign. This syntax is called [`GenericArgsBinding`](https://doc.rust-lang.org/reference/paths.html#paths-in-expressions) in the Rust Reference, and it allows you to constrain a trait's associated type.

As a convenience, you don't actually have to put the name in (Rust requires it, but Rustdoc Search doesn't). This works about the same way unboxing already works in Search: the terse `Iterator<u32>` is a match for `Iterator<Item=u32>`, but the opposite is not true, just like `u32` is a match for `Iterator<u32>`.

When converting a trait method for the search index, the trait is substituted for `Self`, and all associated types are bound to generics. This way, if you have the following trait definition:

```rust
pub trait MyTrait {
    type Output;
    fn method(self) -> Self::Output;
}
```

The following queries will match its method:

  * `MyTrait<Output=T> -> T`
  * `MyTrait<T> -> T`
  * `MyTrait -> T`

But these queries will not match it:

  * <i>`MyTrait<Output=u32> -> u32`</i>
  * <i>`MyTrait<Output> -> Output`</i>
  * <i>`MyTrait -> MyTrait::Output`</i>
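
One way to picture the lowering is as a free function in which `Self` has been replaced by the trait and the associated type has become an ordinary generic parameter. The sketch below only illustrates that shape: `method_as_indexed` and `Meters` are hypothetical names, and the real index stores integer IDs rather than Rust items.

```rust
pub trait MyTrait {
    type Output;
    fn method(self) -> Self::Output;
}

/// Roughly the shape the search index sees for `MyTrait::method`:
/// the receiver is "something implementing `MyTrait<Output = T>`",
/// and the return type is the generic `T`, not a concrete type.
pub fn method_as_indexed<S, T>(receiver: S) -> T
where
    S: MyTrait<Output = T>,
{
    receiver.method()
}

/// Hypothetical implementor, used only to exercise the sketch.
pub struct Meters(pub u32);

impl MyTrait for Meters {
    type Output = u32;

    fn method(self) -> u32 {
        self.0
    }
}
```

Because `Output` is bound to a generic `T` in this shape, a query that names a concrete type, such as `MyTrait<Output=u32> -> u32`, has nothing concrete to match against. That is consistent with the non-matching queries listed above.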

# Drawbacks

It's a little bit bigger:

```console
$ du before/search-index1.74.0.js after/search-index1.74.0.js
4020    before/search-index1.74.0.js
4068    after/search-index1.74.0.js
```

# Rationale and alternatives

I don't want to just not do this. On its own, it's not terribly useful, but in addition to searching by normal traits, this is also intended as a desugaring target for closures. That's why it needs to actually distinguish the two: it allows the future desugaring to distinguish function output and input.

The other alternative would be to not allow users to leave out the name, so `iterator<u32>` doesn't work. That would be unfortunate, because mixing up which ones have out params and which ones are plain generics is an easy enough mistake that the Rust compiler itself helps people out with it.

# Prior art

  * <http://neilmitchell.blogspot.com/2020/06/hoogle-searching-overview.html>

    The current Rustdoc algorithm, both before this PR and after it, has a fairly expensive matching algorithm over a fairly simple file format. Luckily, we aren't trying to scale to all of crates.io, so it's usable, but it's not great when I throw it at docs.servo.org.

# Unresolved questions

Okay, but *how do we want to handle closures?* I know the system will desugar `FnOnce(T) -> U` into `trait:FnOnce<Output=U, primitive:tuple<T>>`, but what if I don't know what trait I'm looking for? This PR can merge with nothing, but it'd be nice to have a plan.

Specifically, what should the special form look like that handles all varieties of basic callable? `primitive:fn` (function pointers), `trait:Fn`, `trait:FnOnce`, and `trait:FnMut` should all be searchable using a single syntax, because I'm always forgetting which one is used in the function I'm looking for.

The essential question is how closely we want to copy Rust's own syntax. The tersest way to express `Option::map` might be:

    Option<T>, (T -> U) -> Option<U>

That's the approach I would prefer, but nobody's going to attempt it without being told, so maybe this would be better?

    Option<T>, (fn(T) -> U) -> Option<U>

It does require double parens, but at least it's mostly unambiguous. Unfortunately, it looks like the syntax you'd use for function pointers, implying that if you specifically wanted to limit your search to function pointers, you'd need to use `primitive:fn(T) -> U`. Then again, searching is normally case-insensitive, so you'd want that anyway to disambiguate from `trait:Fn(T) -> U`.

# Future possibilities

## This thing really needs a ranking algorithm

That is, this PR increases the number of matches for some type-based queries. They're usually pretty good matches, but there's still more of them, and it's evident that if you have two functions, `foo(MyTrait<u8>)` and `bar(MyTrait<Item=u8>)`, if the user typed `MyTrait<u8>` then `foo` should show up first.

A design choice that these PRs have followed is that adding more stuff to the search query always reduces the number of functions that get matched. The advantage of doing it that way is that you can rank them by just counting how many atoms are in the function's signature (lowest score goes on top). Since it's impossible for a matching function to have fewer atoms than the search query, if there's a function with exactly the same set of atoms in it, then that'll be on top.
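
A minimal sketch of that ranking idea follows, assuming a simplified model where a signature is a tree of named atoms. `Atom`, `FnSig`, and `rank_by_atom_count` are hypothetical names for illustration and are not the data structures search.js actually uses.

```rust
/// A simplified type atom: a name plus nested generic arguments.
#[derive(Debug, Clone)]
pub struct Atom {
    pub name: &'static str,
    pub generics: Vec<Atom>,
}

impl Atom {
    pub fn new(name: &'static str, generics: Vec<Atom>) -> Self {
        Atom { name, generics }
    }

    /// Count this atom plus everything nested inside it.
    pub fn count(&self) -> usize {
        1 + self.generics.iter().map(Atom::count).sum::<usize>()
    }
}

/// A function signature reduced to its input and output atoms.
#[derive(Debug, Clone)]
pub struct FnSig {
    pub name: &'static str,
    pub inputs: Vec<Atom>,
    pub output: Vec<Atom>,
}

impl FnSig {
    /// Total number of atoms in the signature; lower is ranked higher.
    pub fn atom_count(&self) -> usize {
        self.inputs
            .iter()
            .chain(self.output.iter())
            .map(Atom::count)
            .sum()
    }
}

/// Sort functions that already matched the query so that the smallest
/// signature comes first. The sort is stable, so ties keep their order.
pub fn rank_by_atom_count(mut matches: Vec<FnSig>) -> Vec<FnSig> {
    matches.sort_by_key(FnSig::atom_count);
    matches
}
```

Since a matching function can never have fewer atoms than the query, a signature with exactly the query's atoms gets the lowest possible score under this scheme and sorts to the top.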

More complicated ranking algos tend to penalize long documents anyway, if the [distance metrics](https://www.benfrederickson.com/distance-metrics/?utm_source=flipboard&utm_content=other) I found through [Flipboard](https://flipboard.com/@arnie0426/building-recommender-systems-nvue3iqtgrn10t45) (and postgresql's `ts_rank_cd`) are anything to go by. Real-world data sets tend to have weird outliers, like they have God Functions with zillions of arguments of all sorts of types, and Rustdoc ought to put such a function at the bottom.

The other natural choice would be interleaving with `unifyFunctionTypes` to count the number of unboxings and reorderings. This would compute a distance function, and would do a fine job of ranking the results, as [described here](https://ndmitchell.com/downloads/slides-hoogle_finding_functions_from_types-16_may_2011.pdf) by the Hoogle dev, but is more complicated than it sounds. The current algorithm returns when it finds a result that *exists at all*, but a distance function should find an *optimal solution* to find the smallest sequence of edits.

## This could also use a benchmark suite and some optimization

This approach also lends itself to layering a bloom filter in front of the backtracking unification engine.

* At load time, hash the typeNameIdMap ID for each atom and set the matching entry in a fixed-size byte array for each function to 1. Call it `fnType.bloomFilter`
* At search time, do the same for the atoms in the query (excluding special forms like `[]` that can match more than one thing). Call it `parsedQuery.bloomFilter`
* For each function, `if ((fnType.bloomFilter | ~parsedQuery.bloomFilter) !== ~0) { return false; }` (the outer parentheses matter, because `!==` binds tighter than `|` in JavaScript)
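
The three steps above can be sketched as follows. This is an illustration only, written in Rust rather than the JavaScript of search.js, and it assumes a single 64-bit word as the filter and a trivial `id % 64` hash. `bit_for`, `build_filter`, and `may_match` are hypothetical names.

```rust
/// Map an atom ID to one bit of a 64-bit filter.
/// Assumption: a trivial modulo hash; a real filter would want a
/// better-distributed hash and probably more bits.
fn bit_for(atom_id: u32) -> u64 {
    1u64 << (atom_id % 64)
}

/// Build a filter by setting one bit per atom.
/// Used both at load time (per function) and at search time (per query).
pub fn build_filter(atom_ids: &[u32]) -> u64 {
    atom_ids.iter().fold(0u64, |acc, &id| acc | bit_for(id))
}

/// Returns `false` when the function definitely cannot match, and `true`
/// when it might (the backtracking unifier still has to confirm).
///
/// The check passes only if every bit set in the query filter is also
/// set in the function filter.
pub fn may_match(fn_filter: u64, query_filter: u64) -> bool {
    (fn_filter | !query_filter) == !0u64
}
```

This prefilter is only sound while every atom in the query has to appear somewhere in a matching function, which is the algebraic property mentioned in the next paragraph. Hash collisions produce false positives, never false negatives.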

There's also room to optimize the unification engine itself, by using stacks and persistent data structures instead of copying arrays around, or by using hashing instead of linear scans (the current algorithm was rewritten from one that tried to do that, but was too much to fit in my head and had a bunch of bugs). The advantage of Just Backtracking Better over the bloom filter is that it doesn't require the engine to retain any special algebraic properties.

But, first, we need a set of benchmarks to be able to judge if such a thing will actually help.

## Referring to associated types by path

*I don't want to implement this one, but if I did, this is how I'd do it.*

In Rust, this is represented by a structure called a qualified path, or QPath. They look like this:

    <Self as Iterator>::Item
    <F as FnOnce>::Output

They can also, if it's unambiguous, use a plain path and just let the system figure it out:

    Self::Item
    F::Output
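
For reference, both spellings are interchangeable in ordinary Rust when there is no ambiguity. The two functions below are made-up examples (`first_qualified` and `first_short` are not standard library items) that differ only in how they name the associated type.

```rust
/// Fully qualified form: the trait is named explicitly.
pub fn first_qualified<I: Iterator>(mut iter: I) -> Option<<I as Iterator>::Item> {
    iter.next()
}

/// Shorthand form: `I::Item` is unambiguous here because only one
/// bound on `I` defines an associated type called `Item`.
pub fn first_short<I: Iterator>(mut iter: I) -> Option<I::Item> {
    iter.next()
}
```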

In Rustdoc Type-Driven Search, we don't want to force people to be unambiguous. Instead, we should try *all reasonable interpretations*, return results whenever any of them match, and let users make their query more specific if too many results are matches.

To enable associated type path searches in Rustdoc, we need to:

1. When lowering a trait method to a search-index.js function signature, Self should be explicitly represented as a generic argument. It should always be assigned `-1`, so that if the user uses `Self` in their search query, we can ensure it always matches the real Self and not something else. Any functions that don't *have* a Self should drop a `0` into the first position of the where clause, to express that there isn't one and reserve the `-1` position.
   * Reminder: generics are negative, concrete types are positive, and zero is a reserved sentinel.
   * Right now, `Iterator::next` is lowered as if it were `fn next<T>(self: Iterator<Item=T>) -> Option<T>`.
     It should become `fn next<Self, T>(self: Self) -> Option<T> where Self: Iterator<Item=T>` instead.
3. Add another backtracking edge to the unification engine, so that when the user writes something like `some::thing`, the interpretation where `some` is a module and `thing` is a standalone item becomes one possible match candidate, while the interpretation where `some` is a trait and `thing` is an associated type is a separate match candidate. The backtracking engine is basically powerful enough to do this already, since unboxing generic type parameters into their traits already requires the ability to do this kind of thing.
   * When interpreting `some::thing` where `some` is a trait and `thing` is an associated type, it should be treated equivalently to `<self as some>::thing`. If you want to bind it to some generic parameter other than `Self`, you need to explicitly say so.
   * If no trait called `some` actually exists, treat it as a generic type parameter instead. Track every trait mentioned in the current working function signature, and add a match candidate for each one.
   * A user that explicitly wants the trait-associated-type interpretation could write a qpath (like `<self as trait>::type`), and a user that explicitly wants the module-item interpretation should use an item type filter (like `struct:module::type`).
4. To actually do the matching, maintain a `Map<(QueryGenericParamId, TraitId), FnGenericParamId>` alongside the existing `Map<QueryGenericParamId, FnGenericParamId>` that is already used to handle plain generic parameters. This works, because, when a trait function signature is lowered to search-index.js, the `rustdoc` backend always generates an FnGenericParamId for every trait associated type it sees mentioned in the function's signature.
5. Parse QPaths. Specifically,
   * QueryElem adds three new fields. `isQPath` is a boolean flag, and `traitNameId` contains an entry for `typeNameIdMap` corresponding to the trait part of a qpath, and `parentId` may contain either a concrete type ID or a negative number referring to a generic type parameter. The actual `id` of the query elem will always be a negative number, because this is essentially a funny way to add a generic type constraint.
   * If it's a QPath, then both of those IDs get filled in with the respective parts of the map. The unification engine will check the where clause to ensure the trait actually applies to the generic parameter in question, will check the type parameter constraint, and will add a mapping to `mgens` recording this as a solution.
   * If it's just a regular path, then `isQPath` is false, and the parser will fill in both `traitNameId` and `parentId` based on the same path. The unification engine, seeing isQPath is false and that these IDs were filled in, will try all three solutions: the path might be part of a concrete type name, or it might be referring to a trait, or it might be referring to a generic type parameter.
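
The bookkeeping in step 4 could look roughly like the sketch below. It is written in Rust for illustration, although the real engine is JavaScript, and every name in it (`Bindings`, `bind_plain`, `bind_assoc`, and the ID aliases) is hypothetical. It assumes that a conflicting binding simply reports failure so the caller can backtrack.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the integer IDs used in the search index.
// Generics are negative and concrete types are positive, as noted above.
type QueryGenericParamId = i32;
type FnGenericParamId = i32;
type TraitId = i32;

/// Bindings accumulated while unifying one query against one function.
#[derive(Default)]
pub struct Bindings {
    /// The existing map for plain generics: query `T` -> function generic.
    plain: HashMap<QueryGenericParamId, FnGenericParamId>,
    /// The proposed map for associated types reached through a trait,
    /// keyed by (query generic, trait).
    assoc: HashMap<(QueryGenericParamId, TraitId), FnGenericParamId>,
}

impl Bindings {
    /// Bind a plain query generic, or check it against an earlier binding.
    /// Returns `false` on conflict, which is the caller's cue to backtrack.
    pub fn bind_plain(&mut self, query: QueryGenericParamId, func: FnGenericParamId) -> bool {
        *self.plain.entry(query).or_insert(func) == func
    }

    /// Same as `bind_plain`, but for an associated type on `trait_id`.
    pub fn bind_assoc(
        &mut self,
        query: QueryGenericParamId,
        trait_id: TraitId,
        func: FnGenericParamId,
    ) -> bool {
        *self.assoc.entry((query, trait_id)).or_insert(func) == func
    }
}
```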

### Why not implement QPath searches?

I'm not sure if anybody really wants to write such complicated queries. You can do a pretty good job of describing the generic functions in the standard library without resorting to FQPs.

These two queries, for example, would both match the Iterator::map function if we added support for higher order function queries and a rule that allows a type to match its *notable traits*.

    // I like this version, because it's identical to how `Option::map` would be written.
    // There's a reason why Iterator::map and Option::map have the same name.
    Iterator<T>, (T -> U) -> Iterator<U>

    // This version explicitly uses the type parameter constraints.
    Iterator<Item=T>, (T -> U) -> Iterator<Item=U>

If I try to write this one using FQP, however, the results seem worse:

    // This one is less expressive than the versions that don't use associated type paths.
    // It matches `Iterator::filter`, while the above two example queries don't.
    Iterator, (Iterator::Item -> Iterator::Item) -> Iterator

    // This doesn't work, because the return type of `Iterator::map` is not a generic
    // parameter with an `Iterator` trait bound. It's a concrete type that
    // implements `Iterator`. Return-Position-Impl-Trait is the same way.
    //
    // There's a difference between something like `map`, whose return value
    // implements Iterator, and something like `collect`, where the caller
    // gets to decide what the concrete type is going to be.
    //Self, (Self::Item -> I::Item) -> I where Self: Iterator, I: Iterator

    // This works, but it seems subjectively ugly, complex, and counterintuitive to me.
    Self, (<Self as Iterator>::Item -> T) -> Iterator<Item=T>
Remove `--check-cfg` checking of command line `--cfg` args

Back in rust-lang/rust#100574 we added to the `unexpected_cfgs` lint the checking of `--cfg` CLI arguments and emitted unexpected names and values for them.

The implementation works as expected, but its usability is debatable, particularly in combination with Cargo and `RUSTFLAGS`: people who set `RUSTFLAGS=--cfg=tokio_unstable` (or whatever) get `unexpected_cfgs` warnings on all of their crates. ~~To fix this issue this PR proposes that we split the CLI argument checking into its own separate allow-by-default lint: `unexpected_cli_cfgs`.~~

~~This has the advantage of letting people who want CLI warnings have them (although not by default anymore), while still linting on every unexpected cfg name and values in the code.~~

After some discussion with the Cargo team ([Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/check-cfg.20and.20RUSTFLAGS.20interaction)) and members of the compiler team (see below), I propose that we follow the suggestion from `@epage`: never check `--cfg` arguments, while reserving the possibility of doing so later.

We would still lint on unexpected cfgs found in the source code no matter which `--cfg` args are passed. This means reverting rust-lang/rust#100574 but NOT rust-lang/rust#99519.

r? `@petrochenkov`
Expand Miri's BorTag GC to a Provenance GC

As suggested in rust-lang#3080 (comment)

We previously solved memory growth issues associated with the Stacked Borrows and Tree Borrows runtimes with a GC. But of course we also have state accumulation associated with whole allocations elsewhere in the interpreter, and this PR starts tackling those.

To do this, we expand the visitor for the GC so that it can visit a BorTag or an AllocId. Instead of collecting all live AllocIds into a single HashSet, we just collect from the Machine itself then go through an accessor `InterpCx::is_alloc_live` which checks a number of allocation data structures in the core interpreter. This avoids the overhead of all the inserts that collecting their keys would require.
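
As a rough illustration of that design, here is a heavily simplified sketch. It is not Miri's code: the integer `AllocId` and `BorTag` aliases, the `Pointer` and `Interp` types, and the `base_addr` side table are all stand-ins assumed for the example. Only the overall shape (a visitor that reports either kind of provenance, plus an `is_alloc_live` accessor) follows the description above.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins; Miri's real types are richer than plain integers.
type AllocId = u64;
type BorTag = u64;

/// Anything that may hold provenance reports it to the visitor callback.
pub trait VisitProvenance {
    fn visit_provenance(&self, visit: &mut dyn FnMut(Option<AllocId>, Option<BorTag>));
}

pub struct Pointer {
    pub alloc_id: Option<AllocId>,
    pub tag: Option<BorTag>,
}

impl VisitProvenance for Pointer {
    fn visit_provenance(&self, visit: &mut dyn FnMut(Option<AllocId>, Option<BorTag>)) {
        visit(self.alloc_id, self.tag);
    }
}

pub struct Interp {
    /// Allocations the core interpreter still tracks.
    pub memory: HashSet<AllocId>,
    /// Machine-side state that can mention allocations (the roots here).
    pub machine_pointers: Vec<Pointer>,
    /// A per-allocation side table that would otherwise only grow.
    pub base_addr: HashMap<AllocId, u64>,
}

impl Interp {
    /// Accessor that consults the interpreter's own data structures,
    /// instead of copying all of their keys into one big set.
    pub fn is_alloc_live(&self, id: AllocId) -> bool {
        self.memory.contains(&id)
    }

    /// Drop side-table entries for allocations that are neither
    /// mentioned by machine state nor live in the interpreter.
    pub fn run_provenance_gc(&mut self) {
        let mut reachable: HashSet<AllocId> = HashSet::new();
        for ptr in &self.machine_pointers {
            ptr.visit_provenance(&mut |id: Option<AllocId>, _tag: Option<BorTag>| {
                if let Some(id) = id {
                    reachable.insert(id);
                }
            });
        }
        let dead: Vec<AllocId> = self
            .base_addr
            .keys()
            .copied()
            .filter(|id| !reachable.contains(id) && !self.is_alloc_live(*id))
            .collect();
        for id in dead {
            self.base_addr.remove(&id);
        }
    }
}
```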

r? `@RalfJung`
Fix early param lifetimes in generic_const_exprs

In cases like the one below, we are never actually able to capture the region name, for two reasons: first, `'static` becomes an anonymous lifetime, and second, we never capture a region if it doesn't have a name. This results in an ICE.
```
struct DataWrapper<'static> {
    data: &'a [u8; Self::SIZE],
}

impl DataWrapper<'a> {
```

Fixes rust-lang/rust#118021
…lbertlarsan68

Remove i686-apple-darwin cross-testing

The Xcode SDK no longer ships with 32-bit Intel (i686-apple-darwin) support as of [Xcode 14](https://developer.apple.com/news/upcoming-requirements/?id=06062022a) (related, #112753).  On an up-to-date Intel Mac, `x.py test --bless` fails.

r? `@rust-lang/bootstrap`
Remove now deprecated target x86_64-sun-solaris.
Rollup of 6 pull requests

Successful merges:

 - #116085 (rustdoc-search: add support for traits and associated types)
 - #117522 (Remove `--check-cfg` checking of command line `--cfg` args)
 - #118029 (Expand Miri's BorTag GC to a Provenance GC)
 - #118035 (Fix early param lifetimes in generic_const_exprs)
 - #118083 (Remove i686-apple-darwin cross-testing)
 - #118091 (Remove now deprecated target x86_64-sun-solaris.)

r? `@ghost`
`@rustbot` modify labels: rollup
@RalfJung
Member Author

@bors r+

@bors
Contributor

bors commented Nov 21, 2023

📌 Commit 309133a has been approved by RalfJung

It is now in the queue for this repository.

@bors
Contributor

bors commented Nov 21, 2023

⌛ Testing commit 309133a with merge f2f8b98...

@bors
Contributor

bors commented Nov 21, 2023

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing f2f8b98 to master...

@bors bors merged commit f2f8b98 into rust-lang:master Nov 21, 2023
8 checks passed
@RalfJung RalfJung deleted the rustup branch November 22, 2023 05:52
7 participants