Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo clean -p package can clear the document of a single crates #9841

Closed
wants to merge 12 commits into from

Conversation

heisen-li
Copy link
Contributor

@heisen-li heisen-li commented Aug 26, 2021

The current command cannot delete the document of a single crates. It is simple to delete the document in the directory. However, the previously generated index still needs to be modified or rebuilt, otherwise a large number of broken links will be left.

Regarding this part, I think of two methods:

  1. Directly modify the generated index content;
  2. Rebuild the doc to cover the previous content.

Am I complicating the problem? Do you have any suggestions?

close #8790

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Eh2406 (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 26, 2021
@ehuss
Copy link
Contributor

ehuss commented Aug 26, 2021

I don't think this needs to bother with updating the index or anything like that. I think just deleting the directory is fine.

@heisen-li
Copy link
Contributor Author

I don't think this needs to bother with updating the index or anything like that. I think just deleting the directory is fine.

I'm very sorry, but I don't seem to fully understand you.
Just add the deletion of the target/doc directory? Do I understand correctly?
Is this the same as the cargo clean doc command?

@ehuss
Copy link
Contributor

ehuss commented Aug 27, 2021

Oh, I meant just the package's doc directory (doc/foo or whatever).

@heisen-li heisen-li changed the title [WIP] cargo clean -p package can clear the document of a single crates cargo clean -p package can clear the document of a single crates Aug 28, 2021
…into clean_p_1

* 'clean_p_1' of https://github.com/QiangHeisenberg/cargo: (32 commits)
  Show desc of well known subcommands (fmt, clippy) in cargo --list
  Fix test not to rely on `cargo` in PATH.
  Add hint for cargo metadata in environment section
  Fix panic with build-std of a proc-macro.
  Improve resolver message for validate_links
  Update tests to display dep-req info for dep-chain
  Display dep requirement info for cyclic dependencies
  Add dep requirement info for dep-chain display
  Fix typo in git-authentication.md
  Add some debug logging for `cargo fix`
  Add documentation about third-party registries.
  Temporarily disable extern-html-root-url test.
  Fix more typos, “an”→“a” and “an”→“and”
  Fix another typo “a”→“an”
  Fix typos “a”→“an”
  Better fmt for file body
  unset the FIX_ENV when executing the real rustc
  Fix test incorrectly validating CARGO_PKG_LICENSE_FILE.
  Move `tmp` test directory.
  Bump curl.
  ...
Copy link
Contributor

@ehuss ehuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think also that this should probably handle the interaction between -p and --doc, such that it would only clean the doc directory for the given package. Currently it seems to ignore the -p flag.

src/cargo/core/compiler/layout.rs Outdated Show resolved Hide resolved
tests/testsuite/clean.rs Show resolved Hide resolved
src/cargo/ops/cargo_clean.rs Outdated Show resolved Hide resolved
@ehuss ehuss assigned ehuss and unassigned Eh2406 Sep 1, 2021
…n_p_1

* 'master' of https://github.com/rust-lang/cargo:
  Temporarily move test lockfile_constant_during_new to nightly.
  Make --force-warn support auto-detect.
  Stabilize 2021 edition
  Swap out some outdated repo urls in documentation
  Make windows CI happy
  print the full destination path when no track duplicates
  Document stabilization version
  Change `cargo fix --edition` to only fix edition lints.
  cancel examples
  fix doc
  doc man.page
  Stabilize patch-in-config
  Prefer config [patch] over manifest [patch]
  fix bug
  add fmt
  add --examples subcommand
  fmt
  Update doc.rs
  Add cargo doc --examples subcommand
  Make library created with `cargo new` clippy happy
Copy link
Contributor Author

@heisen-li heisen-li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have finished modifying and testing the deleted files.

tests/testsuite/clean.rs Show resolved Hide resolved
rm_rf_glob(&layout.fingerprint().join(&pkg_dir), config)?;
// Clean target/doc/pkg.
rm_rf(&layout.doc().join(pkg.name()), config)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't using the correct name here. The directory names are the crate names of the targets. I'm uncertain about the best way to handle this, but specifically:

  • Non-workspace members can only have library targets documented, so that would be something like pkg.library().map(|t| t.crate_name()). I think it could be dangerous to blindly remove directories for other targets like "examples" because there could be name collisions (for example, if something has an example called "serde", it would delete the serde directory, which is probably not what you want).
  • For workspace members, it is possible to document binaries (and now examples due to Adding the cargo doc --examples subcommand #9808). So to clean a package, it would need to delete all documentable targets. Unfortunately this runs afoul with the "serde" example, too, so maybe just deleting libraries is sufficient? I'm still not entirely clear on the motivation for cleaning these, so I'm uncertain if it is important or not.

Copy link
Contributor Author

@heisen-li heisen-li Sep 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could be dangerous to blindly remove directories for other targets like "examples" because there could be name collisions (for example, if something has an example called "serde", it would delete the serde directory, which is probably not what you want).

I'm sorry, I don't quite understand your question. Could you explain it again?

For workspace members, it is possible to document binaries (and now examples due to Adding the cargo doc --examples subcommand #9808). So to clean a package, it would need to delete all documentable targets. Unfortunately this runs afoul with the "serde" example, too, so maybe just deleting libraries is sufficient? I'm still not entirely clear on the motivation for cleaning these, so I'm uncertain if it is important or not.

I'm sorry. I really didn't think of the question in advance. If the example name in the ../examples/ folder is the same as another crate name, this should be taken into account.

Also, I'm not sure I understood this correctly: Is the current question the same as you stated above (about the example name conflicts with the crate name)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, both points are related to one another.

The doc directories are based on a target's crate name. So, for example, if I have a package named "foo" with a "foo" library and a "foo-cli" binary, there would be two directories:

  • doc/foo
  • doc/foo_cli

To clean this properly, it would need to iterate over all documentable targets and get their crate names. Don't forget that the package name, the target name, and the crate name can all be different. That is, package "abc-rs" can have a library named "my-abc-lib" which has a crate name of "my_abc_lib".

However, I'd be concerned about indiscriminately iterating over a package's targets because of name collisions. Imagine my "foo" library had a dependency on "serde", and a dependency on another package named "bar". I run cargo doc and also get the documentation for doc/foo, doc/serde, and doc/bar. Now, imagine bar had an binary coincidentally named serde, if a user runs cargo clean -p bar, we don't want to blow away the serde directory just because bar had a serde binary.

So, I'm thinking, to avoid deleting the wrong things, one step would be to only look for library targets of non-workspace dependencies. That should limit the collateral damage.

Then, I'm wondering about limiting the collateral damage of workspace members, too. Perhaps including examples is overkill, and perhaps it should only delete library crate names? Or maybe libraries + binaries? Or maybe it is fine to collect all targets with documented() is true. That's where I'm uncertain, since I'm not really sure I understand the need for this functionality.

I think I would lean towards just deleting the entries from pkg.targets().iter().filter(|t| t.documented()) for workspace members, and just the lib for non-members.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need some time to think, please give me some time. i am sorry, huss.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, feel free to take your time, there's no rush. And let me know if you have any questions or if anything wasn't clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I just ended my wedding leave. It was too late to reply.

However, I'd be concerned about indiscriminately iterating over a package's targets because of name collisions. Imagine my "foo" library had a dependency on "serde", and a dependency on another package named "bar". I run cargo doc and also get the documentation for doc/foo, doc/serde, and doc/bar. Now, imagine bar had an binary coincidentally named serde, if a user runs cargo clean -p bar, we don't want to blow away the serde directory just because bar had a serde binary.

This scenario was tested based on my understanding.

PS D:\Rust\temp\foo> cat .\Cargo.toml
...
[dependencies]
serde = "1.0.130"
bar = { path = "./bar"}
PS D:\Rust\temp\foo\bar> cat .\Cargo.toml
...
[[bin]]
name = "serde"
path = "./serde/src/main.rs"

The directory after cargo doc looks like this:

doc/bar
doc/foo
doc/serde

The binary files of serde and bar seem to have nothing to do with each other. In other words, the binary files under the bar directory will not form a document.

Therefore, running cargo clean -p bar only needs to delete doc/bar, and does not need to consider the conflict between the serde binary file under bar and the dependent serde of foo.
Is my understanding correct?

That's where I'm uncertain, since I'm not really sure I understand the need for this functionality.

My understanding of this function is:
If you execute cargo clean -p foo, you need to delete all doc/foo,doc/example_name,doc/dependence_name.
And these documents are in pkg.targets().iter().filter(|t| t.documented()).

I think I would lean towards just deleting the entries from pkg.targets().iter().filter(|t| t.documented()) for workspace members, and just the lib for non-members.

What does workspace member mean here?
Is this in Cargo.toml?

PS D:\Rust\temp\foo> cat .\Cargo.toml
...
[workspace]
members = ["member_1", "member_2", ...]

Frankly, I don't seem to fully understand workspace members and lib for non-members.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does workspace member mean here?

Essentially, yea, it is the members list. It's a little more complicated, since additional members can be inferred.

My understanding of this function is:
If you execute cargo clean -p foo, you need to delete all doc/foo,doc/example_name,doc/dependence_name.
And these documents are in pkg.targets().iter().filter(|t| t.documented()).

I think my question was: why does someone want to delete directories from the doc directory? What is the drawback of leaving them there? What is the use case for needing to delete them? Is it OK for it to only partially delete a package's data?

The binary files of serde and bar seem to have nothing to do with each other. In other words, the binary files under the bar directory will not form a document.

Therefore, running cargo clean -p bar only needs to delete doc/bar, and does not need to consider the conflict between the serde binary file under bar and the dependent serde of foo. Is my understanding correct?

I think what I'm trying to get at is that when you document a package, it documents the libraries and its binaries. If you have a package mypackage with binaries bar and baz, there will be three directories in the doc directory. To properly clean mypackage, it will need to iterate over all documentable targets and delete those directories. My concern is that could be too aggressive in some cases, as it could delete too much.

When cargo doc is run, it will document all documentable targets for workspace members, and all libraries for non-workspace members.

So my thought is that it would look for pkg.targets().iter().filter(|t| t.documented()) for workspace members, and pkg.targets().iter().find(|t| t.is_lib()) for non-members maybe?

@ehuss ehuss added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 21, 2021
@bors
Copy link
Contributor

bors commented Nov 17, 2021

☔ The latest upstream changes (presumably #10072) made this pull request unmergeable. Please resolve the merge conflicts.

@heisen-li heisen-li requested a review from ehuss December 1, 2021 09:32
src/cargo/ops/cargo_clean.rs Outdated Show resolved Hide resolved
@heisen-li heisen-li requested a review from ehuss December 25, 2021 06:13
@heisen-li
Copy link
Contributor Author

@ehuss Excuse me. I'm so sorry. Can you look at it again? I know this may be rude, but I wonder if I understand you correctly. To be honest, I'm a little fuzzy on this need now.

@heisen-li heisen-li closed this Jan 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cargo clean -p does not clean documentation
5 participants