Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a way to recursively GC paths #8417

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

balsoft
Copy link
Member

@balsoft balsoft commented May 30, 2023

Motivation

Refactor GCOptions to make the types reflect possible GC operations, add a way to delete a specific path recursively, and add a --recursive flag to nix store delete.

Context

Fixes #7239

Checklist for maintainers

Maintainers: tick if completed or explain if not relevant

  • agreed on idea
  • agreed on implementation strategy
  • tests, as appropriate
    • functional tests - tests/**.sh
    • unit tests - src/*/tests
    • integration tests - tests/nixos/*
  • documentation in the manual
  • documentation in the internal API docs
  • code and comments are self-explanatory
  • commit message explains why the change was made
  • new feature or incompatible change: updated release notes

Priorities

Add 👍 to pull requests you find important.

@github-actions github-actions bot added new-cli Relating to the "nix" command store Issues and pull requests concerning the Nix store labels May 30, 2023
@balsoft balsoft force-pushed the gc-closure branch 3 times, most recently from 6782bb1 to 11d2199 Compare May 30, 2023 08:22
@edolstra
Copy link
Member

Can't this be done using nix-store --delete $(nix-store -qR <path>)?

@balsoft
Copy link
Member Author

balsoft commented May 30, 2023

Can't this be done using nix-store --delete $(nix-store -qR )?

AFAIU in that case the entire operation will fail if at least one path in the closure is still alive. This PR adds a way to remote all dead paths in a closure. I should clarify that in the documentation.

Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't go through the whole of it yet, but from a first pass: I like this very much. It doesn't seem to exactly do what it says though (se the inline comment in gc.cc). Having a solid set of testcases for it would help clarifying that

Comment on lines 711 to 745
int action;
bool ignoreLiveness;
StorePathSet pathsToDelete;
bool recursive {false};
action = readInt(from);
pathsToDelete = WorkerProto<StorePathSet>::read(*store, from);
from >> ignoreLiveness >> options.maxFreed;
// obsolete fields
readInt(from);
readInt(from);
readInt(from);
if (GET_PROTOCOL_MINOR(clientVersion) >= 31) {
from >> recursive;
};

switch (action) {
case 0:
options.action = GCReturn::Live; break;
case 1:
options.action = GCReturn::Dead; break;
case 2:
options.action = GCDelete { .ignoreLiveness = ignoreLiveness }; break;
case 3:
options.action = GCDelete { .pathsToDelete = GCPathsToDelete { .paths = pathsToDelete, .recursive = recursive }, .ignoreLiveness = ignoreLiveness, }; break;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be factored out in a WorkerProto<GCOptions>::read method or similar?

Comment on lines 492 to 506
std::visit(overloaded{[&](GCDelete del){
if (del.pathsToDelete.has_value() && del.ignoreLiveness) {
gcKeepOutputs = false;
gcKeepDerivations = false;
}
}, [](GCReturn arg){}}, options.action);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::visit(overloaded{[&](GCDelete del){
if (del.pathsToDelete.has_value() && del.ignoreLiveness) {
gcKeepOutputs = false;
gcKeepDerivations = false;
}
}, [](GCReturn arg){}}, options.action);
if (auto del = std::get_if<GCDelete>(&options.action)
&& del->pathsToDelete.has_value() && del->ignoreLiveness) {
gcKeepOutputs = false;
gcKeepDerivations = false;
}

(just a suggestion, but I find it easier to read since the second branch is empty. Also same in a couple places below)

Copy link
Member

@Ericson2314 Ericson2314 Jun 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like std::visit for completeness checking. Especially with code that is overall quite tricky / high stakes.

Copy link
Member

@Ericson2314 Ericson2314 Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, I would put each function on its own "lines" so the "match arms" are easy to read as such.

Comment on lines 16 to 30
struct GCPathsToDelete {
StorePathSet paths;
bool recursive;
};

GCAction action{gcDeleteDead};
/* Delete either a given set of paths, or all dead paths */
struct GCDelete {
/* Delete this set, or all dead paths if it is std::nullopt */
std::optional<GCPathsToDelete> pathsToDelete;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) I'm not sure that this encoding is really optimal. Shoudln't we rather have pathsToDelete be a 3-way variant (<DeleteAll, DeletePath(StorePath), DeleteClosure(StorePath)>)? That's isomorphic but it feels more natural to me

Comment on lines 881 to 929
if (ret == GCReturn::Live) {
for (auto & i : alive)
results.paths.insert(printStorePath(i));
return;
}

printInfo("note: currently hard linking saves %.2f MiB",
((unsharedSize - actualSize - overhead) / (1024.0 * 1024.0)));
}
if (ret == GCReturn::Dead) {
for (auto & i : dead)
results.paths.insert(printStorePath(i));
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) maybe use a switch instead?

Comment on lines 791 to 803
std::visit(overloaded{
[&](GCDelete del){
if (del.pathsToDelete.has_value()) {
for (auto & i : del.pathsToDelete->paths) {
deleteReferrersClosure(i);
if (!del.pathsToDelete->recursive && !dead.count(i))
throw Error(
"Cannot delete path '%1%' since it is still alive. "
"To find out why, use: "
"nix-store --query --roots",
printStorePath(i));
}
} if (options.maxFreed > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this does what it should:

  • Even if recursive is set, it won't recurse into the (references) closure of the path. All it will do is not fail if the path can't be deleted
  • It seems (both from the code and from testing) that it will always trigger a full GC, even when a partial one (closure or single path) is requested

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems (both from the code and from testing) that it will always trigger a full GC, even when a partial one (closure or single path) is requested

Sorry, that was an off-by-one bug, fixed now.

Even if recursive is set, it won't recurse into the (references) closure of the path. All it will do is not fail if the path can't be deleted

Whoops! I lost the bit of code I wrote to GC recursively...

Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't go through the whole of it yet, but from a first pass: I like this very much. It doesn't seem to exactly do what it says though (se the inline comment in gc.cc). Having a solid set of testcases for it would help clarifying that

@balsoft balsoft requested a review from thufschmitt June 6, 2023 13:37
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-49/29047/1


GCAction action{gcDeleteDead};
/* Delete either a given set of paths, or all dead paths */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please continue to use /** style comments so it renders in the internal API docs.

@thufschmitt
Copy link
Member

Still didn't manage to review this (sorry about that, I've blocked an hour tomorrow afternoon for it), but one thing I can say any way is that it will require tests and documentation

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-50/29793/1

@blaggacao
Copy link
Contributor

blaggacao commented Jun 29, 2023

Just to reassure myself: this covers the use case of release version pinning while using the nix store as backend, right?

(By sort of inverting the logic)

@domenkozar
Copy link
Member

Thanks a lot for doing this!

@domenkozar
Copy link
Member

@balsoft any chance to rebase and squash?

@balsoft
Copy link
Member Author

balsoft commented Jul 27, 2023

@domenkozar thanks, done

@balsoft balsoft force-pushed the gc-closure branch 3 times, most recently from d30ea9e to a1886ea Compare July 27, 2023 09:58
@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label Jul 27, 2023
Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor comments, but my main issue right now is that it doesn't seem to work 🙃

I've pushed an extended version of the test which shows the issue, let me know if I misunderstood something about how it's supposed to work

@@ -375,7 +377,7 @@ void LocalStore::findRuntimeRoots(Roots & roots, bool censor)
auto storePathRegex = std::regex(quoteRegexChars(storeDir) + R"(/[0-9a-z]+[0-9a-zA-Z\+\-\._\?=]*)");
while (errno = 0, ent = readdir(procDir.get())) {
checkInterrupt();
if (std::regex_match(ent->d_name, digitsRegex)) {
if (std::regex_match(ent->d_name, digitsRegex) && strcmp(pids.c_str(), ent->d_name)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this?

"nix-store --query --roots",
printStorePath(i));
}
} else if (options.maxFreed > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that maxFreed will be ignored unless we gc the whole store, right? Worth opening a follow-up issue about it then because that's probably not what we want if we use --recursive

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-08-04-nix-team-meeting-minutes-77/31487/1

@nyabinary
Copy link

Whats the status for this?

@domenkozar
Copy link
Member

@balsoft this PR doesn't work yet, do you plan to work on it?

@balsoft
Copy link
Member Author

balsoft commented Nov 15, 2023

Yep. I'm planning to finish the work at some point, definitely.

@domenkozar
Copy link
Member

@balsoft still the plan?

@balsoft
Copy link
Member Author

balsoft commented Jan 11, 2024

Yep! Currently tied up on #9287 , once that's done-ish I'll get back to other PRs including this one!

@thufschmitt thufschmitt removed their assignment Feb 13, 2024
@thufschmitt thufschmitt added the idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. label Feb 13, 2024
Théophane Hufschmitt added 3 commits March 1, 2024 06:44
Make `nix store gc` accept installable arguments. If these are provided,
then the gc will only happen within the closure of these installables
and ignore any path outside, even if it's dead.

`nix store gc foo` is morally equivalent to

```bash
for validPath in $(nix path-info --recursive foo); do
  nix store delete "$validPath" || true
done
```
Fallback to a full GC if the daemon doesn't support partial ones
@thufschmitt thufschmitt force-pushed the gc-closure branch 2 times, most recently from a3b08e1 to 8db988f Compare March 1, 2024 16:45
@thufschmitt
Copy link
Member

I've rewritten it a bit to make it less invasive (kept the original changes under https://github.com/tweag/nix/commits/gc-closure_ because the cleanup of GCOptions is rather nice and it would be good to come back to it later).

I've also changed the interface to use nix store gc <installable> instead as that felt more coherent

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-55/40996/1

options.action != GCOptions::gcDeleteSpecific &&
! options.pathsToDelete.empty() &&
GET_PROTOCOL_MINOR(conn->daemonVersion) < 38) {
warn("Your daemon version is too old to support garbage collecting a closure, falling back to a full gc");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be a no-op, doing a GC when I tried to delete just one path can be surprising, and lead to unexpected data loss, especially given that this command is likely to be called in some weird circumstances.

@domenkozar
Copy link
Member

I've found a bug:

❯ nix store gc /nix/store/w53sjb6pmzdqfcflh92rlmjfzz6xbc33-devenv-shell-env
error:
       error: unexpected argument '/nix/store/w53sjb6pmzdqfcflh92rlmjfzz6xbc33-devenv-shell-env'
Try '/nix/store/a0i9g29xgmk42gjjglzd6k0kzyjkay3n-nix-2.21-devenv/bin/nix --help' for more information.

Where '/nix/store/w53sjb6pmzdqfcflh92rlmjfzz6xbc33-devenv-shell-env is a derivation for nix-shell.

@edolstra
Copy link
Member

I've also changed the interface to use nix store gc instead as that felt more coherent

I would prefer a different command for that, e.g. nix store delete-path .... Apart from being clearer (you're not doing a GC but deleting a specific path), overloading a command to either delete everything or some paths has the problem that shell code like

nix store gc "${paths[@]}"

will do a full GC if the paths array is empty, which is probably not intended.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. new-cli Relating to the "nix" command store Issues and pull requests concerning the Nix store with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Garbage collect a closure
8 participants