Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to keep redundant files #263

Closed

Conversation

xearl4
Copy link

@xearl4 xearl4 commented Jan 22, 2024

A long-standing pain point for go-spacemesh users is that it simply deletes postdata files that are deemed "redundant". For a go-spacemesh managed through smapp, the behaviour may be reasonable and comprehensible. For CLI users, however, this often means that a simple config mistake of setting the wrong num-units in the config file leads to weeks of postdata file initialisation being quickly wiped out.

With this change, the current default behaviour is preserved: redundant files are deleted. If the new option is enabled, an info entry is logged for redundant files, but they are not deleted.

This change prepares the stage for a dependent change in go-spacemesh.

Default behaviour is preserved: redundant files are deleted. If the new
option is enabled, an info entry is logged for redundant files, but they
are not deleted.

This is in preparation for a forthcoming go-spacemesh change using this
new option.
@schinzelh
Copy link
Contributor

Fixes spacemeshos/go-spacemesh#4594

@fasmat fasmat self-assigned this Jan 22, 2024
@fasmat fasmat self-requested a review January 22, 2024 23:02
Copy link
Member

@fasmat fasmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contributions! 🙂

postcli in post/cmd/postcli might also benefit from the new functional option of the Initializer, if you find some time to take a look at it.

You might want to consider adding the KeepRedundantFiles flag to InitOpts in post/config/config.go instead of using a dedicated functional option for it. This will probably make integration in go-spacemesh easier.

Setting the flag via the cmd line and/or config is a bit involved - I suggest you add the bool to PostSetupOpts in go-spacemesh/activation/post.go which is already passed to the service that uses the Initializer to generate the post data - PostSetupManager in go-spacemesh/activation/post.go. PostSetupOpts also has a method to convert it to config.InitOpts used by the Initializer which you can extend with the new flag.

If you want to expose the flag not only via the node-config.json but also via cmdline parameters you can take a look at go-spacemesh/cmd/root.go to see how other cmd line parameters are handled. 🙂

Comment on lines -298 to 312
if err := removeRedundantFiles(init.cfg, init.opts, init.logger); err != nil {
if err := checkRedundantFiles(init.cfg, init.opts, init.keepRedundantFiles, init.logger); err != nil {
return err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of passing the bool to the function I'd prefer to instead not call removeRedundantFiles when it is true:

if !init.keepRedundantFiles {
	if err := removeRedundantFiles(init.cfg, init.opts, init.logger); err != nil {
		return err
	}
}

no need to iterate over all files telling the user what is all ignored when the user actively chose to ignore files.

Copy link
Author

@xearl4 xearl4 Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly fine as well.

But in the usage scenarios I envision, many users will set "keep" as a default everywhere, and then still find it helpful to see log messages where additional files are discovered, to assist them in keeping track of things. The log message helps in cases where one forgot to adjust the num-units after an increase; it also helps in cleaning up after a decrease: it reminds the user to think about a situation that was previously handled in a one-sided manner by the program by deleting files.

In fact, I originally had exactly the variant you proposed, with no log message at all :) I only added the log message because I thought that might make it easier to upstream. Having used it while resizing (mostly increasing) lots of postdata over the past few weeks, I have come to slightly prefer the log message.

All that said: either way is fine with me.

@xearl4
Copy link
Author

xearl4 commented Jan 23, 2024

postcli in post/cmd/postcli might also benefit from the new functional option of the Initializer, if you find some time to take a look at it.

Sure. Main question is: what do we want the default to be? Should postcli default to deleting, and keep files if you pass a CLI flag -- or should it keep redundant files by default, and only delete when passed, say, --cleanup?

@fasmat
Copy link
Member

fasmat commented Jan 23, 2024

postcli requires explicitly stating numUnits; a user specifying fewer units than they have on disk thereby probably intends to initialize the value they provided.

On the other hand having a -keepFiles flag makes less sense then having a -deleteFiles flag. So I guess the default should probably be to not delete 🤔

@pigmej
Copy link
Member

pigmej commented Feb 11, 2024

I agree with @fasmat here.

by default, it should not delete UNLESS specified.

postcli case is a bit specific because we support from file etc., so trying to guess what the user wanted is, IMO tricky; we should basically do what the user said us to do, so whatever is specified -> ok.

@fasmat
Copy link
Member

fasmat commented Feb 27, 2024

Hi @xearl4 @schinzelh,

we just merged a change that adds additional checks to postcli to prevent users from accidentally corrupting/deleting their files: #270 and released the changed version as v0.12.0 (pre-release for now).

With the upcoming v1.4.x version of go-spacemesh the post data directory can be set to read only after initialization to prevent accidental deletion / corruption of existing data and with the upcoming multi-smeshing release we plan to remove initialization from the node completely and only use postcli for creating PoST data.

I will therefore close this PR. If you feel additional changes are needed feel free to re-open and update your PR and re-start the discussion 🙂

@fasmat fasmat closed this Feb 27, 2024
@xearl4
Copy link
Author

xearl4 commented Feb 27, 2024

Sounds good, thanks!

@xearl4 xearl4 deleted the add-keep-redundant-files-option branch February 27, 2024 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants