Lustre stripe error noise when resuming from checkpoint #10

Open
davebiffuk opened this issue Jan 5, 2015 · 2 comments

Comments

@davebiffuk

If a pcp job with -l (preserve Lustre striping) is resumed from a checkpoint, there are a lot of error messages of the form:

error on ioctl 0x4008669a for '/lustre/file/name' (10): stripe already set

which obscure other problems. It would be nice if these messages didn't appear.

(I understand that these are printed directly by the Lustre library)

@guycoates
Collaborator

A provisional fix is in 5df8852, which silences the messages. I'm not sure of the best way to pass Lustre errors up to the calling code so that the caller can do something useful...
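
One possible way to hand Lustre errors to the caller, sketched under assumptions: if the library's stderr output has already been captured as text, it can be parsed into structured records and the expected "stripe already set" case filtered out. The names `LustreMessage`, `parse_lustre_stderr` and `unexpected` are hypothetical, and the pattern assumes every message follows the format of the example line in the report above. This is not the code in 5df8852.

```python
import re
from typing import List, NamedTuple


class LustreMessage(NamedTuple):
    """One parsed error line (hypothetical structure, for illustration)."""
    request: int   # ioctl request number, e.g. 0x4008669a
    path: str      # file the ioctl was issued against
    code: int      # number printed in parentheses
    reason: str    # trailing text, e.g. "stripe already set"


# Assumed format, taken from the example message in the report:
#   error on ioctl 0x4008669a for '/lustre/file/name' (10): stripe already set
_PATTERN = re.compile(
    r"error on ioctl (0x[0-9a-fA-F]+) for '(.*)' \((\d+)\): (.*)"
)


def parse_lustre_stderr(text: str) -> List[LustreMessage]:
    """Turn captured stderr text into a list of structured messages."""
    messages = []
    for line in text.splitlines():
        match = _PATTERN.search(line)
        if match:
            messages.append(
                LustreMessage(
                    request=int(match.group(1), 16),
                    path=match.group(2),
                    code=int(match.group(3)),
                    reason=match.group(4).strip(),
                )
            )
    return messages


def unexpected(messages, ignore=("stripe already set",)):
    """Drop messages whose reason is expected, e.g. on checkpoint resume."""
    return [m for m in messages if m.reason not in ignore]
```

The caller could then log or raise on whatever `unexpected(...)` returns, while the benign resume-time messages stay quiet.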

@davebiffuk
Author

Thanks, that looks safe on kernel 2.6.11 and later, where the pipe buffer is 64K and the maximum path length is 4K. I guess there's an edge case on earlier kernels, where the pipe buffer was a single page (4K on most systems): with path names approaching 4K long, the stderr content would be bigger than the pipe. I don't know how the library would react to that...
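
The pipe-size concern can be illustrated with a minimal sketch. It assumes, based only on the comment above, that the fix works by pointing file descriptor 2 at a pipe while the Lustre call runs and draining the pipe afterwards. The helper names `capture_stderr_fd` and `pipe_capacity` are hypothetical, and this is not the code in 5df8852. Setting the write end to non-blocking is one possible way to avoid a hang when the output exceeds the pipe capacity: the writer gets `EAGAIN` and the excess is lost. How the Lustre library handles that failed write is not known here.

```python
import fcntl
import os
from contextlib import contextmanager


def pipe_capacity() -> int:
    """Return the capacity in bytes of a fresh pipe (Linux only).

    F_GETPIPE_SZ needs Linux 2.6.35+; 1032 is its numeric value on Linux,
    used as a fallback when the fcntl module does not export the name.
    """
    read_fd, write_fd = os.pipe()
    try:
        return fcntl.fcntl(write_fd, getattr(fcntl, "F_GETPIPE_SZ", 1032))
    finally:
        os.close(read_fd)
        os.close(write_fd)


@contextmanager
def capture_stderr_fd(sink: bytearray):
    """Redirect fd 2 into a pipe for the duration of the block.

    Anything written to fd 2 (including by C libraries) is appended to
    `sink` when the block exits. Nothing reads the pipe while the block
    runs, so at most one pipe capacity of output can be captured.
    """
    read_fd, write_fd = os.pipe()
    saved_fd = os.dup(2)
    try:
        # Non-blocking write end: a full pipe yields EAGAIN, not a hang.
        flags = fcntl.fcntl(write_fd, fcntl.F_GETFL)
        fcntl.fcntl(write_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        os.dup2(write_fd, 2)
        os.close(write_fd)  # fd 2 is now the only write end
        try:
            yield
        finally:
            # Restoring fd 2 closes the last write end, so reads hit EOF.
            os.dup2(saved_fd, 2)
            while True:
                chunk = os.read(read_fd, 65536)
                if not chunk:
                    break
                sink.extend(chunk)
    finally:
        os.close(saved_fd)
        os.close(read_fd)
```

Used as `buf = bytearray()` followed by `with capture_stderr_fd(buf): ...`, the captured bytes are in `buf` once the block exits, and `pipe_capacity()` can be compared against the worst-case message length for a 4K path.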
