Submitting batch job fails randomly with broken paths #260
Comments
Hi, could you try out the slurm-20.11.8 branch? (https://github.com/PySlurm/pyslurm/tree/slurm-20.11.8) So you submitted the jobs from the directory /correct/work/dir, right? Does |
The strings are also broken in
I switched to branch
Looking at the byte data you can see the reoccurring byte pattern I suspect this issue will be localised to the |
Hi, yeah, the culprit is definitely The encoding step itself should be fine; however, it likely has to do with the lifetime of the char* pointer for
This in itself is fine; however, this code is in a different function than the one actually submitting the job. By the time the function (fill_job_desc_from_opts) which contains this code is done,
You won't see this behaviour, though, when you explicitly specify the work_dir - the Python object will live long enough since it is in the
Anyway, in this case a quick fix in the code would be to modify the incoming
The long-term fix would be to restructure the job API in a way that things like these can't happen anymore (working on it).
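As a user-side illustration of the "explicitly specify the work_dir" remark, here is a minimal sketch that fills in work_dir before the options dict is handed to pyslurm. The helper name with_explicit_work_dir and its cwd parameter are hypothetical and not part of pyslurm; the sketch only assumes that the options are a plain dict, as in the submit_batch_job call quoted elsewhere in this thread. It is a workaround idea, not the fix in the binding itself, and it may not help on every setup.

```python
import os


def with_explicit_work_dir(job_opts, cwd=None):
    """Return a copy of job_opts with 'work_dir' always set.

    Hypothetical helper (not part of pyslurm): if the caller did not
    supply a work_dir, use `cwd` or the current working directory, so
    the path string lives in the options dict for the whole submit call.
    """
    opts = dict(job_opts)  # shallow copy, the caller's dict is untouched
    if not opts.get("work_dir"):
        opts["work_dir"] = cwd if cwd is not None else os.getcwd()
    return opts


if __name__ == "__main__":
    opts = with_explicit_work_dir({"wrap": "sleep 5"})
    print(opts["work_dir"])
    # With pyslurm installed, the dict would then be passed on, e.g.:
    #   import pyslurm
    #   jid = pyslurm.job().submit_batch_job(opts)
```

The copy keeps the caller's dict unchanged, and an explicitly given work_dir is never overwritten.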
The problem still persists if
Oh,
Mh, weird, I can replicate the erroneous symbols if I don't supply a
import pyslurm; psj = pyslurm.job() ; jid = psj.submit_batch_job({'wrap': 'sleep 5', 'work_dir': '/my/work/dir', 'get_user_env_time': -1}) ; job = psj.find_id(jid)[0] ; print(jid, job['job_state'], job['work_dir'])
Mh, wondering why it's not working for you with that. (I'm on 22.05, though it is still the same code in pyslurm.)
Details
Issue
Submitting jobs (both via script or wrap) fails randomly. An immediate indicator is that the work_dir (and other paths like std_out and std_err) are broken strings in those cases:
For a failing job:
Any idea what is going wrong?
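For anyone trying to catch affected jobs automatically, a rough sketch of a check on the job record could look like the following. The function names looks_broken and broken_path_fields are hypothetical, and the heuristic rests on assumptions not confirmed here: that these fields are absolute paths when set, and that corruption shows up as non-printable or replacement characters.

```python
def looks_broken(path):
    """Heuristic: True if `path` does not look like a sane absolute path."""
    if not isinstance(path, str) or not path:
        return True
    return (
        not path.startswith("/")   # assumption: fields hold absolute paths
        or not path.isprintable()  # control characters from stray bytes
        or "\ufffd" in path        # Unicode replacement character
    )


def broken_path_fields(job, fields=("work_dir", "std_out", "std_err")):
    """Return the names of path fields in a job dict that look corrupted.

    Fields that are missing or None are skipped rather than flagged.
    """
    return [f for f in fields if job.get(f) is not None and looks_broken(job[f])]


if __name__ == "__main__":
    good = {"work_dir": "/correct/work/dir", "std_out": "/correct/work/dir/out.log"}
    bad = {"work_dir": "\x10\x7fgarbage", "std_err": "/correct/work/dir/err.log"}
    print(broken_path_fields(good))
    print(broken_path_fields(bad))
```

The dict passed in would be a job record such as the one returned by find_id in pyslurm; the sample dicts above are made-up values for illustration only.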