[WIP] Feature: Send stdout and stderr to Galaxy while job is running #345
Hey,
This is related to this PR in the main Galaxy repo: galaxyproject/galaxy#16975.
The changes let Pulsar send both stdout and stderr to Galaxy while a job is running, so that the output can be displayed inside Galaxy before the job finishes.
The main changes include adding two parameters to the `app.yml` config: `send_stdout_update`, which is a boolean, and `stdout_update_interval`, which is a float. The first controls whether Pulsar will send stdout/stderr or not; the second is the interval (in seconds) between updates.
The files are sent through the files endpoint in Galaxy. To avoid sending the entire file each time, I keep track, in a dict, of the position in the stdout/stderr file that the last update read up to. I then send only the new part of the file.
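To illustrate the position-tracking idea, here is a rough sketch and not the actual code in this PR. The class name `OutputTracker` and the method `read_new` are hypothetical, and the step that posts the chunk to the Galaxy files endpoint is left out:

```python
import os
import tempfile


class OutputTracker:
    """Remembers how far into each output file has already been read.

    Hypothetical helper for illustration; names are not taken from Pulsar.
    """

    def __init__(self):
        # Maps file path -> byte offset just past the last chunk read.
        self._positions = {}

    def read_new(self, path):
        """Return only the bytes appended to `path` since the previous call."""
        start = self._positions.get(path, 0)
        try:
            with open(path, "rb") as fh:
                fh.seek(start)
                chunk = fh.read()
        except FileNotFoundError:
            # The job may not have written this stream yet.
            return b""
        self._positions[path] = start + len(chunk)
        return chunk


if __name__ == "__main__":
    # Small demonstration with a temporary file standing in for job stdout.
    workdir = tempfile.mkdtemp()
    stdout_path = os.path.join(workdir, "stdout")
    tracker = OutputTracker()
    with open(stdout_path, "ab") as fh:
        fh.write(b"first line\n")
    print(tracker.read_new(stdout_path))
    with open(stdout_path, "ab") as fh:
        fh.write(b"second line\n")
    print(tracker.read_new(stdout_path))
```

Each call returns only what was written since the previous call, so a periodic update loop (every `stdout_update_interval` seconds) could send just that chunk and skip the upload when it is empty.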
After the job is finished, I send any remaining stdout/stderr that has not been sent yet. In the final status message sent over the broker, instead of including stdout there, I set the stdout and stderr fields to None, so that it doesn't send the whole file again. In Galaxy, there are a couple of changes in the Pulsar job runner that check whether those fields are None and, if so, load the stdout from the job directory instead.
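The Galaxy-side check could look roughly like the sketch below. This is an assumption about the shape of the logic, not the code from the Galaxy PR; the function name `resolve_stream` and the file names are hypothetical:

```python
import os
import tempfile


def resolve_stream(message_value, job_directory, filename):
    """Pick the output text for one stream (stdout or stderr).

    If the final status message carried the text, use it. If the field is
    None, the output was already streamed during the run, so read it from
    the job directory instead. All names here are illustrative only.
    """
    if message_value is not None:
        return message_value
    path = os.path.join(job_directory, filename)
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as fh:
            return fh.read()
    except FileNotFoundError:
        # Nothing was ever written for this stream.
        return ""


if __name__ == "__main__":
    job_dir = tempfile.mkdtemp()
    with open(os.path.join(job_dir, "stdout"), "w", encoding="utf-8") as fh:
        fh.write("streamed output\n")
    # Field set to None in the final message: fall back to the file on disk.
    print(resolve_stream(None, job_dir, "stdout"))
    # Field present in the final message: use it unchanged.
    print(resolve_stream("inline output\n", job_dir, "stdout"))
```

Checking for None specifically, and not for any falsy value, keeps an empty string in the message meaning "the job produced no output", which matches the old behaviour.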
As with the other PR, this was done mostly with the intent of not messing around with existing functionality, which is why we didn't want to use messages to send the output. Also, as with the other PR, any feedback is welcome.