Workflows failing for DataSets with multiple input files #32

captainceramic · 2015-05-19T06:50:27Z

@DamienIrving has a workflow that is failing whenever it is run with multiple input models.

captainceramic · 2015-05-19T06:51:32Z

@DamienIrving - would you be able to check the failing workflow into the cwsl-workflows repo? Maybe we could open a development branch - it would be good to see your exact workflow so I can reproduce the bug.

captainceramic · 2015-05-19T06:53:35Z

@DamienIrving - actually, don't worry. I can replicate this. Sorry about this - I really thought this was covered by a unit test. I'm on it - it is a recent regression from something I did.

DamienIrving · 2015-05-19T06:57:08Z

@captainceramic Ok. If you change your mind let me know - I'm happy to push my workflows to cwsl-workflows if need be.

This included:
- Checking for valid combinations of attributes instead of hash values
- Using the ArgumentCreator as an iterator (__iter__)
- Removing confusing logger debugging statements
- Continuing iteration instead of returning None when no matching inputs are found for an output file.
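For anyone reading along, here is a rough sketch of the idea behind those changes. It is not the actual cwsl code: the class name `ArgumentPairs`, the dict-based file records and the attribute names are all made up for illustration. It shows an object used as an iterator via `__iter__`, matching inputs on attribute combinations, and skipping combinations with no matching inputs rather than returning None.

```python
from itertools import product


class ArgumentPairs:
    """Hypothetical stand-in for an argument creator (not the real cwsl class)."""

    def __init__(self, input_files, output_constraints):
        # input_files: list of dicts mapping attribute name -> value, one per file
        self.input_files = input_files
        # output_constraints: dict mapping attribute name -> list of allowed values
        self.output_constraints = output_constraints

    def __iter__(self):
        names = sorted(self.output_constraints)
        value_lists = [self.output_constraints[name] for name in names]
        for values in product(*value_lists):
            combo = dict(zip(names, values))
            # Match on the attribute values themselves, not on a hash of them
            matching = [f for f in self.input_files
                        if all(f.get(k) == v for k, v in combo.items())]
            if not matching:
                # No inputs for this combination: keep iterating
                # instead of returning None and ending the loop early
                continue
            yield combo, matching


# Two models from two institutes; the cross product has four combinations,
# but only two of them correspond to real files.
files = [
    {"model": "ACCESS1-0", "institute": "CSIRO-BOM", "path": "a.nc"},
    {"model": "MIROC5", "institute": "MIROC", "path": "b.nc"},
]
constraints = {
    "model": ["ACCESS1-0", "MIROC5"],
    "institute": ["CSIRO-BOM", "MIROC"],
}
results = list(ArgumentPairs(files, constraints))
```

With this toy data, `results` only contains the two model/institute pairs that actually exist, so mixed-up pairs such as MIROC5 with CSIRO-BOM are never passed on.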

captainceramic · 2015-05-19T08:30:19Z

@DamienIrving - I've pushed some code that should fix the simplest version of this bug (the one where models and institutes are getting mixed up). Can you have a go and see if it fixes it in your workflow?

The second, more complex issue is around "mapping" constraints and is issue #14 - Would you be able to post your desired workflow (the one with the arithmetic comparisons of two datasets) under that issue? I think it could be a good test case for it.

If you can get the multiple inputs going I'll close this issue.

DamienIrving · 2015-05-29T07:20:29Z

As noted in #35, I think I just found a bug with the handling of multiple input files in ensemble operations (i.e. where you want to pass all the files at once for constraints that you are overwriting). The bug happens when I run a workflow where in_dataset has more than one experiment (e.g. rcp45 and rcp85). VisTrails basically just freezes at the Ensemble Aggregation step and eventually stops the incomplete workflow with no error message or anything (i.e. the VisTrails application freezes and I have to kill it). It works fine if there is only one experiment, but fails as soon as there is more than one.

captainceramic · 2015-06-01T03:29:02Z

I can reproduce this - looking at it now.

captainceramic · 2015-06-01T07:13:52Z

I think this is a combination of two things:

1. To implement some of the more complex combinations of keyword/positional/added constraint arguments I got a bit loose in trimming out unused Constraint values. This meant that for big ensembles we got enough possible combinations to slow things down. I have pushed some code to tighten this up.
2. Doing one of these big ensembles means moving around a lot of data, and I think that sometimes it just takes a really long time to stage in what is basically the entire tas or tos for rcp85 and rcp45 from the CMIP5 archive, then copy the whole thing, subset it, regrid it etc. Implementing some way to avoid overwriting existing files could be a big advantage here.
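To give a feel for the first point, here is a small sketch of how untrimmed constraint values multiply the number of combinations. This is only an illustration under assumed names: `count_combinations`, `trim_unused` and the example values are hypothetical and are not taken from the cwsl code.

```python
from itertools import product


def count_combinations(constraints):
    """Number of value combinations across all constraints."""
    total = 1
    for values in constraints.values():
        total *= len(values)
    return total


def trim_unused(constraints, used_names):
    """Keep only the constraints that the operation actually uses."""
    return {name: values for name, values in constraints.items()
            if name in used_names}


# Hypothetical constraint values, for illustration only
constraints = {
    "model": ["ACCESS1-0", "MIROC5", "CanESM2"],
    "experiment": ["rcp45", "rcp85"],
    "variable": ["tas", "tos"],
    "realm": ["atmos", "ocean"],
}

before = count_combinations(constraints)
trimmed = trim_unused(constraints, {"model", "experiment"})
after = count_combinations(trimmed)
combos = list(product(*trimmed.values()))
```

Here the untrimmed set gives 24 combinations and the trimmed one gives 6. With a full CMIP5 ensemble the lists are much longer, so every extra constraint left in multiplies the work.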

captainceramic added the bug label May 19, 2015

captainceramic self-assigned this May 19, 2015

captainceramic removed their assignment Jun 19, 2015
