copy_to_local in OpenWithXarray not working on Dataflow #722
I think you also have to set the option shown in `pangeo_forge_recipes/openers.py`, lines 242 to 246 at commit `fe03650`.
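For context, `copy_to_local` conceptually means the remote file is first copied onto the worker's local scratch disk and then opened from there with xarray. The sketch below is my own illustration of that idea, not the actual code in `openers.py`:

```python
# Illustrative sketch only; NOT the pangeo-forge-recipes implementation.
# Copy a remote file to worker-local scratch space, then open it with xarray.
import os
import tempfile

import fsspec
import xarray as xr


def open_copied_to_local(url: str, **xarray_open_kwargs) -> xr.Dataset:
    # Copy the remote object to a temporary file on the worker's local disk.
    local_path = os.path.join(tempfile.mkdtemp(), os.path.basename(url))
    with fsspec.open(url, mode="rb") as remote, open(local_path, "wb") as local:
        local.write(remote.read())
    # Opening the local copy requires enough free local disk for the whole file.
    return xr.open_dataset(local_path, **xarray_open_kwargs)
```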
Just opened pangeo-forge/pangeo-forge-runner#183 to test whether I can set a larger HDD on the Dataflow workers.
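For reference, on plain Apache Beam this corresponds to the Dataflow `disk_size_gb` worker option; presumably pangeo-forge-runner would pass something equivalent through its own configuration. A minimal sketch (project, region, and bucket are placeholders):

```python
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-gcp-project",            # placeholder project id
    region="us-central1",                # placeholder region
    temp_location="gs://my-bucket/tmp",  # placeholder bucket
    disk_size_gb=200,                    # request 200 GB of local disk per worker
)
```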
OK, so that seems to work in general, but it might be useful to somehow check whether the current worker has any storage attached. I am not sure this is possible in general, but it would certainly increase the usability of that feature.
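A check along these lines (my own sketch, not an existing helper in pangeo-forge-recipes) could gate the copy on the free space available on the worker's temp filesystem:

```python
# Sketch of a pre-flight disk check before copying a file to local storage.
import shutil
import tempfile


def local_disk_free_bytes() -> int:
    """Return free bytes on the filesystem backing the worker's temp directory."""
    return shutil.disk_usage(tempfile.gettempdir()).free


def has_room_for(file_size_bytes: int, safety_factor: float = 1.5) -> bool:
    # Leave headroom for decompression and other temp files on the same disk.
    return local_disk_free_bytes() > file_size_bytes * safety_factor
```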
Should this be an error rather than a warning? I did not see it anywhere in the logs on Dataflow. I guess this depends on where it is run.
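If failing loudly is the desired behaviour, one option (an assumption on my side, not current library behaviour) would be to promote that warning to an exception from the recipe module, so the Dataflow job errors out instead of only logging:

```python
import warnings

# Assumption: the opener emits a UserWarning whose message mentions copy_to_local.
warnings.filterwarnings(
    "error",
    message=".*copy_to_local.*",
    category=UserWarning,
)
```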
If the worker has no storage attached, this would fail on
OK, so I now ran another test. For the above test case, I am doing the following:
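Roughly, a recipe exercising this option looks like the sketch below; the pattern, URLs, store name, and chunking here are placeholder assumptions rather than the actual test case, and the transform names come from `pangeo_forge_recipes.transforms`:

```python
# Placeholder sketch of a recipe with copy_to_local enabled; not the actual test code.
import apache_beam as beam
from pangeo_forge_recipes.patterns import ConcatDim, FilePattern
from pangeo_forge_recipes.transforms import OpenURLWithFSSpec, OpenWithXarray, StoreToZarr


def make_url(time):
    # Placeholder URL template for the source files.
    return f"https://example.com/data/file_{time}.nc"


pattern = FilePattern(make_url, ConcatDim("time", list(range(10))))

recipe = (
    beam.Create(pattern.items())
    | OpenURLWithFSSpec()
    | OpenWithXarray(
        file_type=pattern.file_type,
        copy_to_local=True,  # copy each file to worker-local disk before opening
    )
    | StoreToZarr(
        store_name="test.zarr",          # placeholder store name
        combine_dims=pattern.combine_dim_keys,
        target_chunks={"time": 1},       # placeholder chunking
        # target_root is assumed to be injected at deploy time (e.g. by pangeo-forge-runner)
    )
)
```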
And I am getting these sorts of error traces. I am fairly confident I can exclude the workers OOMing: the memory usage is very low, and each worker's memory could hold the entire dataset (all files).
As suggested by @rabernat in the meeting earlier today, I ran a test of my (M)RE (see #715 and ) with `copy_to_local` set to `True`. This failed running on Google Dataflow with several errors similar to this (from the Dataflow workflow logs of this job):
So this is not a fix for the issue in #715 yet.
I am not entirely sure how I should go about debugging this further. Any suggestions welcome.