
RuntimeError - Please reduce your request rate #650

Open · norlandrhagen opened this issue Dec 4, 2024 · 5 comments


@norlandrhagen

Hi there 👋

Deep within pangeo-forge-recipes we're seeing this error crop up when writing to a GCS bucket. It seems to happen with multiple gcsfs versions (2024.10.0, 2024.09.0, etc.):

RuntimeError: gcsfs.retry.HttpError: The object <path>/chirps-global-daily.zarr/time/0 exceeded the rate limit for object mutation operations (create, update, and delete). Please reduce your request rate. See https://cloud.google.com/storage/docs/gcs429.,

I don't have an MRE, but was wondering if there are any gcsfs knobs for controlling the request rate when writing Zarr chunks?

Thanks in advance!

cc @jbusecke

@martindurant (Member)

gcsfs.retry has the code to decide what to do with various error states. Obviously, this one should be caught in the retryable errors list, which will result in retries with exponential backoff, just what you need.

There's no general way to coordinate the number of requests across processes, and it's the total rate on the bucket that counts. All you can do is limit the number of concurrent requests per batch, see fsspec.asyn._get_batch_size for the relevant config values.
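For orientation, here is a rough sketch of that retry-with-backoff pattern (illustrative only; the real logic lives in gcsfs.retry, and the retry count and sleep constants here are made up):

```python
# Sketch of "retry only transient errors, with exponential backoff" -
# not gcsfs's actual implementation.
import asyncio
import random


async def call_with_backoff(coro_fn, *args, retries=6, is_retriable=lambda exc: False):
    for attempt in range(retries):
        try:
            return await coro_fn(*args)
        except Exception as exc:
            # give up on the last attempt, or if the error isn't transient
            if attempt == retries - 1 or not is_retriable(exc):
                raise
            # back off exponentially (capped), with a little jitter
            await asyncio.sleep(min(1.7**attempt + random.random(), 30))
```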

@norlandrhagen (Author)

Thanks for the response @martindurant!

I haven't touched the gcsfs/fsspec internals, so please excuse some possibly obvious questions!

...this one should be caught in the retryable errors list, which will result in retries with exponential backoff, just what you need.

Given the error above (line 117, in validate_response: raise HttpError(error) -> RuntimeError: gcsfs.retry.HttpError: ...), would you suggest adding HttpError to the RETRIABLE_EXCEPTIONS list?

RETRIABLE_EXCEPTIONS = (...)  # defined in gcsfs.retry

There's no general way to coordinate the number of requests across processes, and it's the total rate on the bucket that counts. All you can do is limit the number of concurrent requests per batch, see fsspec.asyn._get_batch_size for the relevant config values.

Do you have any tips for setting this _get_batch_size for gcsfs/fsspec? It would be nice to pass it in via the pangeo-forge-recipes FSSpecTarget fsspec_kwargs. This is what I tried:

import fsspec 
import gcsfs 

fsspec.config.conf = {'gather_batch_size':17}

fs = fsspec.filesystem('gs')
fs.batch_size 
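For reference, fsspec's async filesystems also accept a batch_size argument at construction time (that is where the fs.batch_size attribute above comes from). A hedged sketch, assuming GCSFileSystem forwards the kwarg through to fsspec's AsyncFileSystem and that fsspec_kwargs ends up as filesystem constructor kwargs:

```python
# Cap per-batch concurrency on the filesystem instance itself.
# The value 17 is purely illustrative.
import fsspec

fs = fsspec.filesystem("gs", batch_size=17)
print(fs.batch_size)  # -> 17

# Hypothetically, the same dict could be passed as fsspec_kwargs / storage options:
# {"batch_size": 17}
```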

@martindurant (Member)

would you suggest adding HttpError to the RETRIABLE_EXCEPTIONS list

No, HttpError on its own is far too general - but we could test the specific HttpError to see whether it's a "slow down" (429) response. I see that code 429 is already listed in the status codes we retry, so more specifics on what the server actually sent would be good.
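For illustration, a hedged sketch of the kind of check meant here, not gcsfs's actual internals; it assumes the raised HttpError carries the HTTP status as a .code attribute (check what the exception really holds in your gcsfs version):

```python
# Retry only when the server signals a transient / slow-down condition.
from gcsfs.retry import HttpError

RETRIABLE_STATUS_CODES = {408, 429, 500, 502, 503, 504}  # illustrative set


def is_slow_down(exc: BaseException) -> bool:
    """Return True if the server asked us to back off (HTTP 429 or similar)."""
    return isinstance(exc, HttpError) and getattr(exc, "code", None) in RETRIABLE_STATUS_CODES
```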

@martindurant (Member)

fsspec.config.conf['gather_batch_size'] = 17

is probably how you want to phrase it, so that copies of the dict are updated too. You can also put this in config files (any "~/.config/fsspec/*.json").
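A hedged end-to-end sketch of both options (the value 17 and the file name batch.json are illustrative):

```python
# Option 1: mutate the existing config dict in-process, before building the fs.
import fsspec

fsspec.config.conf["gather_batch_size"] = 17  # cap concurrent calls per batch
fs = fsspec.filesystem("gs")

# Option 2: persist it in an fsspec config file, e.g. ~/.config/fsspec/batch.json
# (any *.json in that directory is merged into fsspec.config.conf on import):
#   {"gather_batch_size": 17}
```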

@martindurant (Member)

more specifics on what the server actually sent would be good.

Did you see this again? Is it possible to establish what the HTTP error looked like, to make sure we retry it correctly in the future?
