feat: retry and retry_async support streaming rpcs #495

Merged
merged 220 commits into from
Dec 12, 2023
Changes from 190 commits
953106a
got retryable generators partially working
daniel-sanche Feb 8, 2023
89aeb75
added retrun statement
daniel-sanche Feb 8, 2023
27feb80
refactoring
daniel-sanche Feb 8, 2023
0dffa6d
work for now deadline
daniel-sanche Feb 8, 2023
b330c3b
improved synchronous generator retry
daniel-sanche Feb 8, 2023
67ceaa2
handle closing and returns
daniel-sanche Feb 10, 2023
ee2647a
got test to pass
daniel-sanche Feb 10, 2023
5a5396c
restructured test
daniel-sanche Feb 10, 2023
7afa76b
added tests
daniel-sanche Feb 10, 2023
2d91ade
refactoring and comments in retry code
daniel-sanche Feb 10, 2023
0cd384e
fixed helper; added is_generator flag
daniel-sanche Feb 11, 2023
f72bbec
got first test working
daniel-sanche Feb 11, 2023
88eed5c
remove extra await in front of async generator
daniel-sanche Feb 11, 2023
91f9cc4
implemented async generator retry test
daniel-sanche Feb 11, 2023
c3eb997
fixed is_generator
daniel-sanche Feb 11, 2023
f6c6201
added tests for aclose and athrow
daniel-sanche Feb 11, 2023
57b0ee3
simplified close; don't support throws
daniel-sanche Feb 11, 2023
e814ce7
added tests
daniel-sanche Feb 11, 2023
0ffb03f
have test that throw should retry
daniel-sanche Feb 11, 2023
a8024f3
improved aclose and athrow
daniel-sanche Feb 11, 2023
c76f641
added comments
daniel-sanche Feb 11, 2023
ee631e3
close synchronous generator
daniel-sanche Feb 11, 2023
70eb78c
refactor async file
daniel-sanche Feb 11, 2023
42ee132
ran blacken
daniel-sanche Feb 11, 2023
102d83b
improved send test
daniel-sanche Feb 11, 2023
f029dbd
improved comments
daniel-sanche Feb 11, 2023
c83c62a
got send working
daniel-sanche Feb 11, 2023
185826c
tested deadline handling
daniel-sanche Feb 11, 2023
c5f7bbe
changed timeout to only count time awaiting or sleeping
daniel-sanche Feb 13, 2023
4242036
improved comments
daniel-sanche Feb 14, 2023
9c4799c
added test for cancellation
daniel-sanche Feb 14, 2023
0bd6cab
improved comments
daniel-sanche Feb 14, 2023
67aeeaf
on_error can yield into the generator stream
daniel-sanche Feb 14, 2023
985b13a
Merge branch 'main' into retry_generators
daniel-sanche Apr 3, 2023
0ea8297
added filter_func to retryable generator
daniel-sanche Apr 4, 2023
b952652
fixed error in time budget calculation
daniel-sanche Apr 6, 2023
6cb3e2d
added from field to raised timeout exception
daniel-sanche Apr 6, 2023
99da116
removed filter_fn
daniel-sanche Apr 6, 2023
7f862d0
ran blacken
daniel-sanche Apr 6, 2023
04a4a69
removed generator auto-detection
daniel-sanche Apr 7, 2023
d20cf08
fixed tests and lint
daniel-sanche Apr 7, 2023
183c221
changed comments
daniel-sanche Apr 7, 2023
d2217e4
fixed 3.11 failed test
daniel-sanche Apr 7, 2023
d4a9d30
added comments
daniel-sanche Apr 7, 2023
06d45cc
made streaming retries into a custom generator object
daniel-sanche Apr 13, 2023
de41a14
added tests for iterators
daniel-sanche Apr 13, 2023
dcb3766
added test for non-awaitable target
daniel-sanche Apr 13, 2023
dd368e4
changed is_generator to is_stream
daniel-sanche Apr 13, 2023
452b9bb
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Apr 13, 2023
6879418
changed docstrings
daniel-sanche Apr 13, 2023
847509f
removed back-ticks
daniel-sanche Apr 13, 2023
b5e3796
removed outdated comment
daniel-sanche Apr 13, 2023
7a7d9ac
changed comments
daniel-sanche Apr 14, 2023
6619895
moved streaming retries to new files
daniel-sanche Apr 14, 2023
27fc930
reverted some style changes
daniel-sanche Apr 14, 2023
d6a23ea
changed comments
daniel-sanche Apr 14, 2023
90ef834
added comments
daniel-sanche Apr 14, 2023
6201db6
refactoring and commenting
daniel-sanche Apr 14, 2023
61ce3a7
blacken/mypy fixes
daniel-sanche Apr 14, 2023
69149a1
fixed issue with py37
daniel-sanche Apr 14, 2023
d63871e
added tests for bad sleep generators
daniel-sanche Apr 14, 2023
773e033
improved test_retry coverage
daniel-sanche Apr 14, 2023
d1def5d
improved async test coverage
daniel-sanche Apr 14, 2023
cbaaa1d
added test for calling next on exhausted generator
daniel-sanche Apr 14, 2023
21a863f
fixed lint issue
daniel-sanche Apr 14, 2023
878ddfb
changed docstring
daniel-sanche Apr 14, 2023
7b0a600
changed docstrings
daniel-sanche Apr 14, 2023
0188228
updated comments
daniel-sanche Apr 14, 2023
902a4ab
updated comments
daniel-sanche Apr 14, 2023
74f3f3e
fixed send and asend retry logic
daniel-sanche Apr 14, 2023
e506aad
update test error string
daniel-sanche Apr 19, 2023
5baa2aa
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Apr 19, 2023
5c3805d
improved type hinting
daniel-sanche Apr 19, 2023
265d998
improved test docs
daniel-sanche Apr 19, 2023
0423ebe
fixed mypy issues
daniel-sanche Apr 20, 2023
c4049f5
Merge branch 'main' into retry_generators
daniel-sanche Apr 21, 2023
acd6546
remove wait_for in async streaming for perf reasons
daniel-sanche May 8, 2023
b1ad4b3
fixed style issues
daniel-sanche May 8, 2023
8dcf67c
fixed callable type annotation
daniel-sanche May 10, 2023
6104c59
change time calculations
daniel-sanche May 12, 2023
43d0913
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] May 12, 2023
9ba7676
simplified retry_streaming_async to use wall time instead of cpu time
daniel-sanche May 19, 2023
14c195c
Merge branch 'main' into retry_generators
daniel-sanche Jun 16, 2023
de7b51a
removed extra CancelledError handling
daniel-sanche Jun 17, 2023
4cdee6b
improved docstrings
daniel-sanche Jun 20, 2023
a526d65
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Jun 20, 2023
ee2bbdd
Merge branch 'main' into retry_generators
daniel-sanche Jul 17, 2023
5f82355
swapped out utcnow for more performant time.monotonic
daniel-sanche Jul 28, 2023
9900c40
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Jul 28, 2023
2c2dcbe
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Jul 28, 2023
3340399
Merge branch 'retry_generators' of https://github.com/googleapis/pyth…
gcf-owl-bot[bot] Jul 28, 2023
de07714
Merge branch 'main' into retry_generators
parthea Aug 7, 2023
67068ac
don't check timeout on each yield by default
daniel-sanche Aug 14, 2023
54325bc
added exception building logic
daniel-sanche Aug 15, 2023
bafa18b
added type hint to check_timeout_on_yield
daniel-sanche Aug 15, 2023
2ae2a32
simplified ensure_tareget; fixed mypy issues
daniel-sanche Aug 15, 2023
9cadd63
don't check timeout on each yield by default
daniel-sanche Aug 14, 2023
c9ef1d5
added exception building logic
daniel-sanche Aug 15, 2023
41c7868
added type hint to check_timeout_on_yield
daniel-sanche Aug 15, 2023
30fccb9
simplified ensure_tareget; fixed mypy issues
daniel-sanche Aug 15, 2023
a2b0e6c
remove iteration helper
daniel-sanche Aug 15, 2023
4aa1ab4
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Aug 15, 2023
8349424
added test coverage for send/asend
daniel-sanche Aug 15, 2023
ece5cf8
Merge branch 'retry_generators_2' into retry_generators
daniel-sanche Aug 15, 2023
5ddda24
added test for closing new generator
daniel-sanche Aug 15, 2023
9e3ea92
improved test decorators
daniel-sanche Aug 15, 2023
3b06b3a
swapped out generator object with generator function
daniel-sanche Aug 15, 2023
8bb6b0c
support iterators, along with generators
daniel-sanche Aug 15, 2023
37c64a0
got tests passing with new structure
daniel-sanche Aug 15, 2023
cee0028
replaces sync streaming retries object with generator function
daniel-sanche Aug 15, 2023
3a7e5fa
removed timeout on yield functionality
daniel-sanche Aug 15, 2023
ba6dc9f
fixed comments
daniel-sanche Aug 15, 2023
0500b8b
fixed mypy issues
daniel-sanche Aug 15, 2023
1ccadb1
fixed issue with py310
daniel-sanche Aug 15, 2023
c312262
renamed streaming retry function
daniel-sanche Aug 15, 2023
1fe57e0
removed unneeded functions
daniel-sanche Aug 15, 2023
4f09f29
simplified some test functions
daniel-sanche Aug 15, 2023
06824b9
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Aug 15, 2023
343157b
removed unneeded test variable
daniel-sanche Aug 15, 2023
93f82cc
improved documentation
daniel-sanche Aug 16, 2023
0915ca0
Merge branch 'main' into retry_generators
parthea Sep 1, 2023
61e5ab5
fixed type hinting issues
daniel-sanche Sep 1, 2023
51c125b
fixed undefined name issues
daniel-sanche Sep 1, 2023
02604bc
fixed lint issues
daniel-sanche Sep 1, 2023
6269db2
update comment
daniel-sanche Sep 1, 2023
0dcd0de
fix typo
daniel-sanche Sep 1, 2023
54e9c81
Update google/api_core/retry_streaming.py
daniel-sanche Sep 1, 2023
2342910
added comment to on_error
daniel-sanche Sep 1, 2023
eada0d7
fixed indentation
daniel-sanche Sep 1, 2023
ae2bf37
improved sample
daniel-sanche Sep 1, 2023
c8a4f26
improved default exception factory
daniel-sanche Sep 1, 2023
2840b9f
added pylint disable line
daniel-sanche Sep 1, 2023
82274a3
cleaned up async retry wrapping
daniel-sanche Sep 1, 2023
1594a17
improved sample
daniel-sanche Sep 1, 2023
9b0ddb0
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Sep 1, 2023
8985127
remove extra generator close line
daniel-sanche Sep 1, 2023
60b20ab
added missing test
daniel-sanche Sep 1, 2023
237ca3d
AsyncRetry adds a coroutine in front of async streams
daniel-sanche Sep 12, 2023
a46c0f7
improved type checking
daniel-sanche Sep 12, 2023
93727b7
Merge branch 'main' into retry_generators
daniel-sanche Sep 12, 2023
796ae52
fixed typing issues
daniel-sanche Sep 12, 2023
0688ffe
moved docstrings
daniel-sanche Sep 21, 2023
da048ab
use enum in exception builder
daniel-sanche Sep 21, 2023
80e5eb0
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Sep 21, 2023
562079b
fixed lint and docs issues
daniel-sanche Sep 21, 2023
a0fecc5
Merge branch 'main' into retry_generators
daniel-sanche Oct 3, 2023
8cc6ea9
Update tests/unit/test_retry.py
daniel-sanche Oct 6, 2023
e7a5cd4
fixed comment line break
daniel-sanche Oct 6, 2023
02c12cc
use kwargs map
daniel-sanche Oct 6, 2023
03b1608
fixed on_error docstrings
daniel-sanche Oct 6, 2023
b05b11f
renamed example lists
daniel-sanche Oct 6, 2023
0b5d3a2
removed ignore_sent
daniel-sanche Oct 6, 2023
03f2af5
fixed lint issues
daniel-sanche Oct 6, 2023
5fee888
fixed generator mock and added comments
daniel-sanche Oct 6, 2023
239ed7d
Merge branch 'main' into retry_generators
daniel-sanche Oct 6, 2023
94eb0f5
Merge branch 'main' into retry_generators
daniel-sanche Oct 17, 2023
7d1e246
Merge branch 'main' into retry_generators
parthea Nov 8, 2023
b0faa2d
Apply suggestions from code review
daniel-sanche Nov 9, 2023
6c44298
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Nov 9, 2023
51df672
Update google/api_core/retry.py
daniel-sanche Nov 9, 2023
e207376
removed unneeded comments
daniel-sanche Nov 9, 2023
39716a7
improved comments
daniel-sanche Nov 9, 2023
2bbf33f
simplified generator detection
daniel-sanche Nov 9, 2023
3b03bfa
renamed variables
daniel-sanche Nov 9, 2023
e63701d
improved comments
daniel-sanche Nov 9, 2023
c101ea6
renamed variable
daniel-sanche Nov 9, 2023
3642d74
fixed tests
daniel-sanche Nov 9, 2023
34cfa08
improved comments
daniel-sanche Nov 9, 2023
583181d
Merge branch 'main' into retry_generators
daniel-sanche Nov 17, 2023
b311b87
fixed retry factory functionality
daniel-sanche Nov 18, 2023
19a998d
created new objects for streaming retry config
daniel-sanche Nov 20, 2023
5637e88
added typing to base retry
daniel-sanche Nov 20, 2023
c4be5f2
share base retry logic
daniel-sanche Nov 20, 2023
4d9e762
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Nov 20, 2023
2e9e84b
lint and mypy cleanup
daniel-sanche Nov 20, 2023
d183a7e
removed unneeded changes
daniel-sanche Nov 21, 2023
e2d9c9c
pass in initial args and kwargs to retry_target_stream
daniel-sanche Nov 21, 2023
4543106
uncommented functools.wraps
daniel-sanche Nov 21, 2023
d791aad
Merge branch 'main' into retry_generators
daniel-sanche Nov 21, 2023
638cc68
change enum encoding
daniel-sanche Nov 30, 2023
f7b1e14
moved base retry into own file
daniel-sanche Nov 30, 2023
07db4c2
restructured files
daniel-sanche Nov 30, 2023
d448a52
expose other retry target functions in retry __init__
daniel-sanche Nov 30, 2023
781426a
share a logger
daniel-sanche Nov 30, 2023
4a05404
extracted shared error handling logic
daniel-sanche Dec 1, 2023
b221c8d
added type hints
daniel-sanche Dec 1, 2023
b5b4534
removed costly awaitable check
daniel-sanche Dec 1, 2023
0f1145d
revised docstring
daniel-sanche Dec 1, 2023
8408512
added exception_factory docstrings
daniel-sanche Dec 1, 2023
aa69c56
Revert "removed costly awaitable check"
daniel-sanche Dec 1, 2023
d1ac29d
renamed variable
daniel-sanche Dec 5, 2023
3ab88fc
update docstring
daniel-sanche Dec 5, 2023
382d0e2
add punctuation
daniel-sanche Dec 5, 2023
4258823
punctuation
daniel-sanche Dec 5, 2023
1bc9731
update docstrings
daniel-sanche Dec 5, 2023
aafe057
changed deadline to timeout
daniel-sanche Dec 5, 2023
8095229
updated deadlien to timeout in docstrings
daniel-sanche Dec 5, 2023
de9f518
update docstring
daniel-sanche Dec 5, 2023
7864667
update test comment
daniel-sanche Dec 5, 2023
4c24322
update docstrings
daniel-sanche Dec 5, 2023
7855513
removed unneeded comments
daniel-sanche Dec 5, 2023
f4bfb02
improved docstrings
daniel-sanche Dec 5, 2023
a88cf6f
use timeout in tests
daniel-sanche Dec 5, 2023
b5c62e1
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Dec 5, 2023
852f4f8
moved test to proper place
daniel-sanche Dec 5, 2023
cd8323e
added test comments; fixed asserts
daniel-sanche Dec 5, 2023
ace61eb
use _build_retry_error as default param
daniel-sanche Dec 5, 2023
1bbd1f0
ran blacken
daniel-sanche Dec 5, 2023
35cc00a
added comment to clarify timeouts
daniel-sanche Dec 5, 2023
89abfa4
removed timeout vs deadline explainer from retry_streaming
daniel-sanche Dec 5, 2023
74ab817
remove duplicated test
daniel-sanche Dec 8, 2023
85b3e02
fixed variable name
daniel-sanche Dec 8, 2023
6dbe17d
made build_retry_error public
daniel-sanche Dec 8, 2023
71e5888
changed docstring
daniel-sanche Dec 8, 2023
cbae3d3
import extra helper in retry_unary_async
daniel-sanche Dec 11, 2023
61198b8
Merge branch 'main' into retry_generators
vchudnov-g Dec 11, 2023
acf9752
fix: address backwards compatibility warnings failing presubmits
vchudnov-g Dec 12, 2023
7cf9fbf
fix: address mypy errors
vchudnov-g Dec 12, 2023
f62439a
fix: address coverage and lint issues failing presubmits
vchudnov-g Dec 12, 2023
b7abeca
chore: simplify resolution of backaward-compatibility warnings
vchudnov-g Dec 12, 2023
3 changes: 3 additions & 0 deletions google/api_core/__init__.py
@@ -20,3 +20,6 @@
from google.api_core import version as api_core_version

__version__ = api_core_version.__version__

# for backwards compatibility, expose async unary retries as google.api_core.retry_async
from .retry import retry_unary_async as retry_async # noqa: F401
45 changes: 45 additions & 0 deletions google/api_core/retry/__init__.py
@@ -0,0 +1,45 @@
# Copyright 2017 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Retry implementation for Google API client libraries."""

from .retry_base import exponential_sleep_generator
from .retry_base import if_exception_type
from .retry_base import if_transient_error
from .retry_base import _build_retry_error
from .retry_base import RetryFailureReason
from .retry_unary import Retry
from .retry_unary import retry_target
from .retry_unary_async import AsyncRetry
from .retry_unary_async import retry_target as retry_target_async
from .retry_streaming import StreamingRetry
from .retry_streaming import retry_target_stream
from .retry_streaming_async import AsyncStreamingRetry
from .retry_streaming_async import retry_target_stream as retry_target_stream_async

__all__ = (
    "exponential_sleep_generator",
    "if_exception_type",
    "if_transient_error",
    "_build_retry_error",
Contributor

If we're exposing this, should we remove the leading private underscore?

Contributor Author

good point, removed

    "RetryFailureReason",
    "Retry",
    "AsyncRetry",
    "StreamingRetry",
    "AsyncStreamingRetry",
    "retry_target",
    "retry_target_async",
    "retry_target_stream",
    "retry_target_stream_async",
)
346 changes: 346 additions & 0 deletions google/api_core/retry/retry_base.py
@@ -0,0 +1,346 @@
# Copyright 2017 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Shared classes and functions for retrying requests.

:class:`_BaseRetry` is the base class for :class:`Retry`,
:class:`AsyncRetry`, :class:`StreamingRetry`, and :class:`AsyncStreamingRetry`.
"""

from __future__ import annotations

import logging
import random
import time

from enum import Enum
from typing import Any, Callable, TYPE_CHECKING

import requests.exceptions

from google.api_core import exceptions
from google.auth import exceptions as auth_exceptions

if TYPE_CHECKING:
    import sys

    if sys.version_info >= (3, 11):
        from typing import Self
    else:
        from typing_extensions import Self
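A note on the `TYPE_CHECKING` guard above: because the module starts with `from __future__ import annotations`, return annotations such as `-> Self` are never evaluated at runtime, so `Self` only needs to be importable by type checkers. The following is a minimal standalone sketch of that idea. It uses a quoted annotation to get the same lazy behavior without the `__future__` import, and `Settings` and `ChildSettings` are hypothetical classes that are not part of this PR.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; this import never runs at runtime,
    # so typing_extensions does not have to be installed to execute the code.
    from typing_extensions import Self


class Settings:  # hypothetical class, for illustration only
    def __init__(self, value: int) -> None:
        self.value = value

    def with_value(self, value: int) -> "Self":
        # type(self) preserves the subclass, which is what Self expresses.
        return type(self)(value)


class ChildSettings(Settings):
    pass


updated = ChildSettings(1).with_value(2)
```

The returned object is a `ChildSettings`, not a plain `Settings`, which is the reason the diff annotates the `with_*` methods with `Self` rather than a fixed class name.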

_DEFAULT_INITIAL_DELAY = 1.0 # seconds
_DEFAULT_MAXIMUM_DELAY = 60.0 # seconds
_DEFAULT_DELAY_MULTIPLIER = 2.0
_DEFAULT_DEADLINE = 60.0 * 2.0 # seconds

_LOGGER = logging.getLogger("google.api_core.retry")


def if_exception_type(
    *exception_types: type[Exception],
) -> Callable[[Exception], bool]:
    """Creates a predicate to check if the exception is of a given type.

    Args:
        exception_types (Sequence[:func:`type`]): The exception types to check
            for.

    Returns:
        Callable[Exception]: A predicate that returns True if the provided
            exception is of the given type(s).
    """

    def if_exception_type_predicate(exception: Exception) -> bool:
        """Bound predicate for checking an exception type."""
        return isinstance(exception, exception_types)

    return if_exception_type_predicate
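To illustrate how the predicate factory above is meant to be used, here is a minimal standalone sketch. It re-declares `if_exception_type` with the same body as the diff so that it runs without `google-api-core` installed; `is_retryable` is a hypothetical name chosen for this example.

```python
from typing import Callable


def if_exception_type(*exception_types: type) -> Callable[[Exception], bool]:
    """Standalone copy of the factory from the diff, for illustration only."""

    def if_exception_type_predicate(exception: Exception) -> bool:
        # isinstance accepts a tuple of types, so subclasses match as well.
        return isinstance(exception, exception_types)

    return if_exception_type_predicate


# Hypothetical predicate: retry on connection problems and timeouts only.
is_retryable = if_exception_type(ConnectionError, TimeoutError)

results = [
    is_retryable(ConnectionResetError("peer reset")),  # subclass of ConnectionError
    is_retryable(TimeoutError("too slow")),
    is_retryable(ValueError("bad input")),
]
```

Because the check is an `isinstance` test against a tuple, passing a base class makes every subclass retryable too, which is worth keeping in mind when choosing the types.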


# pylint: disable=invalid-name
# Pylint sees this as a constant, but it is also an alias that should be
# considered a function.
if_transient_error = if_exception_type(
    exceptions.InternalServerError,
    exceptions.TooManyRequests,
    exceptions.ServiceUnavailable,
    requests.exceptions.ConnectionError,
    requests.exceptions.ChunkedEncodingError,
    auth_exceptions.TransportError,
)
"""A predicate that checks if an exception is a transient API error.
Contributor

doc: I'm fine with the function-type naming you have, but if Python treats this as a constant, we should make sure the documentation format is appropriate for that so it shows up in IDEs, etc. Is the triple-quoted form the right format for constants' comments? I only recall the hash-prefixed form. It seems triple-quoted strings get assigned to __doc__ when they are "the first statement in a module, function, class, or method definition" (https://peps.python.org/pep-0257/). Elsewhere, they may just get executed (i.e. printed out) when the interpreter encounters that line, which in this case would be on module load.

Contributor Author

This is not new code, I just moved the existing function to this new shared file

If we do changes here, I'd probably prefer to open an issue and address it separately, since it's not related to streaming retries. Let me know what you think

Contributor

OK, sure. Could you file that issue? Thanks!

Contributor Author

Opened here: #569


The following server errors are considered transient:

- :class:`google.api_core.exceptions.InternalServerError` - HTTP 500, gRPC
  ``INTERNAL(13)`` and its subclasses.
- :class:`google.api_core.exceptions.TooManyRequests` - HTTP 429
- :class:`google.api_core.exceptions.ServiceUnavailable` - HTTP 503
- :class:`requests.exceptions.ConnectionError`
- :class:`requests.exceptions.ChunkedEncodingError` - The server declared
  chunked encoding but sent an invalid chunk.
- :class:`google.auth.exceptions.TransportError` - Used to indicate an
  error occurred during an HTTP request.
"""
# pylint: enable=invalid-name
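On the review question above about the triple-quoted string that follows the `if_transient_error` assignment: as far as the Python language itself is concerned, a string literal used as a standalone statement is evaluated and discarded. It is not printed on import and it is not attached to `__doc__`. Some documentation tools (Sphinx autodoc, for example) do read such strings as attribute documentation. The sketch below shows only the runtime behavior, using hypothetical names (`make_predicate`, `is_value_error`).

```python
def make_predicate(*exception_types):
    """Hypothetical factory with the same shape as if_exception_type."""

    def predicate(exc):
        """Bound predicate for checking an exception type."""
        return isinstance(exc, exception_types)

    return predicate


is_value_error = make_predicate(ValueError)
"""This string follows an assignment; Python evaluates it and throws it away."""

# __doc__ still comes from the inner function, not from the string above.
doc = is_value_error.__doc__
```

So the trailing string is harmless at import time, but whether it shows up in generated docs or in an IDE depends on the tool rather than on the interpreter.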


def exponential_sleep_generator(
    initial: float, maximum: float, multiplier: float = _DEFAULT_DELAY_MULTIPLIER
):
    """Generates sleep intervals based on the exponential back-off algorithm.

    This implements the `Truncated Exponential Back-off`_ algorithm.

    .. _Truncated Exponential Back-off:
        https://cloud.google.com/storage/docs/exponential-backoff
Contributor

doc: This link only calls it "exponential backoff" (without "truncation" in the title). It links to the Wikipedia page, which does mention "truncated exponential backoff". However, what you're doing with the use of random is what I guess Wikipedia would call "truncated randomized exponential backoff". So I suggest:

  1. In the comment: call this "truncated randomized exponential backoff"
  2. Link to the Wikipedia article instead (or in addition to the current link).

Contributor Author

This is also existing code moved over. So I'd prefer to open an issue for this instead of addressing it here, if that's ok

FWIW, I've also heard this called "full jitter", and that seems to get a lot of google hits

Contributor

OK, sure. Could you file that issue? Thanks!


    Args:
        initial (float): The minimum amount of time to delay. This must
            be greater than 0.
        maximum (float): The maximum amount of time to delay.
        multiplier (float): The multiplier applied to the delay.

    Yields:
        float: successive sleep intervals.
    """
    delay = min(initial, maximum)
    while True:
        yield random.uniform(0.0, delay)
Contributor

tangent (NO-OP): Discussion for another day: I wonder whether it would be more useful to set the lower limit to some non-zero value, like delay/2. I'm guessing the problem has been studied and the answer is "no", but that's just a guess.

Contributor Author

Yeah this code pre-dates the PR, and it seems like it was written to align with this retryPolicy spec

        delay = min(delay * multiplier, maximum)
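To make the back-off schedule concrete, here is a minimal standalone sketch. `exponential_sleep_generator` is copied from the diff so the example runs without the library; `upper_bounds` is a hypothetical helper added only to show the cap that each jittered sleep is drawn under. The arguments match the module defaults shown earlier (initial 1.0 s, maximum 60.0 s, multiplier 2.0).

```python
import itertools
import random


def exponential_sleep_generator(initial, maximum, multiplier=2.0):
    """Standalone copy of the generator from the diff, for illustration only."""
    delay = min(initial, maximum)
    while True:
        # Each sleep is drawn uniformly between 0 and the current cap.
        yield random.uniform(0.0, delay)
        delay = min(delay * multiplier, maximum)


def upper_bounds(initial, maximum, multiplier, count):
    """Hypothetical helper: the cap on each successive sleep, without jitter."""
    bounds = []
    delay = min(initial, maximum)
    for _ in range(count):
        bounds.append(delay)
        delay = min(delay * multiplier, maximum)
    return bounds


caps = upper_bounds(1.0, 60.0, 2.0, 8)
# caps == [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]

sleeps = list(itertools.islice(exponential_sleep_generator(1.0, 60.0, 2.0), 8))
```

The cap doubles until it reaches `maximum` and then stays there (the "truncated" part), while each actual sleep can be anywhere from zero up to the cap (the randomized part discussed in the review thread).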


class RetryFailureReason(Enum):
    """
    The cause of a failed retry, used when building exceptions
    """

    TIMEOUT = 0
    NON_RETRYABLE_ERROR = 1


def _build_retry_error(
    exc_list: list[Exception],
    reason: RetryFailureReason,
    timeout_val: float | None,
    **kwargs: Any,
) -> tuple[Exception, Exception | None]:
    """
    Default exception_factory implementation. Builds an exception after the retry fails

    Args:
      - exc_list: list of exceptions that occurred during the retry
      - reason: reason for the retry failure.
            Can be TIMEOUT or NON_RETRYABLE_ERROR
      - timeout_val: the original timeout value for the retry, for use in the exception message

    Returns:
      - tuple: a tuple of the exception to be raised, and the cause exception if any
    """
    if reason == RetryFailureReason.TIMEOUT:
        # return RetryError with the most recent exception as the cause
        src_exc = exc_list[-1] if exc_list else None
        timeout_val_str = f"of {timeout_val:0.1f}s " if timeout_val is not None else ""
        return (
            exceptions.RetryError(
                f"Timeout {timeout_val_str}exceeded",
                src_exc,
            ),
            src_exc,
        )
    elif exc_list:
        # return most recent exception encountered
        return exc_list[-1], None
    else:
        # no exceptions were given in exc_list. Raise generic RetryError
        return exceptions.RetryError("Unknown error", None), None
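The three branches of the default exception factory above can be exercised in isolation. The sketch below mirrors the branching from the diff but is not the library code: `RetryError` is a hypothetical stand-in for `google.api_core.exceptions.RetryError`, and `build_retry_error` is a local copy so that the example runs without the package installed.

```python
from enum import Enum


class RetryFailureReason(Enum):
    TIMEOUT = 0
    NON_RETRYABLE_ERROR = 1


class RetryError(Exception):
    """Hypothetical stand-in for google.api_core.exceptions.RetryError."""

    def __init__(self, message, cause):
        super().__init__(message)
        self.cause = cause


def build_retry_error(exc_list, reason, timeout_val):
    """Same branching as the factory in the diff, using the stand-in RetryError."""
    if reason == RetryFailureReason.TIMEOUT:
        src_exc = exc_list[-1] if exc_list else None
        timeout_val_str = f"of {timeout_val:0.1f}s " if timeout_val is not None else ""
        return RetryError(f"Timeout {timeout_val_str}exceeded", src_exc), src_exc
    elif exc_list:
        return exc_list[-1], None
    else:
        return RetryError("Unknown error", None), None


errors = [ConnectionError("first"), ConnectionError("second")]

# Timeout: a RetryError wrapping the most recent failure.
timeout_exc, timeout_cause = build_retry_error(errors, RetryFailureReason.TIMEOUT, 5)

# Non-retryable: the most recent exception is returned unchanged, with no cause.
fatal_exc, fatal_cause = build_retry_error(
    errors, RetryFailureReason.NON_RETRYABLE_ERROR, 5
)

# The caller raises the pair with "raise ... from ...", which sets __cause__.
try:
    raise timeout_exc from timeout_cause
except RetryError as caught:
    chained = caught.__cause__
```

Returning a `(exception, cause)` tuple instead of raising lets the caller decide how to chain the two, which is what `raise final_exc from source_exc` does in the shared helper.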


def _retry_error_helper(
    exc: Exception,
    deadline: float | None,
    next_sleep: float,
    error_list: list[Exception],
    predicate_fn: Callable[[Exception], bool],
    on_error_fn: Callable[[Exception], None] | None,
    exc_factory_fn: Callable[
        [list[Exception], RetryFailureReason],
        tuple[Exception, Exception | None],
    ],
):
    """
    Shared logic for handling an error for all retry implementations

    - Raises an error on timeout or non-retryable error
    - Calls on_error_fn if provided
    - Logs the error

    Args:
       - exc: the exception that was raised
       - deadline: the deadline for the retry, calculated as a diff from time.monotonic()
       - next_sleep: the calculated next sleep interval
       - error_list: the list of exceptions that have been raised so far
       - predicate_fn: the predicate that was used to determine if the exception should be retried
       - on_error_fn: the callback that was called when the exception was raised
       - exc_factory_fn: the callback that was called to build the exception to be raised on terminal failure
Contributor

Suggested change
-       - predicate_fn: the predicate that was used to determine if the exception should be retried
-       - on_error_fn: the callback that was called when the exception was raised
-       - exc_factory_fn: the callback that was called to build the exception to be raised on terminal failure
+       - predicate_fn: the predicate to determine whether the operation that raised `exc` should be retried (a return value of `None` signals not to retry)
+       - on_error_fn: the callback to invoke on `exc` if `predicate_fn` deemed the operation retryable
+       - exc_factory_fn: the callback to build the exception to be raised on terminal failure

Contributor Author

I'm not sure about the (a return value of None signals not to retry) part. The type annotation just says to return bool. Technically it should work with any value based on its truthiness, but I'm not sure if we want to commit to supporting extra types in the docstrings

Contributor

Hmm, I see your point, but then we're not accurately describing how this function uses the predicate. If a non-True truthy value is returned, the function will retry. How about 'predicate that takes exc and returns "true" if the operation should be retried'? Notice I lower-cased true to imply that it's actually truthiness (too subtle? lol), and I said "retry the operation", not "retry the exception".

Contributor Author

Yeah that works for me, I changed the text

    """
    error_list.append(exc)
    if not predicate_fn(exc):
        final_exc, source_exc = exc_factory_fn(
            error_list,
            RetryFailureReason.NON_RETRYABLE_ERROR,
        )
        raise final_exc from source_exc
    if on_error_fn is not None:
        on_error_fn(exc)
    if deadline is not None and time.monotonic() + next_sleep > deadline:
        final_exc, source_exc = exc_factory_fn(
            error_list,
            RetryFailureReason.TIMEOUT,
        )
        raise final_exc from source_exc
    _LOGGER.debug(
        "Retrying due to {}, sleeping {:.1f}s ...".format(error_list[-1], next_sleep)
    )
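The decision order in the helper above (predicate, then `on_error`, then the deadline check) is easier to see inside a loop. The following is a simplified sketch and not the PR's implementation: `handle_error`, `call_with_retry`, `RetryTimeout`, and the `sleep_fn` parameter are all hypothetical names, and terminal errors are raised directly rather than being built by an `exc_factory_fn`.

```python
import time


class RetryTimeout(Exception):
    """Hypothetical terminal error used by this sketch."""


def handle_error(exc, deadline, next_sleep, error_list, predicate_fn, on_error_fn=None):
    """Simplified version of the helper's decision order (not the PR's code)."""
    error_list.append(exc)
    if not predicate_fn(exc):  # truthiness check: None or False means "give up"
        raise exc
    if on_error_fn is not None:
        on_error_fn(exc)
    # Give up before sleeping if the sleep would carry us past the deadline.
    if deadline is not None and time.monotonic() + next_sleep > deadline:
        raise RetryTimeout("deadline would be exceeded") from exc


def call_with_retry(target, predicate_fn, sleeps, timeout, sleep_fn=time.sleep):
    """Hypothetical retry loop that drives handle_error."""
    deadline = time.monotonic() + timeout if timeout is not None else None
    error_list = []
    for next_sleep in sleeps:
        try:
            return target()
        except Exception as exc:
            handle_error(exc, deadline, next_sleep, error_list, predicate_fn)
            sleep_fn(next_sleep)
    raise ValueError("sleep generator stopped yielding sleep values")


attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient")
    return "ok"


result = call_with_retry(
    flaky,
    predicate_fn=lambda exc: isinstance(exc, ConnectionError),
    sleeps=[0.0, 0.0, 0.0],
    timeout=10.0,
    sleep_fn=lambda seconds: None,  # skip real sleeping in the example
)


def always_fails():
    raise ConnectionError("still down")


try:
    call_with_retry(
        always_fails, lambda exc: True, sleeps=[5.0], timeout=1.0,
        sleep_fn=lambda seconds: None,
    )
    timed_out = False
except RetryTimeout:
    timed_out = True
```

Note that the deadline is compared against `time.monotonic() + next_sleep`, so the loop stops as soon as the next sleep would overrun the deadline instead of sleeping first and failing afterwards. The predicate is tested with `not`, which is the truthiness behavior discussed in the review thread.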


class _BaseRetry(object):
    """
    Base class for retry configuration objects. This class is intended to capture retry
    and backoff configuration that is common to both synchronous and asynchronous retries,
    for both unary and streaming RPCs. It is not intended to be instantiated directly,
    but rather to be subclassed by the various retry configuration classes.
    """

    def __init__(
        self,
        predicate: Callable[[Exception], bool] = if_transient_error,
        initial: float = _DEFAULT_INITIAL_DELAY,
        maximum: float = _DEFAULT_MAXIMUM_DELAY,
        multiplier: float = _DEFAULT_DELAY_MULTIPLIER,
        timeout: float = _DEFAULT_DEADLINE,
        on_error: Callable[[Exception], Any] | None = None,
        **kwargs: Any,
    ) -> None:
        self._predicate = predicate
        self._initial = initial
        self._multiplier = multiplier
        self._maximum = maximum
        self._timeout = kwargs.get("deadline", timeout)
        self._deadline = self._timeout
        self._on_error = on_error

    def __call__(self, *args, **kwargs) -> Any:
        raise NotImplementedError("Not implemented in base class")

    @property
    def deadline(self) -> float | None:
        """
        DEPRECATED: use ``timeout`` instead.  Refer to the ``Retry`` class
        documentation for details.
        """
        return self._timeout

    @property
    def timeout(self) -> float | None:
        return self._timeout

    def _replace(
        self,
        predicate: Callable[[Exception], bool] | None = None,
        initial: float | None = None,
        maximum: float | None = None,
        multiplier: float | None = None,
        timeout: float | None = None,
        on_error: Callable[[Exception], Any] | None = None,
    ) -> Self:
        return type(self)(
            predicate=predicate or self._predicate,
            initial=initial or self._initial,
            maximum=maximum or self._maximum,
            multiplier=multiplier or self._multiplier,
            timeout=timeout or self._timeout,
            on_error=on_error or self._on_error,
        )

    def with_deadline(self, deadline: float | None) -> Self:
        """Return a copy of this retry with the given timeout.

        DEPRECATED: use :meth:`with_timeout` instead. Refer to the ``Retry`` class
        documentation for details.

        Args:
            deadline (float): How long to keep retrying in seconds.

        Returns:
            Retry: A new retry instance with the given timeout.
        """
        return self._replace(timeout=deadline)

    def with_timeout(self, timeout: float) -> Self:
        """Return a copy of this retry with the given timeout.

        Args:
            timeout (float): How long to keep retrying, in seconds.

        Returns:
            Retry: A new retry instance with the given timeout.
        """
        return self._replace(timeout=timeout)

    def with_predicate(self, predicate: Callable[[Exception], bool]) -> Self:
        """Return a copy of this retry with the given predicate.

        Args:
            predicate (Callable[Exception]): A callable that should return
                ``True`` if the given exception is retryable.

        Returns:
            Retry: A new retry instance with the given predicate.
        """
        return self._replace(predicate=predicate)

    def with_delay(
        self,
        initial: float | None = None,
        maximum: float | None = None,
        multiplier: float | None = None,
    ) -> Self:
        """Return a copy of this retry with the given delay options.

        Args:
            initial (float): The minimum amount of time to delay. This must
                be greater than 0.
            maximum (float): The maximum amount of time to delay.
            multiplier (float): The multiplier applied to the delay.

        Returns:
            Retry: A new retry instance with the given predicate.
        """
        return self._replace(initial=initial, maximum=maximum, multiplier=multiplier)

    def __str__(self) -> str:
        return (
            "<{} predicate={}, initial={:.1f}, maximum={:.1f}, "
            "multiplier={:.1f}, timeout={}, on_error={}>".format(
                type(self).__name__,
                self._predicate,
                self._initial,
                self._maximum,
                self._multiplier,
                self._timeout,  # timeout can be None, thus no {:.1f}
                self._on_error,
            )
        )
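The `with_*` methods above all follow the same copy-on-write pattern: they never mutate the retry object, they build a new one through `_replace`. The sketch below is a hypothetical miniature (`RetrySettings` is not part of the PR) that keeps only three fields, to show two details of the code as written in this revision: the legacy `deadline` keyword overrides `timeout` in the constructor, and `value or self._value` falls back to the current value for any falsy argument.

```python
class RetrySettings:
    """Hypothetical miniature of the copy-on-write pattern used by _BaseRetry."""

    def __init__(self, initial=1.0, maximum=60.0, timeout=120.0, **kwargs):
        self._initial = initial
        self._maximum = maximum
        # A legacy "deadline" keyword, if given, takes precedence over timeout.
        self._timeout = kwargs.get("deadline", timeout)

    def _replace(self, initial=None, maximum=None, timeout=None):
        # `x or default` falls back for every falsy x, so None and 0 both
        # mean "keep the current value".
        return type(self)(
            initial=initial or self._initial,
            maximum=maximum or self._maximum,
            timeout=timeout or self._timeout,
        )

    def with_timeout(self, timeout):
        return self._replace(timeout=timeout)

    def with_delay(self, initial=None, maximum=None):
        return self._replace(initial=initial, maximum=maximum)


base = RetrySettings()
short = base.with_timeout(10.0)        # new object; base is untouched
legacy = RetrySettings(deadline=30.0)  # old keyword still sets the timeout
unchanged = base.with_timeout(None)    # None falls back to the existing 120.0
```

One consequence of the `or` fallback, at least in this simplified form, is that passing `None` to `with_timeout` or `with_deadline` keeps the previous timeout rather than removing it, even though the type hints allow `None`.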