
feat: use custom exponential sleep generator #860

Closed

Conversation

daniel-sanche
Contributor

The conformance tests are currently flaky because they expect sleeps between failed requests to always increase. By default, GAPIC libraries use a truncated exponential backoff algorithm, where each sleep is drawn using random.uniform(0, max_sleep), with max_sleep increasing exponentially. This means that later sleeps may end up shorter than the ones before them.

This PR addresses the test failure by using a custom backoff generator that resembles the default, but with the added property that the lower bound also increases, so that sleeps grow over time.
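As a rough sketch of the idea (the function name, signature, and defaults here are illustrative, not the PR's actual code): the lower bound ratchets up to at least the previous sleep, so draws never shrink, while the upper bound still grows exponentially up to a cap.

```python
import random


def bounded_exponential_sleeps(
    initial=0.01, maximum=60.0, multiplier=2.0, min_increase=0.0
):
    """Illustrative sketch: exponential backoff whose lower bound also
    ratchets upward, so each sleep is at least as long as the last."""
    lower = initial
    upper = initial
    while True:
        sleep = random.uniform(lower, upper)
        yield sleep
        # the next draw starts no lower than this sleep (plus min_increase)...
        lower = min(maximum, max(sleep + min_increase, lower))
        # ...while the upper bound grows exponentially, capped at maximum
        upper = min(maximum, max(upper * multiplier, lower))
```

With min_increase > 0 the yielded values are non-decreasing by construction, which is the property the conformance test expects.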

#859

@daniel-sanche daniel-sanche requested review from a team as code owners August 18, 2023 21:52
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. api: bigtable Issues related to the googleapis/python-bigtable API. labels Aug 18, 2023
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: xl Pull request size is extra large. labels Aug 18, 2023
@daniel-sanche daniel-sanche changed the base branch from v3 to experimental_v3 August 30, 2023 20:39
@@ -89,21 +90,21 @@ def __init__(
bt_exceptions._MutateRowsIncomplete,
)
# build retryable operation
-        retry = retries.AsyncRetry(
+        retry_wrapped = functools.partial(
+            retries.retry_target,

I have been looking at https://www.learnpython.org/en/Partial_functions to try to understand this a little better. I guess we are expecting this retries.retry_target to take the place of the retry function defined before?

Contributor Author

Yes, this will be a little complicated without context.

AsyncRetry is just a wrapper around retries.retry_target, but AsyncRetry doesn't expose the sleep_generator argument we need to customize here. So I'm replacing the AsyncRetry instance with the lower-level retries.retry_target.

The partial allows us to prepare the function's arguments without calling it yet. Later, we can call retry_wrapped with no arguments, and it will invoke retry_target with everything we passed into the partial here. This is useful for retries, where we may not always have all the context needed to rebuild the entire operation when launching a new attempt.
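A minimal illustration of the pattern (the retry_target signature below is a simplified stand-in for api_core's, not its real definition):

```python
import functools


def retry_target(target, predicate, sleep_generator, timeout):
    """Simplified stand-in for api_core's retries.retry_target."""
    return target()


# bind the arguments now; nothing is called yet
retry_wrapped = functools.partial(
    retry_target,
    target=lambda: "response",
    predicate=lambda exc: True,
    sleep_generator=iter([0.01, 0.02, 0.04]),
    timeout=60.0,
)

# later, invoking it with no arguments replays everything bound above
result = retry_wrapped()
```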

@danieljbruce danieljbruce left a comment

Looks good. Just a few nits/cleanup suggestions.

@@ -62,6 +63,51 @@ def _attempt_timeout_generator(
yield max(0, min(per_request_timeout, deadline - time.monotonic()))


def _exponential_sleep_generator(

If the min increase is 0 and the multiplier is 1 then this would loop forever right? Maybe an error message for this case might be useful.

Contributor Author

This is a generator function, so there is no risk of an infinite loop: execution suspends at the yield line and only resumes the next time next(this_generator) is called.

If the min increase is 0 and the multiplier is 1, it will keep yielding the initial value on every call to next, but it won't block.
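To illustrate why the infinite loop inside a generator doesn't hang (this toy constant_sleeps generator is hypothetical, not code from the PR):

```python
def constant_sleeps(initial=1.0):
    # an infinite loop, but execution pauses at yield after each value,
    # so creating the generator and pulling a few values never blocks
    while True:
        yield initial


gen = constant_sleeps()
first_three = [next(gen) for _ in range(3)]  # pulls exactly three values
```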

((1, 3, 10, 0.5), [1, 1.5, 2, 2.5, 3, 3]), # test with larger multiplier
((1, 25, 1.5, 5), [1, 6, 11, 16, 21, 25]), # test with larger min increase
((1, 5, 1, 0), [1, 1, 1, 1]), # test with multiplier of 1
((1, 5, 1, 1), [1, 2, 3, 4]), # test with min_increase with multiplier of 1

nit: test with min_increase 1 and multiplier of 1.

[
((), [0.01, 0.02, 0.03, 0.04, 0.05]), # test defaults
((1, 3, 2, 1), [1, 2, 3, 3, 3]), # test hitting limit
((1, 3, 2, 0.5), [1, 1.5, 2, 2.5, 3, 3]), # test with smaller min_increase

Why don't the tests seem to be using the multiplier?

Contributor Author

The next_sleep value is calculated using random.uniform(lower_bound, upper_bound). This test mocks the random function to always return the lower bound, to make testing easier (see line 101). The test below it mocks it to always return the upper bound, which is where you'll see the multiplier's effect.
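The mocking described above can be sketched like this (sample_sleep is a hypothetical stand-in for the generator's draw, not the test's actual helper):

```python
import random
from unittest import mock


def sample_sleep(lower_bound, upper_bound):
    # same shape as the generator's draw: random.uniform(lower, upper)
    return random.uniform(lower_bound, upper_bound)


# pin random.uniform to its lower bound: the multiplier has no visible effect
with mock.patch("random.uniform", side_effect=lambda lo, hi: lo):
    low = sample_sleep(2, 8)

# pin it to the upper bound instead: now the exponential growth shows up
with mock.patch("random.uniform", side_effect=lambda lo, hi: hi):
    high = sample_sleep(2, 8)
```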

@daniel-sanche
Contributor Author

Holding off on this for now: There's talk of modifying the flaky test, which would remove the need for this change: googleapis/cloud-bigtable-clients-test#115 (comment)

@liujiongxin

Holding off on this for now: There's talk of modifying the flaky test, which would remove the need for this change: googleapis/cloud-bigtable-clients-test#115 (comment)

It will be done to resolve the flakiness (today). Later, we can explore checking the jittering more reliably (e.g. doing more retries and observing the trend, etc.). It's not easy/quick, for sure.

@daniel-sanche
Contributor Author

It will be done to resolve the flakiness (today). Later, we can explore checking the jittering more reliably (e.g. doing more retries and observe the trend, etc. It's not easy/quick for sure)

Ok, it sounds like the jitter used by Python is acceptable behaviour, so I will close this PR rather than changing the logic here. Thanks!
