Message-processing design patterns #114

Draft: wants to merge 41 commits into main from message-processing-design-patterns

Conversation

@dandavison (Contributor) commented May 22, 2024

This PR contains proposals for how the Python SDK could help users defer update processing, control the interleaving of handler coroutines, and ensure that message processing is complete before the workflow completes.

update/job_runner_I1.py and update/job_runner_I2.py show how users can do this themselves, with minimal SDK changes.

update/job_runner_I1_native.py and update/job_runner_I2_native.py show how the SDK could be modified to make this easier and less error-prone for users.

@dandavison force-pushed the message-processing-design-patterns branch 4 times, most recently from 2885070 to d8f4468 on May 22, 2024 20:22

# contain multiple yield points.


class Queue(asyncio.Queue[tuple[Arg, asyncio.Future[Result]]]):

Member commented:

I probably wouldn't make a whole separate class for this, nor would I extend asyncio.Queue. A simple deque or list would be fine, I think (it may need some wait_conditions, but I think it's a bit simpler than this).
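
For concreteness, a minimal sketch of that suggestion (untested; the DequeBasedProcessor/submit names and the bare "some_activity" call are illustrative, and continue-as-new handling is omitted). The fuller alternatives later in this thread develop the same idea:

from collections import deque
from datetime import timedelta

from temporalio import workflow


@workflow.defn
class DequeBasedProcessor:
    def __init__(self) -> None:
        # A plain deque instead of an asyncio.Queue subclass
        self.pending: deque[str] = deque()

    @workflow.run
    async def run(self) -> None:
        while True:
            # workflow.wait_condition plays the role of Queue.get()
            await workflow.wait_condition(lambda: len(self.pending) > 0)
            arg = self.pending.popleft()
            await workflow.execute_activity(
                "some_activity", arg, start_to_close_timeout=timedelta(seconds=10)
            )

    @workflow.update
    async def submit(self, arg: str) -> None:
        self.pending.append(arg)

Here the update returns as soon as it has enqueued its argument; the comments below discuss making the update wait for its result instead.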

Comment on lines 52 to 55
arg, fut = await self.queue.get()
fut.set_result(await self.process_task(arg))

Member commented:

I think an await self.process_task(*self.queue.get()) that doesn't return until the update has returned is best. This will get rid of the footgun. Yes, this makes it two-phase: you have to tell the update its result, and you have to wait for the update to return its result.
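
A distilled sketch of that two-phase handshake (untested; names are illustrative, and the "Update must complete with result" alternative below is the fully worked version). The run loop first sets the update's result, then waits for the handler to confirm it has returned that result to the caller:

from collections import deque
from datetime import timedelta
from typing import Optional

from temporalio import workflow


class Task:
    def __init__(self, arg: str) -> None:
        self.arg = arg
        self.result: Optional[str] = None
        self.returned = False


@workflow.defn
class TwoPhaseProcessor:
    def __init__(self) -> None:
        self.queue: deque[Task] = deque()

    @workflow.run
    async def run(self) -> None:
        while True:
            await workflow.wait_condition(lambda: len(self.queue) > 0)
            task = self.queue.popleft()
            # Phase 1: tell the update its result
            task.result = await workflow.execute_activity(
                "some_activity", task.arg, start_to_close_timeout=timedelta(seconds=10)
            )
            # Phase 2: wait for the update handler to return that result
            await workflow.wait_condition(lambda: task.returned)

    @workflow.update
    async def do_task(self, arg: str) -> str:
        task = Task(arg)
        self.queue.append(task)
        try:
            await workflow.wait_condition(lambda: task.result is not None)
            assert task.result is not None
            return task.result
        finally:
            task.returned = True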

await workflow.wait_condition(lambda: hasattr(self, "queue"))
fut = asyncio.Future[Result]()
self.queue.put_nowait((arg, fut)) # Note: update validation gates enqueue
return await fut

Member commented:

This use case is a bit flawed. Do you care about updates failing on continue-as-new? If you do, then you wouldn't carry over queue items; if you don't, then you wouldn't need the asyncio.sleep(0). As it is right now, this fails updates that are still processing, but if your update's activity just so happened to finish in the same workflow task in which continue-as-new was suggested, then it doesn't fail the update.

@cretz (Member) commented May 22, 2024:

Here are three alternatives. They're untested (I just typed them up quickly), but the ideas are there.

Alternative 1: Same as original design (CAN w/ update failures)

from collections import deque
from dataclasses import dataclass, field
from datetime import timedelta
from typing import Optional

from temporalio import workflow

# !!!
# This is a copy of the original design, but it is flawed because it fails
# updates to the caller on CAN
# !!!


@dataclass
class MessageProcessorInput:
    pending_tasks: list[str] = field(default_factory=list)


class UpdateTask:
    def __init__(self, arg: str) -> None:
        self.arg = arg
        self.result: Optional[str] = None
        self.returned = False


@workflow.defn
class MessageProcessor:
    def __init__(self) -> None:
        self.queue: deque[UpdateTask] = deque()

    @workflow.run
    async def run(self, input: MessageProcessorInput) -> None:
        # Startup: re-enqueue tasks carried over from the previous run
        # (extend preserves their original order)
        self.queue.extend(UpdateTask(arg) for arg in input.pending_tasks)

        # Process until CAN is needed
        while not workflow.info().is_continue_as_new_suggested():
            await workflow.wait_condition(lambda: len(self.queue) > 0)
            await self.process_task(self.queue.popleft())

        # CAN knowing that pending updates will fail
        workflow.continue_as_new(
            MessageProcessorInput(pending_tasks=[task.arg for task in self.queue])
        )

    @workflow.update
    async def do_task(self, arg: str) -> str:
        # Add task and wait on result
        task = UpdateTask(arg)
        try:
            self.queue.append(task)
            await workflow.wait_condition(lambda: task.result is not None)
            assert task.result is not None
            return task.result
        finally:
            task.returned = True

    async def process_task(self, task: UpdateTask) -> None:
        task.result = await workflow.execute_activity(
            "some_activity", task.arg, start_to_close_timeout=timedelta(seconds=10)
        )
        await workflow.wait_condition(lambda: task.returned)

Alternative 2: Update doesn't wait

from collections import deque
from datetime import timedelta

from temporalio import workflow

# !!!
# This version does not make update wait on completion
# !!!


@workflow.defn
class MessageProcessor:
    def __init__(self) -> None:
        self.queue: deque[str] = deque()

    @workflow.run
    async def run(self, queue: list[str]) -> None:
        # Take a plain list as workflow input so it round-trips cleanly through
        # the data converter; extend preserves the original order
        self.queue.extend(queue)
        # Process until CAN is needed
        while not workflow.info().is_continue_as_new_suggested():
            await workflow.wait_condition(lambda: len(self.queue) > 0)
            await self.process_task(self.queue.popleft())

        workflow.continue_as_new(list(self.queue))

    @workflow.update
    async def do_task(self, arg: str) -> None:
        # Put on queue and complete update
        self.queue.append(arg)

    async def process_task(self, arg: str) -> None:
        await workflow.execute_activity(
            "some_activity", arg, start_to_close_timeout=timedelta(seconds=10)
        )

Alternative 3: Update must complete with result

from collections import deque
from datetime import timedelta
from typing import Optional

from temporalio import workflow

# !!!
# This version requires update to complete with result and won't CAN until after
# everything is done
# !!!


class UpdateTask:
    def __init__(self, arg: str) -> None:
        self.arg = arg
        self.result: Optional[str] = None
        self.returned = False


@workflow.defn
class MessageProcessor:
    def __init__(self) -> None:
        self.queue: deque[UpdateTask] = deque()

    @workflow.run
    async def run(self) -> None:
        # Process until CAN is needed and the queue is empty
        while not workflow.info().is_continue_as_new_suggested() or len(self.queue) > 0:
            await workflow.wait_condition(lambda: len(self.queue) > 0)
            await self.process_task(self.queue.popleft())

        # CAN knowing the queue is empty
        workflow.continue_as_new()

    @workflow.update
    async def do_task(self, arg: str) -> str:
        # Add task and wait on result
        task = UpdateTask(arg)
        try:
            self.queue.append(task)
            await workflow.wait_condition(lambda: task.result is not None)
            assert task.result is not None
            return task.result
        finally:
            task.returned = True

    async def process_task(self, task: UpdateTask) -> None:
        task.result = await workflow.execute_activity(
            "some_activity", task.arg, start_to_close_timeout=timedelta(seconds=10)
        )
        await workflow.wait_condition(lambda: task.returned)

@antlai-temporal commented May 22, 2024:

In the third case, "Update must complete with result", how do we guarantee that CAN eventually happens if updates keep coming...? We probably need to reject new requests before enqueueing them (with a validator?) while we are trying to drain. Otherwise history exceeds the threshold, the workflow fails, the pending updates fail, and we are back to the problem of the first case...

Member commented:

> In the third case, "Update must complete with result", how do we guarantee that CAN eventually happens if updates keep coming...?

You can't. You'd only choose this use case if you were guaranteed a period of idleness between tasks. You can definitely have a version that rejects updates once is_continue_as_new_suggested() returns true, if you wanted.
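
Here is a sketch of that rejecting variant (untested; it just adds a validator on top of the third alternative, with illustrative naming). The validator raises once continue-as-new has been suggested, so no new work is enqueued while the queue drains:

from collections import deque
from datetime import timedelta
from typing import Optional

from temporalio import workflow
from temporalio.exceptions import ApplicationError


class UpdateTask:
    def __init__(self, arg: str) -> None:
        self.arg = arg
        self.result: Optional[str] = None
        self.returned = False


@workflow.defn
class DrainingMessageProcessor:
    def __init__(self) -> None:
        self.queue: deque[UpdateTask] = deque()

    @workflow.run
    async def run(self) -> None:
        # Process until CAN is suggested and the queue has drained
        while not workflow.info().is_continue_as_new_suggested() or len(self.queue) > 0:
            await workflow.wait_condition(lambda: len(self.queue) > 0)
            await self.process_task(self.queue.popleft())

        # CAN knowing the queue is empty and no new updates will be accepted
        workflow.continue_as_new()

    @workflow.update
    async def do_task(self, arg: str) -> str:
        # Add task and wait on result
        task = UpdateTask(arg)
        try:
            self.queue.append(task)
            await workflow.wait_condition(lambda: task.result is not None)
            assert task.result is not None
            return task.result
        finally:
            task.returned = True

    @do_task.validator
    def reject_while_draining(self, arg: str) -> None:
        # A raising validator rejects the update before it is accepted, so it
        # never reaches the queue (or the workflow history)
        if workflow.info().is_continue_as_new_suggested():
            raise ApplicationError(
                "Draining for continue-as-new; retry against the next run"
            )

    async def process_task(self, task: UpdateTask) -> None:
        task.result = await workflow.execute_activity(
            "some_activity", task.arg, start_to_close_timeout=timedelta(seconds=10)
        )
        await workflow.wait_condition(lambda: task.returned)

Callers whose updates are rejected get the error immediately and can retry against the post-CAN run.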

@dandavison changed the title from "Message processing design patterns" to "Message-processing design patterns" on May 23, 2024
@dandavison force-pushed the message-processing-design-patterns branch 7 times, most recently from f0fd183 to fb92159 on May 29, 2024 16:46
@dandavison force-pushed the message-processing-design-patterns branch from 8d50d9c to b107e4d on May 29, 2024 21:36
@dandavison force-pushed the message-processing-design-patterns branch from b107e4d to be50780 on May 30, 2024 13:40
@dandavison force-pushed the message-processing-design-patterns branch from be50780 to 0f335b4 on May 30, 2024 15:29