added retry policy param to dbt assets decorator #18990

askvinni · 2024-01-03T12:50:52Z

Summary & Motivation

Noticed the parameter for retry policies is currently missing from the dbt assets decorator.

How I Tested These Changes

Test suite, added test for the new attribute.

rexledesma

Requesting changes for your feedback: I'm wondering if this is the right abstraction we should be exposing to our users to accomplish a "retry" in dbt.

To my knowledge, there are three cases in which a retry could be triggered:

1. Syntax error
2. Business logic error (e.g. a test assertion is failing)
3. Connection flakiness

An existing Dagster retry policy accommodates (3), at the expense of (1) and (2). The entire materialization function will be run again, only for the user to encounter (1) and (2) again. (3) is accommodated, but ideally, the materialization function should only run from the point of failure (the flaky dbt model/test execution). Otherwise, a retry could potentially be incredibly expensive.

With the emergence of dbt retry (link), I think this retry is better served if users handle the retry on their own, in their decorated function. We should add documentation on how to accomplish this. This retry occurs from the point of failure, which alleviates the concerns about using the built-in Dagster retry policy.
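To make "from the point of failure" concrete: dbt records each node's outcome in run_results.json, so a retry only needs the nodes that did not finish successfully. Below is a minimal sketch of that selection. It assumes the artifact's `results` entries carry `unique_id` and `status` fields and that error, fail, runtime error, and skipped are the statuses worth re-running. The helper name `nodes_to_retry` and the sample IDs are hypothetical; this is an illustration, not dbt's implementation.

```python
import json

# Statuses assumed to mark a node as "not done yet" for the purposes of a retry.
RETRYABLE_STATUSES = {"error", "fail", "runtime error", "skipped"}


def nodes_to_retry(run_results: dict) -> list[str]:
    """Return the unique_ids of nodes that a retry would need to run again."""
    return [
        result["unique_id"]
        for result in run_results.get("results", [])
        if result.get("status") in RETRYABLE_STATUSES
    ]


# Hypothetical, heavily trimmed run_results.json content.
sample = json.loads(
    """
    {"results": [
        {"unique_id": "model.jaffle_shop.stg_orders", "status": "success"},
        {"unique_id": "test.jaffle_shop.accepted_values_stg_orders_status", "status": "fail"},
        {"unique_id": "model.jaffle_shop.orders", "status": "skipped"}
    ]}
    """
)
print(nodes_to_retry(sample))
```

The successful staging model is left out of the list, which is why a retry can be much cheaper than re-running the whole materialization function.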

askvinni · 2024-01-03T18:35:48Z

Hey Rex, thanks for the comment. I was actually not aware of the dbt retry command. The purpose of this PR really was to solve flaky connections that we've been seeing recently. I agree that retrying with the command should be left to the users, so I'll just close this.

rexledesma · 2024-01-03T18:39:37Z

@askvinni Great, I'll add some documentation on the dbt retry capability in Dagster.

rexledesma · 2024-01-04T14:44:45Z

Important

You'll need to be on dbt-core>=1.7.9, which adds --target-path support to dbt retry: https://github.com/dbt-labs/dbt-core/releases/tag/v1.7.9.
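A quick way to guard against running the snippet on an older dbt-core is to compare version strings before relying on dbt retry. The sketch below is only an illustration: the helper names are hypothetical, pre-release suffixes are ignored, and the installed version string would have to come from somewhere such as `importlib.metadata.version("dbt-core")`.

```python
import re


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '1.7.9' (or '1.8.0b1') into a comparable tuple of integers."""
    parts = []
    for component in version.split(".")[:3]:
        # Keep only the leading digits, so '0b1' counts as 0.
        match = re.match(r"\d+", component)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def supports_retry_target_path(installed: str, minimum: str = "1.7.9") -> bool:
    """True if the installed dbt-core version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


print(supports_retry_target_path("1.7.4"))
print(supports_retry_target_path("1.7.9"))
```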

As a breadcrumb for anyone who sees this pull request, I'm providing a small code snippet to add dbt retry logic in the existing decorator.

If the dbt command fails, we issue a dbt retry on exception. This dbt retry takes parameters from the previously failed command (e.g. manifest, dagster_dbt_translator, and target_path) so that the retry can use the previous command's dbt artifacts to execute properly.

from dataclasses import replace

from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets


@dbt_assets(manifest=dbt_manifest_path)
def jaffle_shop_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    dbt_invocation = dbt.cli(["build"], context=context)
    try:
        yield from dbt_invocation.stream()
    except:
        dbt_retry_invocation = dbt.cli(
            ["retry"],
            manifest=dbt_invocation.manifest,
            dagster_dbt_translator=dbt_invocation.dagster_dbt_translator,
            target_path=dbt_invocation.target_path,
        )
        dbt_retry_invocation = replace(dbt_retry_invocation, context=context)
        
        yield from dbt_retry_invocation.stream()

askvinni · 2024-01-10T12:58:26Z

Hey @rexledesma, I finally got around to trying this and sadly, it seems like the dbt retry command doesn't respect the target-path flag set. It's a known issue. Just wanted to flag this here in case someone else comes across the same thing.

rexledesma · 2024-01-10T14:55:56Z

@askvinni Did you try the code snippet that I provided above? It doesn't use the --target-path argument, but instead sets the DBT_TARGET_PATH env var programmatically.

It works on my machine:

❯ dbt --version
Core:
  - installed: 1.7.4
  - latest:    1.7.4 - Up to date!

Plugins:
  - duckdb:    1.7.0 - Up to date!

Running the following commands on a modified jaffle_shop with a failing test produces the expected retry: the retry starts from the failed test, which fails again.

jaffle_shop on  main [!?] 🐍 (dagster) took 3s
❯ DBT_TARGET_PATH=target/new-path dbt build
14:52:07  Running with dbt=1.7.4
14:52:08  Registered adapter: duckdb=1.6.0
14:52:08  Unable to do partial parsing because saved manifest not found. Starting full parse.
14:52:09  Found 5 models, 3 seeds, 20 tests, 0 sources, 0 exposures, 0 metrics, 391 macros, 0 groups, 0 semantic models
14:52:09
14:52:09  Concurrency: 24 threads (target='dev')
14:52:09
14:52:09  1 of 28 START seed file main.raw_customers ..................................... [RUN]
14:52:09  2 of 28 START seed file main.raw_orders ........................................ [RUN]
14:52:09  3 of 28 START seed file main.raw_payments ...................................... [RUN]
14:52:09  3 of 28 OK loaded seed file main.raw_payments .................................. [INSERT 113 in 0.06s]
14:52:09  2 of 28 OK loaded seed file main.raw_orders .................................... [INSERT 99 in 0.07s]
14:52:09  1 of 28 OK loaded seed file main.raw_customers ................................. [INSERT 100 in 0.07s]
14:52:09  4 of 28 START sql view model main.stg_payments ................................. [RUN]
14:52:09  5 of 28 START sql view model main.stg_orders ................................... [RUN]
14:52:09  6 of 28 START sql view model main.stg_customers ................................ [RUN]
14:52:09  5 of 28 OK created sql view model main.stg_orders .............................. [OK in 0.07s]
14:52:09  7 of 28 START test accepted_values_stg_orders_status__nope ..................... [RUN]
14:52:09  8 of 28 START test not_null_stg_orders_order_id ................................ [RUN]
14:52:09  6 of 28 OK created sql view model main.stg_customers ........................... [OK in 0.07s]
14:52:09  9 of 28 START test unique_stg_orders_order_id .................................. [RUN]
14:52:09  4 of 28 OK created sql view model main.stg_payments ............................ [OK in 0.08s]
14:52:09  10 of 28 START test not_null_stg_customers_customer_id ......................... [RUN]
14:52:09  11 of 28 START test unique_stg_customers_customer_id ........................... [RUN]
14:52:09  12 of 28 START test accepted_values_stg_payments_payment_method__credit_card__coupon__bank_transfer__gift_card  [RUN]
14:52:09  13 of 28 START test not_null_stg_payments_payment_id ........................... [RUN]
14:52:09  14 of 28 START test unique_stg_payments_payment_id ............................. [RUN]
14:52:09  8 of 28 PASS not_null_stg_orders_order_id ...................................... [PASS in 0.08s]
14:52:09  7 of 28 FAIL 5 accepted_values_stg_orders_status__nope ......................... [FAIL 5 in 0.08s]
14:52:09  9 of 28 PASS unique_stg_orders_order_id ........................................ [PASS in 0.08s]
14:52:09  11 of 28 PASS unique_stg_customers_customer_id ................................. [PASS in 0.07s]
14:52:09  10 of 28 PASS not_null_stg_customers_customer_id ............................... [PASS in 0.07s]
14:52:09  13 of 28 PASS not_null_stg_payments_payment_id ................................. [PASS in 0.07s]
14:52:09  12 of 28 PASS accepted_values_stg_payments_payment_method__credit_card__coupon__bank_transfer__gift_card  [PASS in 0.07s]
14:52:09  14 of 28 PASS unique_stg_payments_payment_id ................................... [PASS in 0.07s]
14:52:09  15 of 28 SKIP relation main.customers .......................................... [SKIP]
14:52:09  16 of 28 SKIP relation main.orders ............................................. [SKIP]
14:52:09  17 of 28 SKIP test not_null_customers_customer_id .............................. [SKIP]
14:52:09  18 of 28 SKIP test unique_customers_customer_id ................................ [SKIP]
14:52:09  19 of 28 SKIP test accepted_values_orders_status__placed__shipped__completed__return_pending__returned  [SKIP]
14:52:09  20 of 28 SKIP test not_null_orders_amount ...................................... [SKIP]
14:52:09  21 of 28 SKIP test not_null_orders_bank_transfer_amount ........................ [SKIP]
14:52:09  22 of 28 SKIP test not_null_orders_coupon_amount ............................... [SKIP]
14:52:09  23 of 28 SKIP test not_null_orders_credit_card_amount .......................... [SKIP]
14:52:09  24 of 28 SKIP test not_null_orders_customer_id ................................. [SKIP]
14:52:09  25 of 28 SKIP test not_null_orders_gift_card_amount ............................ [SKIP]
14:52:09  26 of 28 SKIP test not_null_orders_order_id .................................... [SKIP]
14:52:09  27 of 28 SKIP test relationships_orders_customer_id__customer_id__ref_customers_  [SKIP]
14:52:09  28 of 28 SKIP test unique_orders_order_id ...................................... [SKIP]
14:52:09
14:52:09  Finished running 3 seeds, 3 view models, 20 tests, 2 table models in 0 hours 0 minutes and 0.31 seconds (0.31s).
14:52:09
14:52:09  Completed with 1 error and 0 warnings:
14:52:09
14:52:09  Failure in test accepted_values_stg_orders_status__nope (models/staging/schema.yml)
14:52:09    Got 5 results, configured to fail if != 0
14:52:09
14:52:09    compiled Code at target/new-path/compiled/jaffle_shop/models/staging/schema.yml/accepted_values_stg_orders_status__nope.sql
14:52:09
14:52:09  Done. PASS=13 WARN=0 ERROR=1 SKIP=14 TOTAL=28

jaffle_shop on  main [!?] 🐍 (dagster) took 3s
❯ DBT_TARGET_PATH=target/new-path dbt retry
14:52:15  Running with dbt=1.7.4
14:52:15  Registered adapter: duckdb=1.6.0
14:52:15  Warning: The state and target directories are the same: 'target'. This could lead to missing changes due to overwritten state including non-idempotent retries.
14:52:15  Found 5 models, 3 seeds, 20 tests, 0 sources, 0 exposures, 0 metrics, 391 macros, 0 groups, 0 semantic models
14:52:15
14:52:15  Concurrency: 24 threads (target='dev')
14:52:15
14:52:15  1 of 15 START test accepted_values_stg_orders_status__nope ..................... [RUN]
14:52:15  1 of 15 FAIL 5 accepted_values_stg_orders_status__nope ......................... [FAIL 5 in 0.03s]
14:52:15  2 of 15 SKIP relation main.customers ........................................... [SKIP]
14:52:15  3 of 15 SKIP relation main.orders .............................................. [SKIP]
14:52:15  4 of 15 SKIP test not_null_customers_customer_id ............................... [SKIP]
14:52:15  5 of 15 SKIP test unique_customers_customer_id ................................. [SKIP]
14:52:15  6 of 15 SKIP test accepted_values_orders_status__placed__shipped__completed__return_pending__returned  [SKIP]
14:52:15  7 of 15 SKIP test not_null_orders_amount ....................................... [SKIP]
14:52:15  8 of 15 SKIP test not_null_orders_bank_transfer_amount ......................... [SKIP]
14:52:15  9 of 15 SKIP test not_null_orders_coupon_amount ................................ [SKIP]
14:52:15  10 of 15 SKIP test not_null_orders_credit_card_amount .......................... [SKIP]
14:52:15  11 of 15 SKIP test not_null_orders_customer_id ................................. [SKIP]
14:52:15  12 of 15 SKIP test not_null_orders_gift_card_amount ............................ [SKIP]
14:52:15  13 of 15 SKIP test not_null_orders_order_id .................................... [SKIP]
14:52:15  14 of 15 SKIP test relationships_orders_customer_id__customer_id__ref_customers_  [SKIP]
14:52:15  15 of 15 SKIP test unique_orders_order_id ...................................... [SKIP]
14:52:15
14:52:15  Finished running 13 tests, 2 table models in 0 hours 0 minutes and 0.09 seconds (0.09s).
14:52:15
14:52:15  Completed with 1 error and 0 warnings:
14:52:15
14:52:15  Failure in test accepted_values_stg_orders_status__nope (models/staging/schema.yml)
14:52:15    Got 5 results, configured to fail if != 0
14:52:15
14:52:15    compiled Code at target/compiled/jaffle_shop/models/staging/schema.yml/accepted_values_stg_orders_status__nope.sql
14:52:15
14:52:15  Done. PASS=0 WARN=0 ERROR=1 SKIP=14 TOTAL=15
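The two summary lines above can be compared programmatically to confirm that the retry only covered the nodes that errored or were skipped in the build. This is a small sketch using a hypothetical `parse_summary` helper and the two "Done." lines copied from the logs:

```python
import re


def parse_summary(line: str) -> dict[str, int]:
    """Extract the KEY=count pairs from a dbt 'Done.' summary line."""
    return {key: int(value) for key, value in re.findall(r"([A-Z]+)=(\d+)", line)}


build = parse_summary("14:52:09  Done. PASS=13 WARN=0 ERROR=1 SKIP=14 TOTAL=28")
retry = parse_summary("14:52:15  Done. PASS=0 WARN=0 ERROR=1 SKIP=14 TOTAL=15")

# The retry covers exactly the nodes that errored or were skipped in the build.
print(retry["TOTAL"] == build["ERROR"] + build["SKIP"])
```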

askvinni · 2024-01-12T10:59:16Z

@rexledesma this might be something that's fixed in 1.7.*, and we're running 1.6.9. Either way, my solution was to copy the run_results.json into the target/ folder, that way the retry did manage to read it. It's odd behavior and the workaround isn't exactly pretty, but it's good enough.

Baksbany22 · 2024-01-17T14:53:00Z

@askvinni Can you give a more detailed description of how you solved this problem?

askvinni · 2024-01-17T20:47:46Z

@Baksbany22 dbt has a target folder where it generates its manifest.json and other files, usually called target/ and located next to your dbt_project.yml file. Dagster's default behavior is to create a subfolder within the target/ folder with a unique ID (e.g. target/my_dbt_assets-12345) for each CLI invocation. That's the folder the run_results.json, dbt.log, and other files are created in. My workaround is essentially to copy the target/my_dbt_assets-12345/run_results.json file into target/run_results.json, so that the dbt retry command can find the file and execute properly.
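The copy step described above can be sketched as a small helper. The function name `copy_run_results` is hypothetical, and the folder names in the demo are placeholders created in a temporary directory rather than a real dbt project.

```python
import shutil
import tempfile
from pathlib import Path


def copy_run_results(invocation_target: Path, project_target: Path) -> Path:
    """Copy run_results.json from a per-invocation folder into the project target folder."""
    source = invocation_target / "run_results.json"
    if not source.is_file():
        raise FileNotFoundError(f"{source} does not exist")
    project_target.mkdir(parents=True, exist_ok=True)
    # copy2 keeps file metadata such as the modification time.
    return Path(shutil.copy2(source, project_target / "run_results.json"))


# Demo with a throwaway directory standing in for the dbt project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    invocation_dir = root / "target" / "my_dbt_assets-12345"
    invocation_dir.mkdir(parents=True)
    (invocation_dir / "run_results.json").write_text('{"results": []}')
    copied = copy_run_results(invocation_dir, root / "target")
    copied_text = copied.read_text()

print(copied.name)
```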

toddy86 · 2024-02-14T19:15:40Z

@rexledesma I have tried this as well with the same result as @askvinni

rexledesma · 2024-02-14T19:57:36Z

Are you on dbt-core==1.6.*? Is it possible for you to upgrade to dbt-core==1.7.*?

If not, could you try Vinni's workaround?

Baksbany22 · 2024-02-15T11:17:00Z

@toddy86 There are two ways to solve this problem:

1. When calling dbt.cli, specify target_path. Then each launch of dbt models will not create a new directory: command = dbt.cli(["build"], context=context, target_path=dbt_target_path)
2. The second solution is shown above: copy run_results.json from the created directory to the "target" directory. It works fine on my local machine, but on our server we got the error "source_file does not exist". Solved with time.sleep:

import os
import shutil
import time

from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets


@dbt_assets(manifest=dbt_manifest_path)
def test_bi_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    command = dbt.cli(["build"], context=context)  # , target_path=dbt_target_path)
    try:
        yield from command.stream()
    except Exception:
        # Server workaround: wait before looking for run_results.json.
        time.sleep(10)
        source_file = os.path.join(command.target_path, "run_results.json")
        if not os.path.exists(source_file):
            raise FileNotFoundError(f"source_file does not exist: {source_file}")
        # Copy the artifact into the project-level target folder so dbt retry can find it.
        shutil.copy(source_file, "/opt/dagster/app/dbt_project/target/")
        yield from dbt.cli(
            ["retry"],
            manifest=command.manifest,
            dagster_dbt_translator=command.dagster_dbt_translator,
            target_path=command.target_path,
        ).stream()
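As an alternative to a fixed time.sleep(10), the wait could poll for the file and return as soon as it shows up. This is only a sketch: the thread does not say why the file appears late on the server, polling is not something anyone above tried, and the helper name `wait_for_file` is hypothetical.

```python
import tempfile
import time
from pathlib import Path


def wait_for_file(path: Path, timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll until `path` exists or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if path.is_file():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo: one file that already exists and one that never appears.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "run_results.json"
    present.write_text("{}")
    found = wait_for_file(present, timeout=1.0, interval=0.05)
    missing = wait_for_file(Path(tmp) / "missing.json", timeout=0.2, interval=0.05)

print(found, missing)
```

In the asset function, the call would replace the sleep, and the copy would only run when the helper returns True.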

toddy86 · 2024-02-19T12:18:11Z

Are you on dbt-core==1.6.*? Is it possible for you to upgrade to dbt-core==1.7.*?

If not, could you try Vinni's workaround?

I’m on dbt 1.7.x.

It isn’t critical for us to have these retries on individual dbt assets, so we are going to try brute-forcing this with a job-level retry. I haven’t tested it yet, but I’m presuming the job-level retry will only pick up the failed dbt assets (perhaps a wrong assumption; we shall see).

the4thamigo-uk · 2024-02-26T15:11:49Z

Also hitting this issue. Noticed this:
dbt-labs/dbt-core@d1e400e

the4thamigo-uk · 2024-03-01T14:23:34Z

https://github.com/dbt-labs/dbt-core/releases/tag/v1.7.9

toddy86 · 2024-03-03T03:36:43Z

https://github.com/dbt-labs/dbt-core/releases/tag/v1.7.9

Can confirm this is now working as expected after bumping to dbt v1.7.9.

rexledesma · 2024-03-04T16:09:33Z

Thanks for confirming the fix @the4thamigo-uk and @toddy86. I've updated #18990 (comment) with a disclaimer to be on dbt-core>=1.7.9.

toddy86 · 2024-03-08T12:18:47Z

@rexledesma There is a hidden gremlin in using the dbt retry command if you aren't on the lookout for it.

If you have a job which splits the dbt asset materializations into multiple steps (e.g. a job with partitioned and non-partitioned assets), and the parent task initially fails and retries, then some of the downstream dbt tasks can be skipped as Dagster interprets the upstream assets as being skipped.

Initial run where some models succeeded, but others failed

The dbt retry kicks in and all failed and skipped models are successfully built on the second try

Downstream dbt steps are incorrectly skipped, as assets successfully materialized in the retry run are incorrectly labelled as skipped

rexledesma · 2024-03-08T15:47:46Z

@toddy86 Are you yielding Dagster events from the dbt retry? You need to be doing this. Just want to sanity check that you're calling yield from!

@dbt_assets(manifest=dbt_manifest_path)
def jaffle_shop_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    dbt_invocation = dbt.cli(["build"], context=context)
    try:
        yield from dbt_invocation.stream()
    except:
+       yield from dbt.cli(
            ["retry"],
            manifest=dbt_invocation.manifest,
            dagster_dbt_translator=dbt_invocation.dagster_dbt_translator,
            target_path=dbt_invocation.target_path,
        ).stream()

toddy86 · 2024-03-09T09:27:00Z

@rexledesma yep, we are yielding. Full dbt asset code below (we have a thin wrapper around the dbt_assets)

def build_dbt_assets(  # noqa: PLR0913
    select: str = "fqn:*",
    exclude: str = "",
    mode: str = "build",
    name: Optional[str] = None,
    partitions_def: Optional[PartitionsDefinition] = None,
    backfill_policy: Optional[BackfillPolicy] = None,
    dbt_retry: bool = False,
) -> list[AssetsDefinition]:

    _exclude = exclude + " tag:exclude_dagster"
    @dbt_assets(
        name=name,
        manifest=dbt_manifest_path,
        select=select,
        exclude=_exclude,
        partitions_def=partitions_def,
        backfill_policy=backfill_policy,
        dagster_dbt_translator=CustomDagsterDbtTranslator(
            settings=DagsterDbtTranslatorSettings(enable_asset_checks=True),
        ),
    )
    def _assets(
        context: OpExecutionContext,
    ):
        dbt_build_args = [mode]
        if partitions_def:
            dbt_vars = {
                "start_date": context.partition_key_range.start,
                "end_date": context.partition_key_range.end,
            }
            dbt_build_args.extend(["--vars", json.dumps(dbt_vars)])

        command = dbt_resource.cli(dbt_build_args, context=context)
        try:
            yield from command.stream()
        except:  # noqa: E722
            if dbt_retry:
                yield from dbt_resource.cli(
                    ["retry"],
                    manifest=command.manifest,
                    dagster_dbt_translator=command.dagster_dbt_translator,
                    target_path=command.target_path,
                ).stream()
            else:
                raise
    return [_assets]

rexledesma · 2024-03-11T03:54:01Z

@toddy86 Ah, I think this is because we need to pass context into the dbt retry so that it generates Output events instead of AssetMaterialization events. However, once we do this, we'll also try to add the subsetting selection arguments to dbt retry, which is not ideal.

Here's a workaround (under test in #20395) to ensure that the dbt invocation doesn't have the subsetting arguments, but the emitted events still use the context argument:

+ from dataclasses import replace

from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets


@dbt_assets(manifest=dbt_manifest_path)
def jaffle_shop_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    dbt_invocation = dbt.cli(["build"], context=context)
    try:
        yield from dbt_invocation.stream()
    except:
+       dbt_retry_invocation = dbt.cli(
+           ["retry"],
+           manifest=dbt_invocation.manifest,
+           dagster_dbt_translator=dbt_invocation.dagster_dbt_translator,
+           target_path=dbt_invocation.target_path,
+       )
+       dbt_retry_invocation = replace(dbt_retry_invocation, context=context)
+       
+       yield from dbt_retry_invocation.stream()

On my end, I'll see if I can have a fix out so we don't need to call replace.
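For readers unfamiliar with dataclasses.replace, the mechanics of the workaround can be shown with a stand-in class. Everything below is hypothetical except `replace` itself: `Invocation` and `build_invocation` only imitate the behavior described above (passing a context adds selection arguments), and the `--select fqn:*` value is a placeholder.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Invocation:
    """Hypothetical stand-in for a dbt CLI invocation object."""

    args: tuple[str, ...]
    context: Optional[str] = None


def build_invocation(args: list[str], context: Optional[str] = None) -> Invocation:
    """Stand-in for dbt.cli: with a context, subsetting arguments are appended."""
    full_args = tuple(args)
    if context is not None:
        full_args += ("--select", "fqn:*")  # placeholder for the real selection arguments
    return Invocation(args=full_args, context=context)


# Build the retry without a context so no selection arguments are added ...
retry = build_invocation(["retry"])
# ... then attach the context afterwards; replace() copies the dataclass and
# changes only the named field, leaving args untouched.
retry_with_context = replace(retry, context="asset-execution-context")

print(retry_with_context.args)
print(retry_with_context.context)
```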

toddy86 · 2024-03-14T13:44:54Z

Thanks @rexledesma. I’m on leave for a few weeks, but I’ll give this a try once I’m back.

the4thamigo-uk · 2024-05-13T08:13:11Z

Hi @rexledesma, I am trying to get this to work. I am seeing the following though :

dagster._core.errors.DagsterInvariantViolationError: Compute for op "dbt_clickhouse_unpartitioned" returned an output "staging__stg_my_test__validation" multiple times

Stack Trace:
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_plan.py", line 282, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_step.py", line 523, in core_dagster_event_sequence_for_step
    for user_event in _step_output_error_checked_user_event_sequence(
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_step.py", line 222, in _step_output_error_checked_user_event_sequence
    raise DagsterInvariantViolationError(

In this case a dbt test failed causing a dbt retry which failed with the above error. I am replacing the context as in your example. Do you see a similar issue?

rexledesma · 2024-05-13T19:12:47Z

@the4thamigo-uk I assume you're modeling your dbt tests as Dagster asset checks (cc @johannkm)

If that's the case, then what happened is:

1. Your dbt invocation emitted AssetCheckResult's, with passed=False to represent a failed dbt test. Call this test A.
2. This caused the dbt invocation to fail, preparing it for a retry.
3. In the retry, the failed test A was retried, and a new AssetCheckResult event was emitted for it. However, an event for A had already been emitted.

If you want to do this retry scheme with Dagster asset checks, you'll need to ensure that the failed tests in (1) are not emitted in the event stream. Only the final result from the dbt retry should be emitted.
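The "emit each check only once" idea can be sketched without Dagster. `CheckEvent` is a hypothetical stand-in for an AssetCheckResult-like event, and the sketch simplifies the real situation: here the first pass simply ends, whereas in the snippets in this thread the failed invocation raises, so the same filtering would have to sit inside the try/except.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class CheckEvent:
    """Hypothetical stand-in for an AssetCheckResult-like event."""

    check_name: str
    passed: bool


def stream_with_retry(
    first: Iterable[CheckEvent],
    retry: Callable[[], Iterable[CheckEvent]],
) -> Iterator[CheckEvent]:
    """Yield each check's result once: failures from the first pass are withheld."""
    held_back = []
    for event in first:
        if event.passed:
            yield event
        else:
            held_back.append(event)  # do not emit yet; the retry will report it
    if held_back:
        # The retry re-runs only what failed, so its results are the final ones.
        yield from retry()


first_pass = [
    CheckEvent("not_null_order_id", True),
    CheckEvent("accepted_values_status", False),
]
events = list(
    stream_with_retry(first_pass, lambda: [CheckEvent("accepted_values_status", True)])
)
names = [event.check_name for event in events]
print(names)
```

Each check name appears exactly once in the output, which is the property the invariant error above is enforcing.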

emirkmo · 2024-06-07T09:14:58Z

@rexledesma Out of curiosity would calling dbt retry instead of the actual dbt command work if say context.retry_number != 0? I am especially thinking of when yielding AssetCheckResults for dbt tests (currently we hold back failed check results until the last retry) and trying to integrate dbt retry with Dagster retries.

Something like:

@dbt_assets(...)
def dbt_assets(context: AssetExecutionContext):
    # Run a full build on the first attempt and dbt retry on any Dagster retry.
    dbt_command = "build" if context.retry_number == 0 else "retry"
    dbt_cli_invocation = dbt.cli([dbt_command], context=context, raise_on_error=True)
    try:
        yield from dbt_cli_invocation.stream()
    except DagsterDbtCliRuntimeError as err:
        raise RetryRequested(max_retries=1, seconds_to_wait=300) from err

Currently we do

@dbt_assets(...)
def dbt_assets(context: AssetExecutionContext):

    dbt_cli_invocation = dbt.cli(["build"], context=context, raise_on_error=True)
    failed_test_events = {}
    try:
        for dagster_event in dbt_cli_invocation.stream():
            if isinstance(dagster_event, AssetCheckResult) and not dagster_event.passed:
                failed_test_events[dagster_event.check_name] = dagster_event
                continue
            yield dagster_event
        if failed_test_events:
            # Only some failed tests, if something else failed, it would have already raised before getting here.
            raise DagsterDbtCliRuntimeError(description="failed_tests")
    except DagsterDbtCliRuntimeError as err:
 
        # Save run_results before retry potentially overwrites it.
        build_run_results = dbt_cli_invocation.get_artifact("run_results.json")

        dbt_retry_invocation = dbt.cli(
            ["retry"],
            manifest=dbt_cli_invocation.manifest,
            dagster_dbt_translator=dbt_cli_invocation.dagster_dbt_translator,
            target_path=dbt_cli_invocation.target_path,
        )
        dbt_retry_invocation = replace(dbt_retry_invocation, context=context)

        # (Technically you can add another try/catch and invoke builtin Dagster retry in case the issue is Network related.)
        yield from dbt_retry_invocation.stream()

Since we deploy on K8s with Docker, run_results.json etc. are not overwritten by another run (and I know that nowadays dbt_assets saves to a unique target folder anyway), so dbt retry can be run just fine even if other dbt commands ran in between.

rexledesma · 2024-06-07T12:32:39Z

Out of curiosity would calling dbt retry instead of the actual dbt command work if say context.retry_number != 0?

Some thoughts that come to mind:

You would need to ensure that your dbt retry references the target path containing the previous dbt invocation's artifacts (e.g. run_results.json).
You should ensure that the artifacts persist across retries.
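Both points can be folded into the choice of command: only issue a retry when this is not the first attempt and the previous attempt's run_results.json is still present in a stable target path. This is a sketch under those assumptions. The helper name `choose_dbt_command` is hypothetical, and the demo uses a temporary directory in place of a real target path.

```python
import tempfile
from pathlib import Path


def choose_dbt_command(retry_number: int, target_path: Path) -> str:
    """Pick 'retry' only when a previous attempt left run_results.json behind."""
    has_previous_results = (target_path / "run_results.json").is_file()
    if retry_number > 0 and has_previous_results:
        return "retry"
    # First attempt, or the artifacts did not survive: fall back to a full build.
    return "build"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)
    first_attempt = choose_dbt_command(0, target)
    retry_without_artifacts = choose_dbt_command(1, target)
    (target / "run_results.json").write_text("{}")
    retry_with_artifacts = choose_dbt_command(1, target)

print(first_attempt, retry_without_artifacts, retry_with_artifacts)
```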

lokofoko · 2024-06-24T09:09:07Z

Hi @rexledesma. To better understand how this works, could you explain whether, with your snippet, the retry will run on the first run of the code? I would expect it to run only when I re-execute my job from failure, not on every regular run.

I am not sure how Dagster parses that function, but from a plain Python perspective it looks like the retry would always be executed whenever there is an error, and I don't understand the use case for this.

rexledesma · 2024-06-24T14:56:48Z

@lokofoko See #18990 (review) on what is happening here.

The point is that we are doing a dbt retry without needing to do a re-execution from failure: if the initial run failed because of connection flakiness, you can just do the retry within the same run. You shouldn't need to spin up a new run.

the4thamigo-uk · 2024-10-21T09:12:34Z

@the4thamigo-uk I assume you're modeling your dbt tests as Dagster asset checks (cc @johannkm)

If that's the case, then what happened is:

Your dbt invocation emitted AssetCheckResult's, with passed=False to represent a failed dbt test. Call this test A.

This caused the dbt invocation to fail, preparing it for a retry.

In the retry, the failed test A was retried, and a new AssetCheckResult event was emitted for it. However, an event for A had already been emitted.

If you want to do this retry scheme with Dagster asset checks, you'll need to ensure that the failed tests in (1) are not emitted in the event stream. Only the final result from the dbt retry should be emitted.

@rexledesma Yes I think this is what happened. Can you provide an example of how to do this? Perhaps we need a final canonical example posted in this issue, or ideally in the docs?

G14rb · 2024-11-14T16:18:52Z

@the4thamigo-uk to emit AssetObservation events instead of AssetCheckResult events, set enable_asset_checks to False in the DagsterDbtTranslatorSettings that you pass as the settings of your DagsterDbtTranslator:

from dataclasses import replace

from dagster import AssetExecutionContext
from dagster_dbt import (
    DagsterDbtTranslator,
    DagsterDbtTranslatorSettings,
    DbtCliResource,
    dbt_assets,
)

dagster_dbt_translator = DagsterDbtTranslator(
    settings=DagsterDbtTranslatorSettings(enable_asset_checks=False)
)

@dbt_assets(manifest=dbt_manifest_path, dagster_dbt_translator=dagster_dbt_translator)
def jaffle_shop_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    dbt_invocation = dbt.cli(["build"], context=context)
    try:
        yield from dbt_invocation.stream()
    except:
        dbt_retry_invocation = dbt.cli(
            ["retry"],
            manifest=dbt_invocation.manifest,
            dagster_dbt_translator=dbt_invocation.dagster_dbt_translator,
            target_path=dbt_invocation.target_path,
        )
        dbt_retry_invocation = replace(dbt_retry_invocation, context=context)
        
        yield from dbt_retry_invocation.stream()

the4thamigo-uk · 2024-11-20T09:12:24Z

@the4thamigo-uk to materialize AssetObservation instead of AssetCheckResult you have to set the settings property of DagsterDbtTranslator enable_asset_checks to False

Thanks for the code above. However this means we never generate AssetCheckResult, but I thought you were suggesting in the earlier post that we should still emit it, but only once, either in the first invocation (if it succeeds), or the retry if it repeatedly fails.

you'll need to ensure that the failed tests in (1) are not emitted in the event stream. Only the final result from the dbt retry should be emitted

askvinni added 2 commits January 3, 2024 13:49

added retry policy arg to dbt assets decorator

e99c10c

formatting

e0b7764

askvinni changed the title ~~added retry policy arg to dbt assets decorator~~ added retry policy param to dbt assets decorator Jan 3, 2024

rexledesma self-requested a review January 3, 2024 17:48

rexledesma suggested changes Jan 3, 2024

askvinni closed this Jan 3, 2024

askvinni deleted the dbt-retry-policy branch January 3, 2024 18:43

rexledesma mentioned this pull request Jan 4, 2024

Re-run from failure should rerun from the failed asset, not the failed step (when possible) #12423

Open

garethbrickman added the integration: dbt Related to dagster-dbt label Feb 14, 2024

rexledesma mentioned this pull request Mar 11, 2024

test(dbt): assert that dbt retry can be invoked to yield Output events #20395

Merged

rexledesma added a commit that referenced this pull request Mar 11, 2024

test(dbt): assert that dbt retry can be invoked to yield Output e…

87afb91

…vents (#20395) ## Summary & Motivation Put #18990 (comment) under test. ## How I Tested These Changes pytest

PedramNavid pushed a commit that referenced this pull request Mar 28, 2024

test(dbt): assert that dbt retry can be invoked to yield Output e…

effa151

…vents (#20395) ## Summary & Motivation Put #18990 (comment) under test. ## How I Tested These Changes pytest

rexledesma mentioned this pull request May 29, 2024

pass through retry_policy for dbt assets #22143

Merged
