[Bug] Can't pass `args` for `run-operation` in programmatic invocation as kwarg #10355

ben-schreiber · 2024-06-23T13:58:11Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Passing the --args parameter as a kwarg to the dbtRunner.invoke method throws a TypeError. This forces the parameter to only be pass as:

dbtRunner().invoke(['run-operation', '--args', '...'])

This looks to be caused by a naming conflict with the invoke method here

Expected Behavior

To also be able to pass it as a kwarg

dbtRunner().invoke(['run-operation'], args=...)

Steps To Reproduce

from dbt.cli.main import dbtRunner

runner = dbtRunner()
runner.invoke(['run-operation', 'some_macro'], args={'arg1': 'foo', 'arg2': 'goo'})

Relevant log output

TypeError
...
TypeError: dbtRunner.invoke() got multiple values for argument 'args'

Environment

- OS: MacOS Sonoma 14.4
- Python: 3.10.12
- dbt: 1.8.3

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

The text was updated successfully, but these errors were encountered:

ben-schreiber · 2024-06-26T12:20:32Z

Here are two suggestions which came to mind:

The args parameter could be renamed. Something like,

def invoke(self, invocation_args: List[str], **kwargs) -> dbtRunnerResult: ...

Although, this would be a breaking change and would need to be handled accordingly.

To add a dedicated name for this use-case (e.g. macro_args or something like that) which then would be translated back to args

def invoke(self, args: List[str], **kwargs) -> dbtRunnerResult:
    if 'macro_args' in kwargs:
        kwargs['args'] = kwargs.pop('macro_args')

This too would have to be appropriately documented since it breaks from the kwargs convention of this method.

dbeatty10 · 2024-07-19T15:31:41Z

Thanks for reporting this along with the root cause and some ideas @ben-schreiber !

Root cause

Agreed that the cause is the naming conflict between the positional parameter args in the invoke method and the --args CLI flag for the run-operation subcommand.

Workaround

Pass in strings for the --args CLI flag like this:

runner.invoke(
    ["run-operation", "some_macro", "--args", "{'arg1': 'foo', 'arg2': 'goo'}", "--vars", "{'some_var': 'doog'}"]
)

Reprex and examples

macros/some_macro.sql

{% macro some_macro(arg1, arg2) %}
    {{ log("arg1: " ~ arg1, True) }}
    {{ log("arg2: " ~ arg2, True) }}
    {{ log("some_var: " ~ var("some_var", "loo"), True) }}
{% endmacro %}

some_script.py

from dbt.cli.main import dbtRunner

runner = dbtRunner()

# Works
runner.invoke(
    ["run-operation", "some_macro", "--args", "{'arg1': 'foo', 'arg2': 'goo'}", "--vars", "{'some_var': 'doog'}"]
)

# Also works
runner.invoke(
    args=["run-operation", "some_macro", "--args", "{'arg1': 'foo', 'arg2': 'goo'}", "--vars", "{'some_var': 'doo'}"]
)

# # Doesn't work due to naming conflict with the `args` parameter
# runner.invoke(["run-operation", "some_macro"], args={"arg1": "foo", "arg2": "goo"}, vars={"some_var": "doo"})

dbt run-operation some_macro --args "{'arg1': 'foo', 'arg2': 'goo'}" --vars "{'some_var': 'doo'}" 
python some_script.py

Monkey patch example

While creating a monkey patch isn't what we'd recommend or advocate for in this scenario, sharing this code example for sake of completeness.

from typing import List
from dbt.cli.main import dbtRunnerResult
from dbt.cli import requires
from click.exceptions import BadOptionUsage
from click.exceptions import NoSuchOption, UsageError
from click.exceptions import Exit as ClickExit
from dbt.cli.main import cli

from dbt.cli.main import dbtRunner

runner = dbtRunner()


# Define a new version of the invoke method
def new_invoke(self, invocation_args: List[str], **kwargs) -> dbtRunnerResult:

    try:
        dbt_ctx = cli.make_context(cli.name, invocation_args.copy())
        dbt_ctx.obj = {
            "manifest": self.manifest,
            "callbacks": self.callbacks,
        }

        for key, value in kwargs.items():
            dbt_ctx.params[key] = value
            # Hack to set parameter source to custom string
            dbt_ctx.set_parameter_source(key, "kwargs")  # type: ignore

        result, success = cli.invoke(dbt_ctx)
        return dbtRunnerResult(
            result=result,
            success=success,
        )
    except requires.ResultExit as e:
        return dbtRunnerResult(
            result=e.result,
            success=False,
        )
    except requires.ExceptionExit as e:
        return dbtRunnerResult(
            exception=e.exception,
            success=False,
        )
    except (BadOptionUsage, NoSuchOption, UsageError) as e:
        return dbtRunnerResult(
            exception=DbtUsageException(e.message),
            success=False,
        )
    except ClickExit as e:
        if e.exit_code == 0:
            return dbtRunnerResult(success=True)
        return dbtRunnerResult(
            exception=DbtInternalException(f"unhandled exit code {e.exit_code}"),
            success=False,
        )
    except BaseException as e:
        return dbtRunnerResult(
            exception=e,
            success=False,
        )


# Monkey patch the invoke method
dbtRunner.invoke = new_invoke

# The rest of your code goes below

dbeatty10 · 2024-07-19T15:32:29Z

Options

Rename the args parameter of invoke to something unique (e.g., invocation_args) accepting that it will break for anyone using the positional argument as a named parameter.
Add a dedicated name for this use-case (e.g. macro_args) accepting the complications in the code and user experience
Leave things as-is accepting that using the --vars flag for run-operation requires the workaround above or monkey patching the definition of invoke.

The first option of giving it a unique name like invocation_args seems most attractive to me. I'm guessing that it would be uncommon for folks to use a named parameter within calls to invoke(), and none of our code examples use the named parameter.

If we went this route, we would call it out in a migration guide. We'd also want to take care to choose a name that is unique enough that we'd be unlikely to ever add a CLI flag with the same name. See here for an example script that can get the names of all the CLI flags.

@ben-schreiber To help us understand the trade-offs here, could you share more about the effect on the user experience if we leave it as-is and users only have the workaround you mentioned?

ben-schreiber · 2024-07-22T07:12:14Z

@dbeatty10 I agree with you that option one is ideal. I would extend it even further to update the method's signature to include /:

def invoke(self, invocation_args: List[str], /, **kwargs) -> dbtRunnerResult: ...

thereby formalizing the desire to separate the usage of invocation_args (positional only) from kwargs (keyword only) in code. This would also allow the invocation_args argument to be be given a unique name without worrying as much about readability/usability for end users.

In regards to your question, the effect is not terribly large as long as this edge case is clearly documented. I encountered the issue when building a wrapper around DBT's programmatic invocations for Apache Airflow. In that use case, passing keyword arguments to invoke is more natural than converting Pythonic objects to the CLI representation.

ben-schreiber · 2024-07-22T07:20:26Z

Also, here is another/shorter monkeypatch:

from typing import List
from dbt.cli.main import dbtRunnerResult, dbtRunner

invoke_v1 = dbtRunner.invoke

def invoke_v2(self, invocation_args: List[str], /, **kwargs) -> dbtRunnerResult:
    args_copy = list(invocation_args)
    kwargs_copy = dict(kwargs)
    if 'args' in kwargs_copy:
         args_copy += ['--args', kwargs_copy.pop('args')]
    return invoke_v1(self, args_copy, **kwargs_copy)

dbtRunner.invoke = invoke_v2

ben-schreiber added bug Something isn't working triage labels Jun 23, 2024

dbeatty10 added the python_api Issues related to dbtRunner Python entry point label Jun 25, 2024

dbeatty10 added awaiting_response and removed triage labels Jul 19, 2024

github-actions bot added triage and removed awaiting_response labels Jul 22, 2024

dbeatty10 added the user docs [docs.getdbt.com] Needs better documentation label Jul 22, 2024

dbeatty10 linked a pull request Jul 22, 2024 that will close this issue

Support passing args as keyword argument for run-operation in programmatic invocations #10473

Open

5 tasks

dbeatty10 added Refinement Maintainer input needed and removed triage labels Jul 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] Can't pass `args` for `run-operation` in programmatic invocation as kwarg #10355

[Bug] Can't pass `args` for `run-operation` in programmatic invocation as kwarg #10355

ben-schreiber commented Jun 23, 2024 •

edited by dbeatty10

Loading

ben-schreiber commented Jun 26, 2024

dbeatty10 commented Jul 19, 2024

Reprex and examples

Monkey patch example

dbeatty10 commented Jul 19, 2024

ben-schreiber commented Jul 22, 2024

ben-schreiber commented Jul 22, 2024

[Bug] Can't pass args for run-operation in programmatic invocation as kwarg #10355

[Bug] Can't pass args for run-operation in programmatic invocation as kwarg #10355

Comments

ben-schreiber commented Jun 23, 2024 • edited by dbeatty10 Loading

Is this a new bug in dbt-core?

Current Behavior

Expected Behavior

Steps To Reproduce

Relevant log output

Environment

Which database adapter are you using with dbt?

Additional Context

ben-schreiber commented Jun 26, 2024

dbeatty10 commented Jul 19, 2024

Root cause

Workaround

Reprex and examples

Monkey patch example

dbeatty10 commented Jul 19, 2024

Options

ben-schreiber commented Jul 22, 2024

ben-schreiber commented Jul 22, 2024

[Bug] Can't pass `args` for `run-operation` in programmatic invocation as kwarg #10355

[Bug] Can't pass `args` for `run-operation` in programmatic invocation as kwarg #10355

ben-schreiber commented Jun 23, 2024 •

edited by dbeatty10

Loading