-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[10/n][dagster-fivetran] Implement fivetran_assets and build_fivetran_assets_definitions #25944
base: master
Are you sure you want to change the base?
Conversation
FivetranWorkspaceData
to FivetranConnectorTableProps
method
#25797
python_modules/libraries/dagster-fivetran/dagster_fivetran/asset_defs.py
Show resolved
Hide resolved
2500b07
to
3f0e9aa
Compare
python_modules/libraries/dagster-fivetran/dagster_fivetran/resources.py
Outdated
Show resolved
Hide resolved
bd27402
to
357ff3f
Compare
6cbe4f4
to
0521c7f
Compare
def sync_and_poll( | ||
self, context: Optional[Union[OpExecutionContext, AssetExecutionContext]] = None | ||
): | ||
raise NotImplementedError() | ||
|
||
def __eq__(self, other): | ||
return ( | ||
isinstance(other, FivetranWorkspace) | ||
and self.account_id == other.account_id | ||
and self.api_key == other.api_key | ||
and self.api_secret == other.api_secret | ||
) | ||
|
||
def __hash__(self): | ||
return hash((self.account_id + self.api_key + self.api_secret)) | ||
|
||
|
||
@lru_cache(maxsize=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caching load_fivetran_asset_specs
with functools.lru_cache
requires FivetranWorkspace
to be hashable.
An alternative to caching load_fivetran_asset_specs
would be to call it in a FivetranWorkspace
cached property.
class FivetranWorkspace(ConfigurableResource):
...
@cache_property
def asset_specs(
self,
dagster_fivetran_translator: Type[DagsterFivetranTranslator] = DagsterFivetranTranslator
) -> Sequence[AssetSpec]:
return load_fivetran_asset_specs(
workspace=self,
dagster_fivetran_translator=dagster_fivetran_translator
)
For the context, after this and this discussions, we decided to detach spec loading from the resources, which is why I cached load_fivetran_asset_specs
instead of caching a property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tagging @benpankow @dpeng817 @schrockn @alangenfeld @OwenKephart for feedback on this specific comment and the PR description about caching the asset specs calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i bias towards @cached_method
on an object instance instead of @lru_cache(maxsize=None)
on a top level function - I think its much easier to reason about when things are expected to be cached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we went with cached_property for the FivetranWorkspace
equivalent in Airlift - I think that's likely the right move here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to a cached method in FivetranWorkspace
in 501caf7
python_modules/libraries/dagster-fivetran/dagster_fivetran/asset_decorator.py
Outdated
Show resolved
Hide resolved
python_modules/libraries/dagster-fivetran/dagster_fivetran/asset_decorator.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I... don't think we should add these. Reasoning:
- All other cases where we added new decorators were before we converged upon a new API design - and I think likely weren't considered holistically. The
@dbt_assets
decorator is an exception, but we didn't haveAssetSpec
yet. If we could go back and do it again, I have some doubts we would have added any of them (although there is some messy transformational stuff that it needs to do, so maybe I'm wrong) - We can always add it later. But if we add it now we're stuck supporting it and eventually adding arguments for all the asset spec things (like group_name here). Eventually you'll need to support some sort of splitting API, etc. I just think it's better to point users towards a more genericizable solution.
I'm also not sure I understand why we need both the decorator and build_fivetran_assets_definitions? Is the idea that it just provides progressively easier starting points?
If the actual function implementations were doing more complicated stuff, maybe I would see the additional value but these are pretty simple wrappers, thus I think we should hold off.
c5911bf
to
db028f1
Compare
9f2b4ff
to
1cc89f8
Compare
db028f1
to
a773b5e
Compare
1cc89f8
to
ebca057
Compare
a773b5e
to
74db3e2
Compare
ebca057
to
4f7cd9a
Compare
74db3e2
to
c18340e
Compare
4f7cd9a
to
5f1b3f8
Compare
c18340e
to
3c9a56a
Compare
5f1b3f8
to
f78ad58
Compare
3c9a56a
to
48a3bb0
Compare
f78ad58
to
44865a4
Compare
48a3bb0
to
cd041b0
Compare
44865a4
to
a83dfc6
Compare
748298f
to
538348f
Compare
…d_fivetran_assets_definitions factory
538348f
to
e97963d
Compare
Summary & Motivation
This PR implements the
fivetran_assets
decorator and thebuild_fivetran_assets_definitions
factory.fivetran_assets
can be used to create all assets for a given Fivetran connector, i.e. one asset per table in the connector.build_fivetran_assets_definitions
can be used to create all Fivetran assets defs, one per connector. It usesfivetran_assets
.Both the asset decorator and factory use
load_fivetran_asset_specs
. This is motivated by the current implementation ofdagster-dbt
,dagster-dlt
anddagster-sling
- each leverages an asset decorator that loads the asset specs by itself.To avoid calling the Fivetran API each time
load_fivetran_asset_specs
is called, it is cached usingfunctools.lru_cache
.load_fivetran_asset_specs
uses the state-backed defs, so reloading the code won't make additional calls to the Fivetran API, but callingload_fivetran_asset_specs
N times in a script will make N calls to the Fivetran API.The goals here are:
How I Tested These Changes
Additional unit tests with BK.
Changelog
[dagster-fivetran] The
fivetran_assets
decorator is added. It can be used with theFivetranWorkspace
resource andDagsterFivetranTranslator
translator to load Fivetran tables for a given connector as assets in Dagster. Thebuild_fivetran_assets_definitions
factory can be used to create assets for all the connectors in your Fivetran workspace.