Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement prefix + DBT selector graph operator for model selection #429

Closed

Conversation

tseruga
Copy link
Contributor

@tseruga tseruga commented Aug 1, 2023

Description

This PR adds a commonly used graph operator for model selection in DBT.
https://docs.getdbt.com/reference/node-selection/graph-operators#the-plus-operator

This is just a first stab at implementing (what I assume is) the most commonly used graph operator. It just so happens that it's also the easiest to implement.

This change allows users to specify the DBT model they wish to select, as well as its upstream dependencies.

select=["+my_model"] will return the my_model node as well as any of the nodes that it depends_on, and any nodes those nodes depend on, etc.

Haven't written up the docs because I want to make sure folks are happy with handling the graph operators this way. It might limit future implementation of the remaining operators, so happy to hear feedback on the topic.

This is just what I hacked together while trying to play around with Cosmos post-1.0.0 :)

NOTE: I also cherry-picked #426 in order to get the tests to work, but will rebase when required.

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@netlify
Copy link

netlify bot commented Aug 1, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit a1cdd19
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/652996e19e5c6900085a0df5

@codecov
Copy link

codecov bot commented Aug 1, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (5ae38f6) 93.13% compared to head (a1cdd19) 93.16%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #429      +/-   ##
==========================================
+ Coverage   93.13%   93.16%   +0.03%     
==========================================
  Files          51       51              
  Lines        2053     2079      +26     
==========================================
+ Hits         1912     1937      +25     
- Misses        141      142       +1     
Files Coverage Δ
cosmos/dbt/selector.py 97.12% <96.55%> (-0.23%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tseruga tseruga force-pushed the custom-selector-upstream-model branch from 8f6f778 to 85499f1 Compare August 2, 2023 15:10
@tseruga
Copy link
Contributor Author

tseruga commented Aug 2, 2023

Added another test to hopefully make Coverage happy -- and rebased off of main.

@tatiana
Copy link
Collaborator

tatiana commented Oct 10, 2023

@tseruga, this work was looking really good. It would be great if we could rebase it and get to the finishing line! Please let us know if you need any support.

@tseruga
Copy link
Contributor Author

tseruga commented Oct 10, 2023

Sure thing. This got dropped for other priorities, but I can spend some time in the coming days/weeks to try to wrap this up

@tseruga tseruga force-pushed the custom-selector-upstream-model branch from f40629b to 74d1b8f Compare October 10, 2023 18:47
@tseruga tseruga temporarily deployed to external October 10, 2023 18:47 — with GitHub Actions Inactive
@tseruga tseruga temporarily deployed to external October 12, 2023 18:16 — with GitHub Actions Inactive
@tatiana tatiana added this to the 1.3.0 milestone Oct 13, 2023
@tatiana tatiana temporarily deployed to external October 13, 2023 12:34 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to external October 13, 2023 19:13 — with GitHub Actions Inactive
@tatiana
Copy link
Collaborator

tatiana commented Oct 13, 2023

Thanks a lot, @tseruga ! Please let us know when the PR is ready for review - I've assigned it for the 1.3 release, and we can review it if necessary.

@tatiana tatiana added the area:selector Related to selector, like DAG selector, DBT selector, etc label Nov 9, 2023
@tatiana
Copy link
Collaborator

tatiana commented Nov 30, 2023

@tseruga Since this PR was stale, and we had recent customer requests for this feature, I followed up on the missing bits and created a new PR: #728. You're added as a co-author - please feel free to review and give feedback!

@tatiana tatiana closed this Nov 30, 2023
tatiana added a commit that referenced this pull request Dec 5, 2023
…OM` or `LoadMode.DBT_MANIFEST` (#728)

Add support for the following when using `LoadMode.CUSTOM` or
`LoadMode.DBT_MANIFEST`:

* Support selection of model by name
* Support the selection of models by name & their children (with or
without degrees)
* Support the selection of models by name & their parents (with or
without degrees)
* Support intersections and unions involving graph selectors (with or
without other supported selectors, eg. tags)

Examples of select/exclusion statements that now work regardless of the
`LoadMode` being used:

```
model_a
+model_b
model_c+
+model_d+
2+model_e
model_f+3
model_f+,tag:nightly
```

Related dbt documentation:
https://docs.getdbt.com/reference/node-selection/graph-operators
https://docs.getdbt.com/reference/node-selection/set-operators

Limitations:
* The at operator is not supported yet (`@`)
* If users opt to use graph selector, it will increase the DAG parsing
time and the task execution time when using `LoadMode.CUSTOM` or
`LoadMode.DBT_MANIFEST`


This PR improves and extends the original implementation proposed by
@tseruga in #429. Some of the changes that were introduced on top of the
original PR:
* Add support to descendants (before only precursors were supported)
* Add support to different depths/degrees of precursors/descendants
* Add support to the union between graph operators and graph/non-graph
operators
* Add support to the intersection between graph operators and
graph/non-graph operators

Closes: #684

Co-authored-by: Tyler Seruga <[email protected]>
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
…OM` or `LoadMode.DBT_MANIFEST` (astronomer#728)

Add support for the following when using `LoadMode.CUSTOM` or
`LoadMode.DBT_MANIFEST`:

* Support selection of model by name
* Support the selection of models by name & their children (with or
without degrees)
* Support the selection of models by name & their parents (with or
without degrees)
* Support intersections and unions involving graph selectors (with or
without other supported selectors, eg. tags)

Examples of select/exclusion statements that now work regardless of the
`LoadMode` being used:

```
model_a
+model_b
model_c+
+model_d+
2+model_e
model_f+3
model_f+,tag:nightly
```

Related dbt documentation:
https://docs.getdbt.com/reference/node-selection/graph-operators
https://docs.getdbt.com/reference/node-selection/set-operators

Limitations:
* The at operator is not supported yet (`@`)
* If users opt to use graph selector, it will increase the DAG parsing
time and the task execution time when using `LoadMode.CUSTOM` or
`LoadMode.DBT_MANIFEST`


This PR improves and extends the original implementation proposed by
@tseruga in astronomer#429. Some of the changes that were introduced on top of the
original PR:
* Add support to descendants (before only precursors were supported)
* Add support to different depths/degrees of precursors/descendants
* Add support to the union between graph operators and graph/non-graph
operators
* Add support to the intersection between graph operators and
graph/non-graph operators

Closes: astronomer#684

Co-authored-by: Tyler Seruga <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:selector Related to selector, like DAG selector, DBT selector, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants