Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new get_catalog_relations macro, Supporting Changes #8648
Add new get_catalog_relations macro, Supporting Changes #8648
Changes from 9 commits
6b17896
22edf72
604df7e
4489822
6e2e4fc
9b8e15d
aedc0bc
38b356d
471d29d
a9addc0
703faff
d10e802
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting a placeholder here to discuss "feature flag-like things".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't put my finger on why, but I feel like this piece should be a method on
SchemaSearchMap
, something likeSchemaSearchMap.extend(relations)
. Then this becomes:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, @mikealfare, following this up, since I didn't incorporate every one of your requested changes.
My thinking at the time was that the code I had was tested and working, so I didn't want continue tweaking relatively small details after we had already done a lot of collaboration and gone multiple rounds of review. I should have at least left a note to that effect.
However, if you think this or any of the other remaining issues ought to be addressed before 1.7.0, let me know and I am happy to open a follow up issue.
Check warning on line 423 in core/dbt/adapters/base/impl.py
Codecov / codecov/patch
core/dbt/adapters/base/impl.py#L421-L423
Check warning on line 428 in core/dbt/adapters/base/impl.py
Codecov / codecov/patch
core/dbt/adapters/base/impl.py#L428
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be more readable with a
defaultdict
. Untested pseudo code:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the name of this macro mean? Put another way, how do I know what this should return? It looks like this runs the new macro, but also limits it to objects in the manifest; is that right? This only gets called once, and in the branch of code where we're passing in an explicit list of relations that we got from the manifest. Is the filter still necessary?
I think there's value in wrapping the macro itself in an adapter method. But if we do need to filter it for some reason, then we should do that after this is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few observations:
self._get_catalog_relations()
twice, once for the length and again inside ofself._get_catalog_relations_by_info_schema
self._get_catalog_relations_by_info_schema
is doing two things, getting the relations and reformatting the resulting list into a dictionaryself._get_catalog_relations_by_info_schema
is only used once, hererelations_by_schema
is grouping by the information_schema, notrelation.schema
, which is misleadingrelations_by_schema
is created to filter a list of relationsPerhaps we could simplify this by reusing the results of
self._get_catalog_relations(manifest)
(let's refer to this asrelations
) and filteringrelations
byinfo_schema
while looping through theinfo_schema
values?Some pseudo code to add context (and to organize my own thoughts):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this function, it may be worth reviewing the state of the code today if you still have concerns. Gerda refactored it a bit, and made changes along the lines you are suggesting.
Check warning on line 1164 in core/dbt/adapters/base/impl.py
Codecov / codecov/patch
core/dbt/adapters/base/impl.py#L1159-L1164
Check warning on line 1167 in core/dbt/adapters/base/impl.py
Codecov / codecov/patch
core/dbt/adapters/base/impl.py#L1167
Check warning on line 462 in core/dbt/adapters/base/relation.py
Codecov / codecov/patch
core/dbt/adapters/base/relation.py#L462
Check warning on line 464 in core/dbt/adapters/base/relation.py
Codecov / codecov/patch
core/dbt/adapters/base/relation.py#L464
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the where clause is not reused, I would keep it in line here.
I made it a separate macro because I needed it twice (old implementation an new implementation). However, I think I like your approach at the adapter level of simply rerouting the existing macro to the new macro, avoiding the need for maintaining two versions of the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason I thought
schemas
here was a list of strings (the schema names), and not a list ofBaseRelation
objects that act as schemas (by not definingidentifier
). If that's the case, the where clause produced below should return nothing (relation.identifier is "falsy" because of jinja things,upper(sch.nspname) = upper('{{ relation.schema }}')
fails, I think because it becomesupper(sch.nspname) = upper('')
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
schemas
parameter to this macro is indeed a list of strings. The loop below converts it to relations: List[obj] where the obj is acting like a BaseRelation (specifically a "schema relation") for the limited purpose of callingget_catalog_where_clause
.So when this macro is called with schemas = [ "schema_a", "schema_b", "schema_c" ], it will in turn call
get_catalog_where_clause
with relations = [ { "schema": "schema_a" }, { "schema": "schema_b" }, { "schema": "schema_c" } ].This is just to make it easier to reuse get_catalog_where_clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, does jinja treat
relation.schema
the same asrelation.get("schema")
orrelation["schema"]
?