CT 2251 model yaml frontmatter #7100
base: main
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
@gshank A drive-by comment, based on this message:
I don't think we should support this! I think you should only be allowed to have YAML in the SQL file itself or a schema file:
(To reiterate, this is just what I reckon, not an official spec)
I wasn't clear enough in my comment. I didn't mean that we'd support doing config in both places; I meant figuring out how to detect that it was happening and issue an error.
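A minimal sketch of the "detect it and issue an error" idea, not the PR's implementation. The exception class, the function name, and the `schema_file_models` argument are all hypothetical and exist only for illustration.

```python
class DuplicateYamlConfigError(Exception):
    """Hypothetical error type for this sketch; not part of dbt."""


def check_single_yaml_source(model_name, has_frontmatter, schema_file_models):
    """Raise if a model is configured both in frontmatter and in a schema file.

    schema_file_models is assumed to be the set of model names that already
    have an entry in some schema file.
    """
    if has_frontmatter and model_name in schema_file_models:
        raise DuplicateYamlConfigError(
            f"Model '{model_name}' has YAML frontmatter and is also defined "
            "in a schema file; use only one of the two."
        )


# A model with frontmatter and no schema-file entry passes silently.
check_single_yaml_source("my_model", True, {"other_model"})
```

The check only needs to run for models that actually have frontmatter, so projects that never use the feature would not be affected.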
Should we use https://github.com/eyeseast/python-frontmatter instead of parsing front matter on our own?
than model file name.
@aranke It sounds like that package isn't doing a whole lot, and has some undesirable behaviors. Quoting @jakebiesinger in #6853 (reply in thread):
That was the same conclusion that I came to after looking at it. It's doing more than we want and the actually relevant code is pretty small.
This only implements yaml frontmatter for model nodes. Would we want to also support it for other node types?
This only implements yaml frontmatter for model nodes. Would we want to also support it for other node types?
If we're thinking about this as a proof of concept, model is definitely the node type most worth proving out :)
We could extend this to other node types that are 1:1 with a file (singular tests, analyses), mostly for the sake of consistency. I think resources that are defined as Jinja blocks (snapshots, macros), where you can have multiple in a file, would be tricky. Even though they're one-to-a-file, I think seeds would also be tricky, and probably not the right UX.
Not sure if you also tested with Python models... but it (mostly) "just works"! Which is pretty darn cool.
---
description: "This is my dbt-py model"
config:
  alias: my_cool_alias
---
def model(dbt, session):
    df = dbt.ref("my_model")
    return df
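To make the mechanics concrete, here is a rough sketch of how a frontmatter block could be separated from the rest of a model file. It is not the PR's `split_yaml_frontmatter`; the function name `split_frontmatter_sketch` and the `DELIMITER` constant are assumptions, and the YAML text is returned unparsed to keep the example free of third-party imports.

```python
from typing import Optional, Tuple

DELIMITER = "---"  # assumed delimiter line, matching the example above


def split_frontmatter_sketch(content: str) -> Tuple[Optional[str], str]:
    """Return (frontmatter_text, body); frontmatter_text is None if absent."""
    lines = content.lstrip().splitlines(keepends=True)
    if not lines or lines[0].strip() != DELIMITER:
        return None, content
    for index in range(1, len(lines)):
        if lines[index].strip() == DELIMITER:
            frontmatter = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return frontmatter, body
    # Opening delimiter with no closing one: treat as having no frontmatter.
    return None, content


example = '''---
description: "This is my dbt-py model"
config:
  alias: my_cool_alias
---
def model(dbt, session):
    df = dbt.ref("my_model")
    return df
'''

frontmatter_text, body = split_frontmatter_sketch(example)
print(frontmatter_text)
print(body)
```

Because the split happens on plain text before any SQL or Python parsing, the same routine would apply to both model languages, which may be why Python models "mostly just work".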
core/dbt/contracts/graph/nodes.py
# with the same name as the model file, which won't work. So change the
# directory to model file name + .yaml.
file_path = self.original_file_path
if file_path.endswith(".sql"):
- Need to check for ".py" here as well, for Python models
- Maybe it should just write to the same directory that the model is defined in? Need to think a bit more about this. I don't think it's a high-stakes decision one way or the other
I think that if we put it in the same directory that the model is in we could have collisions with other tests from other models in the same directory.
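One way the two points in this thread could fit together, sketched under assumptions: the helper name `frontmatter_dir_sketch` is hypothetical, and the PR's code comment does not show whether the suffix is replaced or `.yaml` is appended after it, so replacing the suffix here is a guess.

```python
import os

# ".py" is included per the review comment above; the diff only checks ".sql".
MODEL_SUFFIXES = (".sql", ".py")


def frontmatter_dir_sketch(original_file_path: str) -> str:
    """Derive a per-model directory name from the model's file path."""
    root, ext = os.path.splitext(original_file_path)
    if ext in MODEL_SUFFIXES:
        return root + ".yaml"
    # Unknown suffix: leave the path unchanged rather than guess.
    return original_file_path


print(frontmatter_dir_sketch("models/staging/orders.sql"))
print(frontmatter_dir_sketch("models/staging/customers.py"))
```

Since each model gets its own directory name, tests from two models in the same folder would not land in the same place, which is the collision concern raised above.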
Assigning myself to take this for another spin :) @gshank If we decide this is good-to-go from a user/product perspective, how much reworking would you want to do before putting this up for formal review of the code/implementation?
If we want to start out by just supporting models, I think it's good to review. There might eventually be some additional work related to the upcoming versions project, such as checking that this is the only YAML for a group of versions.
Oh, I was wondering about the location of the new "yaml_config_dict" key. I put it in the node, but it could go somewhere else, like the file object.
def has_yaml_frontmatter(content: str) -> bool:
    """Check first line for yaml frontmatter"""

    return content.startswith(FRONTMATTER_CHECK[0]) or content.startswith(FRONTMATTER_CHECK[1])
I think content could start with whitespace (including newlines) and this method would return False, while split_yaml_frontmatter would skip over leading whitespace.
Perhaps this would be better?
-     return content.startswith(FRONTMATTER_CHECK[0]) or content.startswith(FRONTMATTER_CHECK[1])
+     stripped = content.lstrip()
+     return stripped.startswith(FRONTMATTER_CHECK[0]) or stripped.startswith(FRONTMATTER_CHECK[1])
Content should be stripped already, I think, when it's loaded in read_files. Did you actually see an error? Wouldn't hurt to double up on that though.
Nope! I honestly just drove by and looked over your approach. Cool that it's already stripped. It might be worth a test case?
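A possible shape for that test case, as a sketch only. The values of `FRONTMATTER_CHECK` are not shown in this thread, so the tuple below is an assumption, and the function body follows the suggested `lstrip()` variant rather than the code currently in the branch.

```python
# Assumed values; the PR's actual FRONTMATTER_CHECK tuple is not shown here.
FRONTMATTER_CHECK = ("---\n", "---\r\n")


def has_yaml_frontmatter(content: str) -> bool:
    """Check the first non-blank line for yaml frontmatter (suggested variant)."""
    stripped = content.lstrip()
    return stripped.startswith(FRONTMATTER_CHECK[0]) or stripped.startswith(FRONTMATTER_CHECK[1])


def test_has_yaml_frontmatter_ignores_leading_whitespace():
    # Frontmatter at the very start of the file.
    assert has_yaml_frontmatter("---\nconfig: {}\n---\nselect 1")
    # Leading newlines and spaces before the opening delimiter.
    assert has_yaml_frontmatter("\n\n  ---\nconfig: {}\n---\nselect 1")
    # Windows line endings.
    assert has_yaml_frontmatter("---\r\nconfig: {}\r\n---\r\nselect 1")
    # A plain model with no frontmatter.
    assert not has_yaml_frontmatter("select 1 -- no frontmatter")


test_has_yaml_frontmatter_ignores_leading_whitespace()
```

If content really is stripped in read_files, the second assertion is redundant at runtime, but it would pin the behavior down in case that upstream stripping ever changes.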
This pull request is currently waiting for a new mashumaro release, in order to filter out the yaml_config_dict attribute in artifacts.
@gshank is there any chance this might slip in with 1.7? Been writing a lot more dbt-sql these days and keep coming back to how much I would love yaml frontmatter!
Unfortunately we are right on top of code freeze for 1.7. I'll update the branch though, so we can see what's involved.
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7100      +/-   ##
==========================================
- Coverage   86.43%   85.14%   -1.29%
==========================================
  Files         176      176
  Lines       26009    26089      +80
==========================================
- Hits        22480    22213     -267
- Misses       3529     3876     +347
Just for the record, I was just thinking about this, came up with exactly this idea of implementing YFM, and eventually found the discussion, the issue, and finally, very gladly, this PR! I would really love to see this released, and I'd be happy to contribute to that!
resolves #7099
Description
This is a work in progress, not at all ready for code review. It's here for commenting and further development.
The changes in core/dbt/clients/yaml_helper.py came from a pull request by jakebiesinger (jakebiesinger#1)
Checklist
changie new to create a changelog entry