[yaml] Add Beam YAML Examples and Getting started docs #30003
R: @robertwb
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control.
sdks/python/apache_beam/yaml/examples/test/map_to_fields_callable_test.py
sdks/python/apache_beam/yaml/examples/test/regex_matches_yaml_test.py
sdks/python/apache_beam/yaml/examples/transforms/aggregation/combine_sum.yaml
sdks/python/apache_beam/yaml/examples/transforms/elementwise/explode.yaml
5a3b140 to 712c9ae
712c9ae to 9af22dc
This is looking pretty good.
Let's remove the docs changes as we're moving those to the site and get these examples in.
sdks/python/apache_beam/yaml/examples/transforms/aggregation/combine_sum.yaml
Any update on this?
9e9bf7b to 3dcd076
@robertwb All comments addressed and rebased on master.
Only minor comments, then LGTM.
### Element-wise
These examples leverage the built-in mapping transforms including `MapToFields`,
`Filter` and `Explode`. More information can be found about mapping transforms
[here](../docs/yaml_mapping.md).
Now that they're live, let's point to the official docs on https://beam.apache.org/documentation/sdks/yaml/
I pointed to the UDF section, since that is where `MapToFields` lives.
# see https://cloud.google.com/docs/authentication/external/set-up-adc for more
# information
#
# This pipeline reads in a text file, maps all words to a value of "1", sums
Perhaps intersperse these comments with the code itself?
@robertwb I refactored the example a bit so that its structure follows the pipeline logic more closely. It also now outputs `Row(word=..., count=...)` instead of `Row(output="word: count")`.
Let me know what you think.
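To illustrate the difference between the two output shapes mentioned above, here is a minimal plain-Python sketch. It does not use Beam or the YAML example from this PR: the `WordCount` type, the helper functions, the word-splitting regex, and the input line are all hypothetical stand-ins chosen only for illustration.

```python
import re
from collections import Counter
from typing import NamedTuple


class WordCount(NamedTuple):
    """Hypothetical stand-in for a row with separate `word` and `count` fields."""
    word: str
    count: int


def count_words(lines):
    """Split each line into lowercase words and count occurrences.

    The regex is an assumption for this sketch, not the one used in the PR.
    """
    counts = Counter()
    for line in lines:
        counts.update(re.findall(r"[a-z']+", line.lower()))
    return counts


def as_structured_rows(counts):
    """Newer shape: one field per value, similar to Row(word=..., count=...)."""
    return [WordCount(word, n) for word, n in sorted(counts.items())]


def as_formatted_rows(counts):
    """Older shape: one pre-formatted string, similar to Row(output="word: count")."""
    return [f"{word}: {n}" for word, n in sorted(counts.items())]


lines = ["to be or not to be"]  # made-up input
counts = count_words(lines)
structured = as_structured_rows(counts)
formatted = as_formatted_rows(counts)

# Structured rows can be filtered on a field without re-parsing a string.
repeated = [row.word for row in structured if row.count > 1]
```

Keeping `word` and `count` as separate fields means a downstream step can filter or aggregate on `count` directly, whereas the single-string shape would have to be parsed again first.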
0f08592 to c36b68a
Looks good once we fix the failing tests and missing license headers.
99d4115 to 94488af
Looks like you need to add an Apache license preamble to sdks/python/apache_beam/yaml/examples/README.md to make RAT happy.
Are the other tests known issues? I don't see how my PR is affecting the huggingface and rowcoder tests.
Perhaps rebasing will get to green if there were issues on
I kicked off some re-runs, but I'm not sure if it always rebases on head. +1 to merging with master and re-pushing if this doesn't get things green.
I think it does not. The various merge and HEAD commits associated with a PR are, I think, only updated when the head commit of the PR changes. So reruns will be against the same commit. (I could have this wrong, of course.)
0dd6cf8 to c9fc5e7
Signed-off-by: Jeffrey Kinard <[email protected]>
c9fc5e7 to 7415134
7415134 to 7012d72
f134f97 to d5d9447
d5d9447 to 8783561
Finally got the tests green. I adjusted the test to run
Thanks. Nice to finally get this in.