Implement a handful of perworkermetrics the bigquery sink (#28903). #29098
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/DelegatingPerWorkerCounter.java
sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/DelegatingPerWorkerHistogram.java
...platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiFinalizeWritesDoFn.java
...atform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiFlushAndFinalizeDoFn.java
@@ -643,11 +650,20 @@ long flush(
    contexts -> {
      AppendRowsContext failedContext =
          Preconditions.checkStateNotNull(Iterables.getFirst(contexts, null));
      Instant operationEndTime = Instant.now();
We have this pattern all over the place, of grabbing the start time, updating a metric, grabbing error code, etc.
Is it possible to repackage this into a reusable utility method for clarity?
Repackaged the updates to the RpcRequests and RpcLatency metrics into utility methods, since they are reused across all write methods. These updates are now handled by reportSuccessfulRpcMetrics and reportFailedRPCMetrics.
It's not straightforward to refactor the updates to AppendRowsRowStatus into a utility. When we update this metric, we need to know the number of rows that failed, were retried, or succeeded, so the update is more embedded in the code.
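For readers following the thread, here is a minimal sketch of the kind of helper described in this reply. It is not the code from this PR: the class RpcMetricsSketch, the method signatures, the "method;status" key format, and the in-memory maps are all assumptions made for illustration, and whether the real failure path also records latency is assumed as well. Only the two method names come from the comment above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the PR's helpers; signatures and storage are assumptions. */
public class RpcMetricsSketch {
  // RpcRequests: request counts keyed by "method;status", e.g. "AppendRows;OK".
  private final Map<String, AtomicLong> rpcRequests = new ConcurrentHashMap<>();
  // RpcLatency: raw samples in milliseconds per method (a real histogram would bucket them).
  private final Map<String, List<Long>> rpcLatencyMs = new ConcurrentHashMap<>();

  /** Records one successful RPC and its latency. */
  public void reportSuccessfulRpcMetrics(String method, Instant start, Instant end) {
    recordRpc(method, "OK", start, end);
  }

  /** Records one failed RPC under its gRPC status, plus its latency (assumed behavior). */
  public void reportFailedRPCMetrics(
      String method, String grpcStatus, Instant start, Instant end) {
    recordRpc(method, grpcStatus, start, end);
  }

  // Shared bookkeeping: the "grab times, bump counter, record latency" pattern in one place.
  private void recordRpc(String method, String status, Instant start, Instant end) {
    rpcRequests.computeIfAbsent(method + ";" + status, k -> new AtomicLong()).incrementAndGet();
    List<Long> samples = rpcLatencyMs.computeIfAbsent(method, k -> new ArrayList<>());
    synchronized (samples) {
      samples.add(Duration.between(start, end).toMillis());
    }
  }

  public long requestCount(String method, String status) {
    AtomicLong counter = rpcRequests.get(method + ";" + status);
    return counter == null ? 0L : counter.get();
  }

  /** Returns a copy of the latency samples recorded for the given method. */
  public List<Long> latenciesMs(String method) {
    List<Long> samples = rpcLatencyMs.get(method);
    if (samples == null) {
      return new ArrayList<>();
    }
    synchronized (samples) {
      return new ArrayList<>(samples);
    }
  }

  public static void main(String[] args) {
    RpcMetricsSketch metrics = new RpcMetricsSketch();
    Instant start = Instant.now();
    // ... the RPC would run here ...
    metrics.reportSuccessfulRpcMetrics("AppendRows", start, Instant.now());
    System.out.println(metrics.requestCount("AppendRows", "OK"));
  }
}
```

With helpers of this shape, each call site only captures the start and end times and calls one of the two methods, which is the repetition the review comment asked to factor out.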
89b9ecc to ba08501
This LGTM. I'm rerunning the tests before merge, however.
Retest this please
That test failure is a flake; this is good to go!
There's a single Jenkins test failure, but the corresponding GitHub Actions run for that test passes.
Implement the following set of metrics in BigQuery's StorageWriteAPI transforms. These metrics will be stored in the PerWorkerMetrics container defined in (#28923).

1. AppendRowsRowStatus
Counter: Tracks the status of BigQuery rows after making an AppendRows RPC call.
The metric has the following labels:
- RowStatus: Status of the BigQuery rows after the AppendRows RPC, one of SUCCESSFUL, RETRIED, FAILED.
- Status: gRPC status of the AppendRows RPC.
- TableId: 'datasets/{ }/tables/{ }' that the rows are sent to.

2. ThrottledTime
Counter: Tracks the total time spent waiting between RPC retries.
The metric has the following labels:
- Method: BigQuery method that's causing the throttle. One of AppendRows, FlushRows, FinalizeStream.

3. RpcRequests
Counter: Tracks the RPC status of various BigQuery write methods.
The metric has the following labels:
- Method: BigQuery sink method. One of AppendRows, FlushRows, FinalizeStream.
- Status: gRPC status of the RPC.
- TableId: 'datasets/{ }/tables/{ }' that the rows are sent to.

4. RpcLatency
Histogram: Tracks the RPC latency of various BigQuery write methods.
The metric has the following labels:
- Method: BigQuery sink method. One of AppendRows, FlushRows, FinalizeStream.
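To make the labeling scheme concrete, the following is a rough sketch of a label-keyed counter for AppendRowsRowStatus. It does not use the PerWorkerMetrics container or any Beam API: the class RowStatusCounterSketch, the string key format, and the table path "datasets/my_dataset/tables/my_table" are hypothetical and chosen only for illustration. The label names and RowStatus values are taken from the description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical label-keyed counter; the key encoding is an assumption, not Beam's format. */
public class RowStatusCounterSketch {
  /** Row outcomes listed in the PR description. */
  public enum RowStatus { SUCCESSFUL, RETRIED, FAILED }

  // One counter per distinct (RowStatus, Status, TableId) label combination.
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  // Encodes the three labels into a single map key (illustrative format only).
  static String key(RowStatus rowStatus, String grpcStatus, String tableId) {
    return "AppendRowsRowStatus{RowStatus=" + rowStatus
        + ",Status=" + grpcStatus
        + ",TableId=" + tableId + "}";
  }

  /** Adds the number of rows that ended up in the given state after an AppendRows call. */
  public void add(RowStatus rowStatus, String grpcStatus, String tableId, long rows) {
    counters.computeIfAbsent(key(rowStatus, grpcStatus, tableId), k -> new AtomicLong())
        .addAndGet(rows);
  }

  public long get(RowStatus rowStatus, String grpcStatus, String tableId) {
    AtomicLong counter = counters.get(key(rowStatus, grpcStatus, tableId));
    return counter == null ? 0L : counter.get();
  }

  public static void main(String[] args) {
    RowStatusCounterSketch metric = new RowStatusCounterSketch();
    String table = "datasets/my_dataset/tables/my_table"; // hypothetical table id

    // A 10-row append fails with UNAVAILABLE and is retried, then succeeds.
    metric.add(RowStatus.RETRIED, "UNAVAILABLE", table, 10);
    metric.add(RowStatus.SUCCESSFUL, "OK", table, 10);

    System.out.println(metric.get(RowStatus.RETRIED, "UNAVAILABLE", table));
    System.out.println(metric.get(RowStatus.SUCCESSFUL, "OK", table));
  }
}
```

Because the caller must supply the row count for each outcome, the update has to sit where that count is known, which matches the author's remark that this metric is harder to pull out into a shared utility.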