Update BigQueryIO Documentation (#28591)
* Update BigQueryIO Documentation

- Updated the description regarding failed rows for Storage Write API.
- Made `PCollection` formatting consistent.

* Update website/www/site/content/en/documentation/io/built-in/google-bigquery.md
RyuSA authored Oct 9, 2023
1 parent 6046297 commit d170634
Showing 1 changed file with 6 additions and 8 deletions.
@@ -883,10 +883,9 @@ explicitly enable this using [`withAutoSharding`](https://beam.apache.org/releas
***Note:*** Auto sharding with `STORAGE_WRITE_API` is supported on Dataflow's legacy runner, but **not** on Runner V2
{{< /paragraph >}}

-When using `STORAGE_WRITE_API`, the PCollection returned by
-[`WriteResult.getFailedInserts`](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/WriteResult.html#getFailedInserts--)
-will not contain the failed rows. If there are data validation errors, the
-transform will throw a `RuntimeException`.
+When using `STORAGE_WRITE_API`, the `PCollection` returned by
+[`WriteResult.getFailedStorageApiInserts`](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/WriteResult.html#getFailedStorageApiInserts--)
+will contain the rows that failed to be written to the Storage Write API sink.

#### At-least-once semantics

@@ -901,10 +900,9 @@ specify the number of streams, and you can’t specify the triggering frequency.

Auto sharding is not applicable for `STORAGE_API_AT_LEAST_ONCE`.

-When using `STORAGE_API_AT_LEAST_ONCE`, the PCollection returned by
-[`WriteResult.getFailedInserts`](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/WriteResult.html#getFailedInserts--)
-will not contain the failed rows. If there are data validation errors, the
-transform will throw a `RuntimeException`.
+When using `STORAGE_API_AT_LEAST_ONCE`, the `PCollection` returned by
+[`WriteResult.getFailedStorageApiInserts`](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/WriteResult.html#getFailedStorageApiInserts--)
+will contain the rows that failed to be written to the Storage Write API sink.

#### Quotas

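For readers of this change, the updated wording can be illustrated with a minimal Java sketch of reading the failed rows after a Storage Write API write. It is not part of the commit. The class name `FailedRowsSketch`, the helper `formatFailure`, and the `rows`, `tableSpec`, and `schema` parameters are hypothetical names introduced for illustration, and the sketch assumes a bounded input `PCollection` (an unbounded input would likely need further settings, such as a triggering frequency).

```java
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryStorageApiInsertError;
import org.apache.beam.sdk.io.gcp.bigquery.WriteResult;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;

public class FailedRowsSketch {

  // Hypothetical helper: renders one failure as a single log-friendly line.
  static String formatFailure(String row, String errorMessage) {
    return errorMessage + " | " + row;
  }

  // Assumption: `rows`, `tableSpec`, and `schema` are supplied by the caller,
  // and `rows` is a bounded PCollection.
  static PCollection<String> writeAndCollectFailures(
      PCollection<TableRow> rows, String tableSpec, TableSchema schema) {
    WriteResult result =
        rows.apply(
            "WriteToBigQuery",
            BigQueryIO.writeTableRows()
                .to(tableSpec)
                .withSchema(schema)
                // STORAGE_API_AT_LEAST_ONCE could be substituted here.
                .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API));

    // Rows that could not be written to the Storage Write API sink.
    PCollection<BigQueryStorageApiInsertError> failed =
        result.getFailedStorageApiInserts();

    // Turn each failure into a string, for example to send to a dead-letter sink.
    return failed.apply(
        "FormatFailures",
        MapElements.into(TypeDescriptors.strings())
            .via(
                (BigQueryStorageApiInsertError e) ->
                    formatFailure(String.valueOf(e.getRow()), e.getErrorMessage())));
  }
}
```

The returned `PCollection<String>` could then be written to whatever dead-letter destination the pipeline uses. Keeping the failed row together with its error message is one reasonable design choice, not something the documentation prescribes.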
