Allow setting BigQuery endpoint #32153
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment "assign set of reviewers".
Assigning reviewers. If you would like to opt out of this review, comment R: @Abacn for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
86a5872 to f6b612a
@Abacn @damondouglas could you please have a look? Basically, I just followed the same pattern as with |
Thanks! The change looks good, just left a few minor formatting comments
CHANGES.md
Outdated
@@ -68,6 +68,7 @@
## New Features / Improvements

* X feature added (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).
* Added an ability to set BigQuery endpoint (Java) ([#28149](https://github.com/apache/beam/issues/28149)).
Shall we note that this is for testing purposes? "Endpoint" can be ambiguous (e.g. a regional/zonal endpoint, like the Dataflow endpoint).
Consider: "BigQuery endpoint can be overridden via PipelineOptions, this enables BigQuery emulators (Java)"
@@ -1615,8 +1642,13 @@ private static BigQueryWriteClient newBigQueryWriteClient(BigQueryOptions option
        .setChannelsPerCpu(2)
        .build();

    BigQueryWriteSettings.Builder builder = BigQueryWriteSettings.newBuilder();
    String endpoint = options.getBigQueryEndpoint();
    if (endpoint != null) {
a safer guard is Strings.isNullOrEmpty
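To illustrate the suggestion, here is a minimal, dependency-free sketch of why an empty-string check matters in addition to the null check. The `isNullOrEmpty` helper is only a stand-in for Guava's `Strings.isNullOrEmpty`, and `EndpointGuard`, `resolveEndpoint`, and both URLs are hypothetical names for illustration, not code from this PR.

```java
final class EndpointGuard {
  private EndpointGuard() {}

  // Stand-in for Guava's Strings.isNullOrEmpty, so the sketch needs no dependencies.
  static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
  }

  // Hypothetical helper: use the override only when it is actually set,
  // otherwise fall back to the default endpoint.
  static String resolveEndpoint(String override, String defaultEndpoint) {
    return isNullOrEmpty(override) ? defaultEndpoint : override;
  }

  public static void main(String[] args) {
    String defaultEndpoint = "https://bigquery.example.invalid"; // placeholder value
    String emulator = "http://localhost:9050"; // placeholder emulator address

    // A plain "!= null" guard would pass "" through as an endpoint;
    // the null-or-empty guard treats both cases as "not set".
    System.out.println(resolveEndpoint(null, defaultEndpoint));
    System.out.println(resolveEndpoint("", defaultEndpoint));
    System.out.println(resolveEndpoint(emulator, defaultEndpoint));
  }
}
```

With a guard like this, an option that was left at an empty default would not override the client's endpoint.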
@@ -1615,8 +1642,13 @@ private static BigQueryWriteClient newBigQueryWriteClient(BigQueryOptions option
        .setChannelsPerCpu(2)
        .build();

    BigQueryWriteSettings.Builder builder = BigQueryWriteSettings.newBuilder();
    String endpoint = options.getBigQueryEndpoint();
String -> @Nullable String
f6b612a to dc0cbab
Thanks @Abacn, addressed your comments.
Reminder, please take a look at this pr: @Abacn @damondouglas |
Hello, thank you for this new feature. @kberezin-nshl, may I ask how you intend users to take advantage of this new possibility? I've tried to use a test container for my pipeline, but I still get the error message. I use TestContainer and TestPipeline:

```java
BigQueryEmulatorContainer container = new BigQueryEmulatorContainer("ghcr.io/goccy/bigquery-emulator:0.4.3");
String projectId = container.getProjectId();
org.apache.beam.sdk.io.gcp.bigquery.BigQueryOptions pipelineOptions = PipelineOptionsFactory.create().as(org.apache.beam.sdk.io.gcp.bigquery.BigQueryOptions.class);
pipelineOptions.setBigQueryEndpoint(container.getEmulatorHttpEndpoint());
BigQueryServicesImpl bqService = new BigQueryServicesImpl();
bqService.getJobService(pipelineOptions);
Session session = buildExpectedSession();
WriteResult writeResult = p.apply(Create.of(session))
    .apply(ParDo.of(new ProtobufSessionToBigQueryTableRow()))
    .apply(BigQueryIO.writeTableRows()
        .withTestServices(bqService)
        .to(row -> {
          return new TableDestination(projectId+":***.***", null);
        }));
```

Thanks in advance for your help.
I have the same question as @wattache. I've tried the following:
Still get the following error:
|
I've updated on the previous example:
Now I'm getting an exception related to serialization:
|
Unfortunately, there is currently a bug that causes the behavior you're seeing. The fix has already been merged, so hopefully you'll be able to use this feature with the next release. In the meantime, you can use the following workaround. First, create this class:

```java
import java.io.IOException;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryOptions;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl;

// Re-applies the endpoint to the options each time a service is requested.
public class WorkaroundBQServices extends BigQueryServicesImpl {
  private final String bigQueryEndpoint;

  public WorkaroundBQServices(String bigQueryEndpoint) {
    this.bigQueryEndpoint = bigQueryEndpoint;
  }

  @Override
  public JobService getJobService(BigQueryOptions options) {
    options.setBigQueryEndpoint(bigQueryEndpoint);
    return super.getJobService(options);
  }

  @Override
  public DatasetService getDatasetService(BigQueryOptions options) {
    options.setBigQueryEndpoint(bigQueryEndpoint);
    return super.getDatasetService(options);
  }

  @Override
  public WriteStreamService getWriteStreamService(BigQueryOptions options) {
    options.setBigQueryEndpoint(bigQueryEndpoint);
    return super.getWriteStreamService(options);
  }

  @Override
  public StorageClient getStorageClient(BigQueryOptions options) throws IOException {
    options.setBigQueryEndpoint(bigQueryEndpoint);
    return super.getStorageClient(options);
  }
}
```

Then, in your pipeline, set it as:

```java
BigQueryIO.<...>write()
    .withTestServices(new WorkaroundBQServices(options.getBigQueryEndpoint()))
```

As I said, once the new release comes out, you should be able to remove this workaround.
@kberezin-nshl Thanks for the update, I have a follow up issue with the use of emulator, now related to credentials:
I tried setting
|
Another question, maybe related to the previous, is what to set on |
* Allow setting BigQuery endpoint
* Changes.md update
This fixes #28149