Unable to configure a connection with duplicate streams from different namespaces #105
Comments
I had the exact same problem with Snowflake to BigQuery, where the Snowflake schema is the namespace. In my case, the workaround I found is to create a source per namespace by specifying the schema on each source. It makes the provider mostly unusable at scale.
Yep, we came to the same conclusion. But some of our source databases have 100+ schemas, and if I'm not mistaken, creating a connection per schema would require 100 replication slots, which essentially kills the source cluster.
I've hit the exact same problem, as I have circa 50 identical schemas in Postgres to sync :(
I would like to add that this was doable through Octavia (I had many identical Postgres streams all being synced through one pipeline). Yes, the YAML file had some repetition, but it was actionable. Are there any plans to introduce this functionality?
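For reference, the per-stream namespace was expressible in Octavia's connection YAML. A rough sketch of that shape, assuming field names that follow Airbyte's catalog model (illustrative only, not verified against a specific Octavia version):

```yaml
# Illustrative Octavia-style connection configuration (exact keys may
# differ between Octavia versions). The namespace field on each stream
# is what disambiguates two tables with the same name.
configuration:
  sync_catalog:
    streams:
      - config:
          selected: true
          sync_mode: full_refresh
          destination_sync_mode: overwrite
        stream:
          name: table_1
          namespace: schema_a
      - config:
          selected: true
          sync_mode: full_refresh
          destination_sync_mode: overwrite
        stream:
          name: table_1
          namespace: schema_b
```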
It seems that this is a limitation of the public API.
In the internal server API you can set the namespace per stream, and you're able to send it to the backend. For example, in the browser I can see that the POST payload looks like this: [...] At the same time, the public API has no equivalent parameters: the "configurations[].streams[]" object has only the "name", "syncMode", "cursorField", "primaryKey" and "selectedFields" parameters. I have no idea why the public API is cut down compared with the server API. This would have to be submitted as an issue against the Airbyte Platform and its public-api functionality.
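To make the gap concrete, here is a side-by-side sketch. The top-level wrapper keys are only for illustration; the internal-API stream shape is an assumption based on Airbyte's configured-catalog model, while the public-API fields are exactly the ones listed above:

```json
{
  "internal_server_api_stream_sketch": {
    "stream": { "name": "table_1", "namespace": "schema_a" },
    "config": { "selected": true }
  },
  "public_api_stream": {
    "name": "table_1",
    "syncMode": "full_refresh_overwrite",
    "cursorField": [],
    "primaryKey": [],
    "selectedFields": []
  }
}
```

Note that `public_api_stream` has no field at all in which a namespace could be carried, so two identical stream names from different schemas cannot be distinguished.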
Thanks so much for your help and the clear response, @gingeard. Ticket raised with Airbyte: airbytehq/airbyte#47140
I'm configuring a connection with Terraform from Postgres to S3. The database has multiple identical schemas (namespaces), so the table names (stream_names) are not unique:
In the destination config I have `s3_path_format` set to `${STREAM_NAME}/${NAMESPACE}/${YEAR}_${MONTH}_${DAY}_`, so the data in S3 would be partitioned by the source schema. However, I cannot configure the connection resource properly, since the `configurations.streams` block does not have a schema/namespace attribute. If I include just one entry for `table_1`, then once deployed the connection has just one stream from one of the source schemas.
When I enabled two streams for `table_1` in the UI and ran `apply` again, the plan showed a diff, so I added a duplicate stream for `table_1`. The plan then passed, but the apply failed with this error:

```
Status 400
{"detail":"The body of the request contains an invalid connection configuration. Duplicate stream found in configuration for: table_1.","type":"https://reference.airbyte.com/reference/errors","title":"bad-request","status":400}
```
How can I address this? Did I miss any configuration in the docs or is this not supported?
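For context, a minimal sketch of the Terraform configuration being described, assuming the Airbyte Terraform provider's `airbyte_connection` resource (attribute names are illustrative and the source/destination IDs are placeholders). The crux of the issue is that the stream entries accept a name but no namespace:

```hcl
# Illustrative sketch only: stream entries take a name but no namespace,
# so two identical schemas cannot both expose "table_1".
resource "airbyte_connection" "pg_to_s3" {
  source_id      = var.postgres_source_id # placeholder
  destination_id = var.s3_destination_id  # placeholder

  configurations = {
    streams = [
      {
        name      = "table_1"
        sync_mode = "full_refresh_overwrite"
      },
      # Adding a second { name = "table_1" } entry is rejected by the API:
      # "Duplicate stream found in configuration for: table_1."
    ]
  }
}
```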