feat: potentially add xmin replication? #219
Comments
Interesting! I had someone else ask about this as well recently; it seems like we could definitely do this. Like you said, discovery would need to pull these additional columns, and we could just add the two explicitly. @qbatten, if you overrode the schema for the table you're trying to pull (annoying, but worth a shot) and added xmin, I wonder if it'd magically work.
Hm, okay, there is a lot of complexity here. On your suggestion, I added a schema extra for xmin to the extractor (see the meltano.yml code block below). This got us a step further: Meltano did try to use xmin in the SELECT statement. However, Postgres complains that there is no ordering operator for data type xid. That makes sense, because xmin's data type is xid and ordering on xids is not straightforward. We could cast it to text and then Postgres won't complain, but I believe ORDER BY xmin::text would not behave the way we want. It looks like updates to xmin have fairly complex behavior as well (of course, I guess; link). Other useful links I ran into: some vaguely related discussion, txid docs, oid docs. All this now has me wondering how Airbyte handles this.

meltano.yml code block

    - name: my_pipeline-0
      inherit_from: tap-postgres-my_pipeline
      schema:
        public-my_table:
          xmin:
            type: ["int", "null"]

Error thrown in meltano
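To illustrate why ordering by xmin::text is unlikely to give the intended result, here is a minimal Python sketch. The xmin values are made up for illustration; the point is only that text sorts character by character, so "1000" lands before "998". Note that even a numeric ordering does not account for xid wraparound, so this shows just one of the problems.

```python
# Hypothetical xmin values, as they might look after `xmin::text` (made-up data).
xmin_as_text = ["998", "1000", "12", "101"]

# What ORDER BY xmin::text would effectively do: compare character by character.
lexicographic = sorted(xmin_as_text)

# What we would actually want (ignoring xid wraparound): compare as numbers.
numeric = sorted(xmin_as_text, key=int)

print("text order:   ", lexicographic)
print("numeric order:", numeric)
```

With text ordering, the row with xmin 998 sorts last even though it is older than the row with xmin 1000, so a bookmark based on it would be misleading.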
Ah, here's the meat of Airbyte's xmin-handling code
Nice work @qbatten, and good find for sure. The conversion could go here: https://github.com/MeltanoLabs/tap-postgres/blob/main/tap_postgres/client.py#L231-L232. We'd just hardcode it: if the column name is xmin, do the conversion 🤷, along with the logic here. Those writeups you found are good too, and if we really want this feature we should add them in as warnings. It seems like xmin isn't the best incremental key, but I'd guess that for some folks with very large tables it's better than nothing.
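A rough sketch of what "hardcode the conversion when the column is xmin" might look like, written as plain string building rather than against the tap's actual code. The function names, the `%(bookmark)s` placeholder, and the choice of the `xmin::text::bigint` cast are all assumptions for illustration, not the plugin's API, and the cast does not handle xid wraparound.

```python
from typing import Optional


def replication_key_expression(column_name: str) -> str:
    """Return the SQL expression to filter and order on (hypothetical helper)."""
    if column_name == "xmin":
        # Assumption: cast xid -> text -> bigint so Postgres has an ordering operator.
        return "xmin::text::bigint"
    return f'"{column_name}"'


def build_incremental_query(table: str, replication_key: str,
                            bookmark: Optional[int]) -> str:
    """Build an incremental SELECT for `table` (hypothetical helper)."""
    expr = replication_key_expression(replication_key)
    # xmin is a system column, so `SELECT *` would not return it; select it explicitly.
    columns = f"*, {expr} AS xmin" if replication_key == "xmin" else "*"
    query = f"SELECT {columns} FROM {table}"
    if bookmark is not None:
        # Placeholder is left for the DB driver to bind; never inline the value.
        query += f" WHERE {expr} >= %(bookmark)s"
    return query + f" ORDER BY {expr}"


print(build_incremental_query("public.my_table", "xmin", bookmark=1000))
```

Any real change would also need discovery to expose xmin in the catalog, plus the warnings mentioned above.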
Airbyte has had this as of July 10, see here. I haven't spent time scoping this out, but it doesn't work on the current version of this plugin. I suspect this would be easy to add: I assume the catalog is excluding hidden/system columns and it could just include them(?)
Specifically, I tried to just set xmin as the replication column, and it looks like it errors, saying that the column isn't in the catalog.