The write operation currently generates a `COPY` command like the following:
COPY "PUBLIC"."some_table"FROM's3://some-bucket/tmp/manifest.json' CREDENTIALS 'aws_access_key_id=__;aws_secret_access_key=__' FORMAT AS CSV NULLAS'@NULL@' manifest
This relies on the DataFrame having its columns in the same order as the table, if the table already exists. However, the `COPY` command supports specifying a column list or a JSONPath expression to map columns (documentation). It would be nice to at least support the column list, potentially as an option on the write operation, as sketched below. Looks like this should be fairly straightforward to add here.
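A hypothetical sketch of what that could look like from the caller's side; the `columnList` option name is invented for illustration and is not part of the connector's actual API:

```python
# Hypothetical sketch only: "columnList" is NOT an existing spark-redshift
# option; it illustrates the proposed feature. `df` is any DataFrame whose
# column order may differ from the target table's.
(df.write
    .format("com.databricks.spark.redshift")
    .option("url", "jdbc:redshift://host:5439/db?user=user&password=pass")
    .option("dbtable", "PUBLIC.some_table")
    .option("tempdir", "s3://some-bucket/tmp")
    # Proposed: emit a column list so COPY maps columns by name, e.g.
    #   COPY "PUBLIC"."some_table" (id, name, created_at) FROM ...
    .option("columnList", "id, name, created_at")
    .mode("append")
    .save())
```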
So this hasn't been merged yet. Is there any other solution, short of making sure the column order in the DataFrame matches the table? I'd like to use CSV GZIP instead of AVRO, but I can't append to existing tables because the CSV's column order differs from the table's.
Nope, I've just been making sure to run a final `.select()` to set the column order at the end of my DataFrame operations before writing to Redshift (see the sketch below). Unfortunately, the connector is no longer being maintained, so it doesn't look like this will be merged... 😞
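A minimal sketch of that workaround, assuming a target table `PUBLIC.some_table` with columns `id`, `name`, `created_at` (the table, paths, and credentials are illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("redshift-append").getOrCreate()
df = spark.read.parquet("s3://some-bucket/input/")  # columns in arbitrary order

# COPY maps CSV columns by position, so reorder the DataFrame to match the
# existing table's column order before handing it to the connector.
table_column_order = ["id", "name", "created_at"]  # must match the Redshift table
(df.select(*table_column_order)
    .write
    .format("com.databricks.spark.redshift")
    .option("url", "jdbc:redshift://host:5439/db?user=user&password=pass")
    .option("dbtable", "PUBLIC.some_table")
    .option("tempdir", "s3://some-bucket/tmp")
    .option("tempformat", "CSV GZIP")  # default temp format is AVRO
    .option("forward_spark_s3_credentials", "true")
    .mode("append")
    .save())
```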