Include Column List option #343
base: master
…mns to the COPY command
This looks really useful. It would be handy to have an option to ignore certain columns, e.g. columns that use IDENTITY or columns with default values such as timestamps. That would solve #381.
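For anyone following along, here is a rough sketch of what a COPY statement with an explicit column list looks like once some columns are left out. It is not the code from this PR: `build_copy_statement`, its parameters, the table name, the S3 path and the IAM role are all made up for illustration. Redshift's COPY does accept an optional parenthesized column list after the table name, and columns omitted from that list are filled from their DEFAULT expression (IDENTITY columns are auto-generated).

```python
def build_copy_statement(table, columns, s3_path, iam_role, exclude=()):
    """Build a Redshift COPY statement that names its target columns.

    Hypothetical helper for illustration only; it is not part of
    spark-redshift. Columns listed in `exclude` (e.g. IDENTITY columns
    or columns with defaults) are left out of the column list so that
    Redshift fills them in itself.
    """
    excluded = set(exclude)
    kept = [c for c in columns if c not in excluded]
    if not kept:
        raise ValueError("no columns left to load")
    # Quote each identifier, doubling any embedded double quotes.
    column_list = ", ".join('"{}"'.format(c.replace('"', '""')) for c in kept)
    return "COPY {} ({}) FROM '{}' IAM_ROLE '{}' FORMAT AS CSV".format(
        table, column_list, s3_path, iam_role
    )


# Example with placeholder values: skip the IDENTITY and timestamp columns.
statement = build_copy_statement(
    "events",
    ["id", "name", "created_at"],
    "s3://example-bucket/tmp/",
    "arn:aws:iam::123456789012:role/example",
    exclude=["id", "created_at"],
)
print(statement)
```

Note that the staged files would then need to contain only the listed columns, in the same order as the column list.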
This would be very useful for our purposes and would make loading data into Redshift using this package far more robust for us.
@toddwildey Unfortunately this package is dead; they haven't accepted or merged any PRs on it in 2 years. We've decided to decouple our Spark processes from Redshift and handle the Redshift data access in a different layer. We're moving away from this package since it's no longer maintained and, from what I could tell, a potential fork got no traction.
Hi @kyrozetera, I'm using a (maintained) fork of this library and this patch is just what I need. Would you be interested in contributing this patch to the fork? It applies without issue once the file paths are updated (see tweaked diff attached). If you'd rather not be bothered, I'd be happy to make the contribution on your behalf.
@eeshugerman Sorry, I was out of town for the weekend, so I wasn't able to look at this until now. Looks like you've made the PR though, so 👍
Which version of the package can this change be found in?
I'm using pyspark --packages io.github.spark-redshift-community:spark-redshift_2.11:4.0.1 to test this change, but I consistently get a "Delimiter not found" error logged in stl_load_errors.
Did you set |
Closes #340