-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle Date type in HCatToRow #32695
Conversation
Some initial notes: - The issue (apache#20685) deals with java.sql.Date, which I wasn't able to reproduce fully (I can currently write hcatalog hadoop.hive date) - On this note, 267f76f changed the code involved so that there's a direct cast to AbstractInstant in RowUtils.java. This doesn't change much, but jfyi.
Assigning reviewers. If you would like to opt out of this review, comment R: @kennknowles for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
HCatToRowFn(Schema schema) { | ||
this.schema = schema; | ||
} | ||
|
||
@ProcessElement | ||
public void processElement(ProcessContext c) { | ||
HCatRecord hCatRecord = c.element(); | ||
c.output(Row.withSchema(schema).addValues(hCatRecord.getAll()).build()); | ||
List<Object> recordValues = | ||
hCatRecord.getAll().stream().map(this::castHDate).collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we generalize the function, e.g. just naming as "castTypes", leaving space for future improvement if other type need a conversion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for the suggestion! I moved this whole logic to a separate new util castTypes
. Please let me know if you simply meant s/castHDate/castTypes
though!
I simply use That wiki is about configure |
- s/castHDate/maybeCastHDate/ to be more concise - move values manipulation to a separate util (hopefully, I understood the cr in the right way)
Nice, thanks so much! I found this command, but forgot to update the PR description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you!
* Handle Date type in HCatToRow Some initial notes: - The issue (apache#20685) deals with java.sql.Date, which I wasn't able to reproduce fully (I can currently write hcatalog hadoop.hive date) - On this note, 267f76f changed the code involved so that there's a direct cast to AbstractInstant in RowUtils.java. This doesn't change much, but jfyi. * Run: ./gradlew :sdks:java:io:hcatalog:spotlessApply * review cr: castTypes util - s/castHDate/maybeCastHDate/ to be more concise - move values manipulation to a separate util (hopefully, I understood the cr in the right way)
This addresses #20685.
Implementation notes:
Date
values toInstant
's in HCatToRow.java per my understanding of damccorm's suggestions in the issue above.testReadHCatalogDateType
by writing a test hcatalog with a date column, reading it then, and converting the HCatRecord collection to a Row collection.I aimed to make the least invasive change, since the issue appears to be specific to hive.
Some other notes:
java.sql.Date
, which I wasn't able to reproduce fully. I could only write hcatalog withorg.apache.hadoop.hive.common.type.Date
, so I'm not really sure if we wanna account forjava.sql.Date
case as well.AbstractInstant
in RowUtils.java. This doesn't change much, but jfyi.Checklist to help us incorporate your contribution quickly and easily:
CHANGES.md
with noteworthy changes (I don't think it's that noteworthy)See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.