-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add random row generator in data generator #451
Conversation
@andygrove would you mind to take a look at this? |
fields += data | ||
case _ => | ||
val generator = RandomDataGenerator.forType(f.dataType, f.nullable, r) | ||
assert(generator.isDefined, "Unsupported type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also check f.nullable
and return null
sometimes here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RandomDataGenerator.forType(f.dataType, f.nullable, r)
handles the nullable type, so I don't think we need to handle it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm should we do the same (create nulls in forType
) for the rest of types? Then we do not need duplicated PROBABILITY_OF_NULL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm should we do the same (create nulls in forType) for the rest of types?
Let me think about that. Maybe we can remove the array type case. However I think the user specified stringGen
still have to access PROBABILITY_OF_NULL
to generate nullable strings. Otherwise, the stringGen
method itself should handle nullable by itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, please let me know what you think.
…ndomDataGenerator.forType
LGTM overall. I think this is a good addition. I left a couple of small style suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @advancedxy
Thanks everyone for reviewing. |
* feat: Add random row generator in data gen * fix * remove array type match case, which should already been handled in RandomDataGenerator.forType * fix style issue * address comments (cherry picked from commit 9125e6a)
Which issue does this PR close?
Closes #.
Rationale for this change
Follow up of #426, supports generating random rows for specified struct type.
What changes are included in this PR?
Add utility method:
generateRows
to generate random rows.How are these changes tested?
N/A.