PlaceholderArray encountered in BitMaskedArray.to_ByteMaskedArray when it shouldn't be
#524
Comments
Another indicator: passing
Is it expected to be possible to concatenate typetracers? It should be needed, since we need to know which columns to select from each input layer independently: we have no mechanism to carry the columns found to be needed for one layer across to another. So far I have found these edges:
So indeed, the second partition is receiving
I can confirm that the one-pass branch successfully computes the second failing case, but only the first time. Subsequent computes fail. The failure mode is that no required columns are passed to parquet at all. Calling
Given that #526 has a variant of this same problem, is it time to dust off the one-pass PR?
(I should say that a trivial, but not great, workaround for the issue here is to touch all inputs to a concatenate, which somehow is what the other linked issue ended up doing, presumably because of axis=1.)
Here's a reproducer:
files.tar.gz
succeeds but
fails with
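Going into more detail, the troublemaker is self._mask.data, which is a PlaceholderArray. The rehydration must be saying that this buffer is not needed, but it is needed. The concatenation needs to know which array elements are missing.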
, which is a PlaceholderArray. The rehydration must be saying that this buffer is not needed, but it is needed. The concatenation needs to know which array elements are missing.The text was updated successfully, but these errors were encountered: