[Bug] - The Python DoOutputsTuple force Tagged PCollection to is_bounded=True #29116

Closed
ad-momo opened this issue Oct 24, 2023 · 2 comments

ad-momo commented Oct 24, 2023

Hello,

When you apply beam.Partition to an unbounded PCollection, the tagged PCollections returned by DoOutputsTuple have is_bounded=True.

If the source is unbounded, shouldn't the PCollections produced by the partition be unbounded too?

The tagged PCollection is created by this line, which does not pass is_bounded:

pcoll = PCollection(self._pipeline, tag=tag, element_type=typehints.Any)
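
To make the mechanism easier to see, here is a minimal sketch that uses only the standard library. It is not Beam code: ToyPCollection, ToyOutputsTuple and propagate_boundedness are hypothetical stand-ins, and the only assumptions carried over from Beam are that is_bounded defaults to True when it is not passed and that tagged outputs are created lazily on first access.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyPCollection:
    """Hypothetical stand-in for a PCollection; is_bounded defaults to True."""
    tag: Optional[str] = None
    is_bounded: bool = True


@dataclass
class ToyOutputsTuple:
    """Hypothetical stand-in for DoOutputsTuple: tagged outputs are created lazily."""
    main: ToyPCollection
    propagate_boundedness: bool = False
    _pcolls: Dict[str, ToyPCollection] = field(default_factory=dict)

    def __getitem__(self, tag) -> ToyPCollection:
        tag = str(tag)  # accept int tags, as a partition function would produce
        if tag not in self._pcolls:
            # Reported behaviour: is_bounded is not passed, so the default applies.
            pcoll = ToyPCollection(tag=tag)
            if self.propagate_boundedness:
                # Behaviour the issue asks for: copy the flag from the main output.
                pcoll.is_bounded = self.main.is_bounded
            self._pcolls[tag] = pcoll
        return self._pcolls[tag]


unbounded_main = ToyPCollection(is_bounded=False)
reported = ToyOutputsTuple(main=unbounded_main)
expected = ToyOutputsTuple(main=unbounded_main, propagate_boundedness=True)

print(reported[0].is_bounded)  # True: the unbounded state of the source is lost
print(expected[0].is_bounded)  # False: the tagged output follows the source
```

In this toy model the tagged output ends up bounded only because the constructor default is used, which seems to be what happens with the line above.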

@ad-momo ad-momo changed the title [PYTHON] - DoOutputsTuple force Tagged PCollection to is_bounded=True [BUG] - DoOutputsTuple force Tagged PCollection to is_bounded=True Oct 24, 2023
@ad-momo ad-momo changed the title [BUG] - DoOutputsTuple force Tagged PCollection to is_bounded=True [BUG] - The Python DoOutputsTuple force Tagged PCollection to is_bounded=True Oct 24, 2023
@ad-momo ad-momo changed the title [BUG] - The Python DoOutputsTuple force Tagged PCollection to is_bounded=True [Bug] - The Python DoOutputsTuple force Tagged PCollection to is_bounded=True Oct 24, 2023

ad-momo commented Oct 25, 2023

Hello,

Here is a workaround while waiting for a response from the maintainers:

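from __future__ import annotations  # lets the `int | str | None` hint work before Python 3.10

from apache_beam.pvalue import DoOutputsTuple

# Tagged outputs must have the same is_bounded state as the source.
initial_get_item = DoOutputsTuple.__getitem__


def new_get_item(self: DoOutputsTuple, tag: int | str | None):
    # Remember whether the tag was cached before the original lookup.
    tag_not_in_pcolls = tag not in self._pcolls

    pcoll = initial_get_item(self, tag)

    if tag_not_in_pcolls:
        assert self.producer is not None
        # Main output of the inner ParDo.
        pval = self.producer.parts[0].outputs[None]

        # Propagate the bounded state to the tagged output.
        pcoll.is_bounded = pval.is_bounded

    return pcoll


# Monkey-patch the class so every DoOutputsTuple uses the wrapper.
DoOutputsTuple.__getitem__ = new_get_item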

@ad-momo ad-momo closed this as completed Oct 30, 2023

ad-momo commented Oct 30, 2023

Re-created using the issue report template: #29196

@github-actions github-actions bot added this to the 2.52.0 Release milestone Oct 30, 2023