[Bug]: SizeLimitExceededException thrown while sending a small message to GCP PubSub #27178

Closed
1 of 15 tasks
jkosternl opened this issue Jun 20, 2023 · 4 comments

@jkosternl

What happened?

Since Beam v2.48 (Java), there is a bug in the logic that verifies the total size of a message sent to PubSub. The message size in bytes is checked against the maximum number of messages in a batch (100) rather than against a byte limit. In v2.47 and earlier this worked fine.

Effect of the bug
Currently, in v2.48, it is not possible to use the PubsubIO class to send messages to GCP PubSub from a Pipeline. Any message whose payload and attributes exceed 100 bytes triggers a SizeLimitExceededException like this:
"Pubsub message of length XXX exceeds maximum of 100 bytes, when considering the payload and attributes. See https://cloud.google.com/pubsub/quotas#resource_limits"

Cause
The bug is caused by a commit, specifically in file sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PreparePubsubWriteDoFn.java at line 100, committed by @reuvenlax

Solution
In Beam v2.47, a similar check was performed against maxPublishBatchByteSize in PubsubIO, which worked correctly. So please go back to comparing bytes with bytes, as was done there.
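
To make the unit mismatch concrete, here is a minimal sketch of the two kinds of check described above. It is not Beam's actual code: the class, method, and constant names are hypothetical, and the 10 MB byte limit is only an assumed value for illustration. The only number taken from this report is the batch count of 100.

```java
// Illustrative sketch only; all names here are hypothetical, not Beam's real code.
class SizeCheckSketch {
    // Maximum number of messages per publish batch (the "100" in the error text).
    static final int MAX_BATCH_MESSAGE_COUNT = 100;

    // Assumed byte limit, chosen for illustration only.
    static final int MAX_BATCH_BYTE_SIZE = 10 * 1024 * 1024;

    // Unit mismatch: a size in bytes is compared against a message count.
    static boolean rejectedByCountCheck(byte[] payload) {
        return payload.length > MAX_BATCH_MESSAGE_COUNT;
    }

    // Bytes compared with bytes, as requested in the Solution section.
    static boolean rejectedByByteCheck(byte[] payload) {
        return payload.length > MAX_BATCH_BYTE_SIZE;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[500]; // a small 500-byte message
        System.out.println("count-based check rejects: " + rejectedByCountCheck(payload));
        System.out.println("byte-based check rejects:  " + rejectedByByteCheck(payload));
    }
}
```

With a 500-byte payload, the count-based comparison rejects the message while the byte-based comparison accepts it, which matches the behaviour reported here.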

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@chamikaramj
Contributor

This seems like a regression in 2.48.0 caused by #26063. @reuvenlax can you check?

@chamikaramj
Contributor

Also making this a blocker for the 2.49.0 release.

@chamikaramj chamikaramj added P1 and removed P2 labels Jun 22, 2023
@chamikaramj chamikaramj added this to the 2.49.0 Release milestone Jun 22, 2023
@chamikaramj
Contributor

Ah, seems like it was already fixed: #27000

Closing this issue. Please re-open if the issue was not addressed by the above.

@jkosternl
Author

Ah yes, you're right. I didn't find that bug while searching, unfortunately. Thanks @chamikaramj
