[Bug]: Python SDFs (e.g. PeriodicImpulse) running in Flink and polling using tracker.defer_remainder have checkpoint size growing indefinitely #27648
Comments
@Abacn thanks for sharing the other issues for Flink SDF checkpoint support on the Python SDK. I'm not sure what you're asking me, though. Are you asking me to create a fix for this issue? I can try to do that, but I'll need some guidance, as I am not too familiar with either the Beam or Flink codebases.
Thanks @nybbles, the findings in the issue description are already very specific and clearer about the root cause than what I knew before. Happy to help from the Beam side.
FYI, the Java Portable Runner has the same issue (submit a Java PeriodicImpulse using PortableRunner via FlinkJobService). The Python Flink runner is just a wrapper that uses the job service jar, so the fix should be on the Java portable runner side.
Awesome, thank you, that's a helpful pointer. Do you mean that the cause is likely to be somewhere here? https://github.com/apache/beam/tree/master/runners/portability/java/src/main/java/org/apache/beam/runners/portability I.e. the cause is before the pipeline is translated into protobuf and sent to https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java#L37?
I am not very familiar with this area. One thing to note is that splittable DoFn support is missing for "portable" Flink runner jobs, which is the case for Python SDK jobs. Line 731 in af533fe
Maybe one starting point is here. Will dig into it.
Caused by #19637
It's really concerning that splittable DoFn support is missing for the portable Flink runner job, as this Beam/Flink capability matrix implies the opposite: https://beam.apache.org/documentation/runners/capability-matrix/unbounded-splittable-dofn-support-status/.
I am testing the Java portable runner and am making a little progress
in PortableRunner.run(), one is able to get GenerateSequence working for the portable Flink runner. The same should happen for KafkaIO. @nybbles would you mind testing your Kafka pipeline with option
@Abacn thanks for continuing to dig into this! I'm not actually using KafkaIO. I wrote an SDF to read from a Redis stream. Can I simply pass in the option you mentioned to test, or is there anything else I should do for my Redis stream SDF? I wrote this Redis stream SDF using the Python SDK.
Here's roughly what the code for my
Hi @nybbles, I see, for pure SDF this option ( Nevertheless given that UnboundedRead execution is not broken on portable flink streaming, one should be able to fix
I can confirm that it is working properly! The way to make it work is to set some expansion service params like:
However, I still have some problems making it work in the Flink k8s operator because of Beam artifact resolution. I tried to make a fix; see #28068.
I'm not using KafkaIO, but rather am running a pure SDF implemented using the Python SDK.
@Abacn sorry for the silence; I needed to meet a deadline so I ended up finding a temporary solution (calling I also turned off checkpointing and am still seeing this issue (now it is Java heap space growth, proportional to the number of calls to I am taking a look at
I took a heap dump from running my pipeline that uses a pure SDF to read from Redis (should have similar memory leak to the minimal example I posted that uses The retained memory is primarily in
@Abacn I'm having trouble finding
What happened?
Please see https://gist.github.com/nybbles/6e1f2ab31866b251ff754e22b71f8405 for code to replicate this problem.
Problem description
I am finding that using unbounded SDFs with Flink results in checkpoint sizes that grow without bound, which eventually causes the job to fail and all further job submissions to fail, even for beam.transforms.periodicsequence.PeriodicImpulse in a very simple pipeline, given below.
My pipeline consists of an SDF that reads from an unbounded source, which means that when there are no new messages, the SDF must poll the unbounded source, with some timeout. I observed that when my SDF did this polling (using tracker.defer_remainder as described in https://beam.apache.org/documentation/programming-guide/#user-initiated-checkpoint), the checkpoint size would grow.
This happens even if the unbounded source is empty, in which case my SDF simply executes a loop of polling the unbounded source, calling tracker.defer_remainder, and returning from the DoFn to relinquish control and wait to poll again.
I was concerned that I had implemented my SDF or my pipeline incorrectly, so I found beam.transforms.periodicsequence.PeriodicImpulse and tested it in a very simple pipeline, which is as follows (note that apply_windowing's value does not change the problematic behavior):
This pipeline also results in growing checkpoint size.
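The polling pattern described above can be sketched without Beam as follows. This is a minimal, dependency-free illustration and not the code from the gist: FakeTracker, process_once, and fetch are hypothetical names, and the fake tracker only records the calls that a real restriction tracker would turn into a checkpoint. In a real SDF the loop would live in the DoFn's process method, and the delay passed to tracker.defer_remainder would be a Beam duration or timestamp value rather than a float.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class FakeTracker:
    """Hypothetical stand-in for a Beam restriction tracker (illustration only)."""

    next_position: int = 0
    deferred: List[float] = field(default_factory=list)

    def try_claim(self, position: int) -> bool:
        # Unbounded restriction: accept any position not already claimed.
        if position < self.next_position:
            return False
        self.next_position = position + 1
        return True

    def defer_remainder(self, delay_seconds: float) -> None:
        # A real tracker would checkpoint the residual restriction here.
        self.deferred.append(delay_seconds)


def process_once(
    tracker: FakeTracker,
    fetch: Callable[[int], List[int]],
    start: int,
    poll_delay: float = 1.0,
) -> Iterator[int]:
    """Mirrors one invocation of an SDF's process(): emit, then defer."""
    position = start
    while True:
        records = fetch(position)  # hypothetical read from the unbounded source
        if not records:
            # Nothing to read: ask to be resumed later and relinquish control.
            tracker.defer_remainder(poll_delay)
            return
        for offset in records:
            if not tracker.try_claim(offset):
                return
            position = offset + 1
            yield offset


# Usage: an empty source yields nothing and defers exactly once per invocation.
tracker = FakeTracker()
emitted = list(process_once(tracker, lambda pos: [], start=0))
```

With an empty source, each invocation does nothing except call defer_remainder once and return, which is the situation in which the checkpoint growth was observed.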
The Flink cluster configuration and full source for the program to replicate the problem and the Docker compose to get the Flink cluster up and running are given below in the reproduction steps and in https://gist.github.com/nybbles/6e1f2ab31866b251ff754e22b71f8405.
In case it is helpful, I'll list the FLINK_PROPERTIES and PipelineOptions below.
See this email thread for more context: https://lists.apache.org/thread/7yjr1f24rdzwzofdty1h12w9m28o62sm.
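For orientation, pipeline options for a streaming Python job on a Flink cluster might be assembled along these lines. This is a sketch under stated assumptions and not the configuration used in this issue (that is in the gist): flink_portable_args is a hypothetical helper, the master address and interval are placeholder values, and the checkpointing_interval flag is assumed to be forwarded to the Flink job server. Only environment_type="DOCKER" is taken from the reproduction steps.

```python
from typing import List


def flink_portable_args(flink_master: str, checkpoint_interval_ms: int) -> List[str]:
    """Build command-line style options for a streaming pipeline on Flink."""
    return [
        "--runner=FlinkRunner",
        f"--flink_master={flink_master}",  # placeholder address, not from the issue
        "--environment_type=DOCKER",  # value mentioned in the reproduction steps
        "--streaming",
        # Assumed flag; interval in milliseconds, value is a placeholder.
        f"--checkpointing_interval={checkpoint_interval_ms}",
    ]


args = flink_portable_args("localhost:8081", 10000)
```

A list like this would typically be passed to Beam's PipelineOptions when constructing the pipeline.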
Note on priority
I followed the linked guide for setting issue priorities and set this one to priority 1 because unbounded SDFs seem to be an important component, running on Flink is an important use case, and arbitrary checkpoint size growth makes unbounded SDFs on Flink non-functional. My apologies in advance if this is the wrong priority level.
Reproduction steps
environment_type="DOCKER"
).tracker.defer_remainder
), despite the SDF not actually explicitly accumulating any state.
Issue Priority
Priority: 1 (data loss / total loss of function)
Issue Components