[Bug]: Python sdk harness failed: TypeError: can only concatenate str (not "NoneType") to str #28131
Comments
When I comment out the WindowInto line the error goes away, but the pipeline still doesn't function as expected -- which might be an issue with my custom DoFn:

```python
with beam.Pipeline(options=options) as pipeline:
    messages = (
        pipeline
        | f"Read from input topic {subscription_id}" >> beam.io.ReadFromPubSub(
            subscription=subscription_id, with_attributes=False)
        | f"Deserialize Avro {subscription_id}" >> beam.ParDo(
            ConfluentAvroReader(schema_registry_conf)).with_outputs(
                "record", "error"))

    records = messages["record"]
    errors = messages["error"]

    (records
     | 'Aggregate msgs in fixed window' >> beam.WindowInto(
         beam.window.FixedWindows(15))
     | 'Send hardcoded value to datadog' >> beam.ParDo(SendToDatadog())
     | 'Print results' >> beam.Map(print))
```
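For context, here is a minimal sketch of what a DoFn like SendToDatadog might look like. This is a hypothetical reconstruction -- the issue never shows the actual implementation -- assuming the Datadog v1 series endpoint and an API key read from the environment:

```python
import json
import os
import time

import apache_beam as beam
import requests


class SendToDatadog(beam.DoFn):
    """Hypothetical sketch; the author's actual DoFn is not shown in the issue."""

    def process(self, element):
        # Post a hardcoded count metric for every element seen. The API key
        # is pulled from an environment variable purely for illustration.
        payload = {"series": [{"metric": "beam.pipeline.records",
                               "points": [[int(time.time()), 1]],
                               "type": "count"}]}
        resp = requests.post(
            "https://api.datadoghq.com/api/v1/series",
            headers={"DD-API-KEY": os.environ["DD_API_KEY"],
                     "Content-Type": "application/json"},
            data=json.dumps(payload),
            timeout=10)
        resp.raise_for_status()
        yield element
```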
The error happens in the SDK internals and is rather strange. It sounds as though the runner sent a malformed request to the SDK. I would suggest you try again, and if the issue still persists, provide a minimal pipeline that reproduces it that we could try out, or work with Dataflow customer support.
Thank you for the feedback! The issue has persisted -- I'm working on a minimal pipeline that can reproduce the error now, roughly along the lines sketched below.
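A stripped-down repro might look like this -- the subscription path and options are placeholders, assuming the error is triggered by WindowInto on a Pub/Sub source:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "Read" >> beam.io.ReadFromPubSub(
         subscription="projects/my-project/subscriptions/my-sub")
     | "Window" >> beam.WindowInto(beam.window.FixedWindows(15))
     | "Print" >> beam.Map(print))
```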
Hi, any news about the repro? Thanks!
Thanks for the follow-up! I ended up going a different route: I removed the Datadog module from the pipeline and set up a separate container to relay messages from Pub/Sub to Datadog, roughly as sketched below. It feels like Beam/Dataflow is structured around data transformation -- trying to coerce it into making external API calls seems like a dumb idea in hindsight (especially on an unpeered VPC).
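For illustration, the relay container could be as simple as a standalone subscriber that turns each message into the same v1 series POST used in the DoFn sketch above -- the subscription path, metric name, and environment variable are placeholders, not the author's actual code:

```python
import json
import os
import time

import requests
from google.cloud import pubsub_v1

SUBSCRIPTION = "projects/my-project/subscriptions/my-sub"  # placeholder


def relay(message):
    # Emit one count metric per Pub/Sub message, then ack it.
    payload = {"series": [{"metric": "relay.messages",
                           "points": [[int(time.time()), 1]],
                           "type": "count"}]}
    requests.post("https://api.datadoghq.com/api/v1/series",
                  headers={"DD-API-KEY": os.environ["DD_API_KEY"],
                           "Content-Type": "application/json"},
                  data=json.dumps(payload),
                  timeout=10)
    message.ack()


subscriber = pubsub_v1.SubscriberClient()
# Blocks forever, invoking relay() for every incoming message.
subscriber.subscribe(SUBSCRIPTION, callback=relay).result()
```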
What happened?
I'm trying to send custom metrics to Datadog using a DoFn, but the Python SDK harness is failing with an error that I don't know how to interpret (the TypeError in the title). I'm running the streaming pipeline as a Flex Template on GCP Dataflow, using the Python requests module for the POST calls.
Issue Priority
Priority: 3 (minor)
Issue Components