consumer.on('event.log', ...) spams "Handle is terminating: failed 0 request(s) in retry+outbuf" upon consumer.disconnect #531

Closed
atamon opened this issue Nov 26, 2018 · 6 comments

Comments

@atamon
Contributor

atamon commented Nov 26, 2018

Thanks for providing a great nodejs library for Kafka!

Environment Information

  • OS [e.g. Mac, Arch, Windows 10]: Linux, through Docker on Kubernetes
  • Node Version [e.g. 8.2.1]: 10.2.1
  • NPM Version [e.g. 5.4.2]: 6.4
  • C++ Toolchain [e.g. Visual Studio, llvm, g++]: g++ (through your npm scripts)
  • node-rdkafka version [e.g. 2.3.3]: 2.3.3

Steps to Reproduce

  1. Connect a new KafkaConsumer (https://github.com/Yolean/kafka-cache/blob/master/lib/kafka.js#L158)
  2. Listen for librdkafka messages using consumer.on('event.log', ....) (https://github.com/Yolean/kafka-cache/blob/master/lib/kafka.js#L173)
  3. Disconnect the consumer after its usage (https://github.com/Yolean/kafka-cache/blob/master/lib/kafka.js#L201)
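
For readers without access to the linked file, the three steps can be sketched roughly as follows. This is not the code from kafka-cache: `wireConsumer` is a hypothetical helper, and `consumer` is assumed to behave like a node-rdkafka KafkaConsumer created with the `debug` setting mentioned below.

```javascript
// Hypothetical helper illustrating the reproduction steps; `consumer` is
// assumed to expose the node-rdkafka KafkaConsumer methods used here.
function wireConsumer(consumer, topics, onLog) {
  // Step 2: forward librdkafka log events to the caller.
  consumer.on('event.log', onLog);

  consumer.on('ready', () => {
    consumer.subscribe(topics);
    // Calling consume() with no arguments starts flowing mode.
    consumer.consume();
  });

  // Step 1: connect the consumer.
  consumer.connect();

  // Step 3: the caller invokes the returned function when it is done.
  return () => consumer.disconnect();
}
```

With the real library, `consumer` would be something like `new Kafka.KafkaConsumer({ 'group.id': ..., 'metadata.broker.list': ..., debug: 'broker' }, {})`, and `onLog` receives an object that carries a `message` field.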

Logs get spammed with:

kafka-1.broker.kafka.svc.cluster.local:9092/1: Handle is terminating: failed 0 request(s) in retry+outbuf

"Spammed" here means a few hundred messages per second, enough to consume a significant amount of CPU and block normal operations in the process.

This occurs seemingly at random and is therefore hard to reproduce, but I was hoping to get some insight into what I might be doing wrong here.

node-rdkafka Configuration Settings
debug parameter set to broker.

Additional context

@webmakersteve
Contributor

Is there a problem manifesting or are the logs just noisy? Does the disconnection get blocked?

How are you consuming? Are you using "flowing mode" or batch mode?

Broker logs relate to interactions with the broker, and they can often be noisy. Since Node is single-threaded, receiving too many of them will indeed block the Node thread.
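
One way to keep a burst of identical log lines from dominating the event loop is to count repeats and write a summary periodically. The sketch below is an assumption, not something proposed in this thread: `createLogCollapser` is a hypothetical helper, and it only reduces the cost of writing logs, since the handler still runs once per event.

```javascript
// Hypothetical helper: collapse repeated log messages so that a burst costs
// one Map update per event instead of one write per event.
function createLogCollapser(write, flushMs = 1000, now = Date.now) {
  const counts = new Map(); // message -> occurrences since the last flush
  let lastFlush = now();

  // Write one line per distinct message, with a repeat count if needed.
  function flush() {
    for (const [message, count] of counts) {
      write(count > 1 ? `${message} (x${count})` : message);
    }
    counts.clear();
    lastFlush = now();
  }

  // Intended as the 'event.log' listener; expects an object with `message`.
  function onLog(log) {
    counts.set(log.message, (counts.get(log.message) || 0) + 1);
    if (now() - lastFlush >= flushMs) flush();
  }

  return { onLog, flush };
}
```

Usage would look like `const c = createLogCollapser(console.log); consumer.on('event.log', c.onLog);`, with a final `c.flush()` once the consumer has disconnected so that pending counts are not lost.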

@atamon
Contributor Author

atamon commented Dec 11, 2018

@webmakersteve Thanks for following up on this issue!

How are you consuming? Are you using "flowing mode" or batch mode?

Flowing mode.

Is there a problem manifesting or are the logs just noisy? Does the disconnection get blocked?

The service gets blocked in general. So the disconnection will likely get blocked too. That feels like a logical reason as to why this keeps on going "forever" too.

I had a similar problem yesterday, but this time on my local dev machine (Yolean/kafka-cache#22). This led me to believe that the issue was related to connection issues with the broker (as the broker was just being set up locally).

@webmakersteve
Contributor

Flowing mode may be the cause of the problem here. I am likely going to get rid of it as a C++ land function when a higher order consumer is made in the future because of how many problems it causes and how hard it is to debug.

I would first recommend trying the stream, or calling consume manually, to see if that fixes what you're seeing in production.
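
As a rough illustration of "calling consume manually", the sketch below polls in non-flowing mode by passing a batch size and a callback to `consume`. The helper names `fetchBatch` and `startPolling` are hypothetical, and `consumer` is assumed to be an already connected and subscribed node-rdkafka KafkaConsumer.

```javascript
// Hypothetical helper: request up to `batchSize` messages once
// (non-flowing mode, using the consume(number, callback) form).
function fetchBatch(consumer, batchSize, onMessages, onError) {
  consumer.consume(batchSize, (err, messages) => {
    if (err) return onError(err);
    if (messages && messages.length > 0) onMessages(messages);
  });
}

// Hypothetical helper: poll on a timer and return a stop function.
function startPolling(consumer, batchSize, intervalMs, onMessages, onError) {
  const timer = setInterval(
    () => fetchBatch(consumer, batchSize, onMessages, onError),
    intervalMs
  );
  // Stop polling first, then disconnect.
  return () => {
    clearInterval(timer);
    consumer.disconnect();
  };
}
```

The stream alternative would be `Kafka.KafkaConsumer.createReadStream(globalConfig, topicConfig, { topics: [...] })`, which manages the consumer itself. Whether either approach avoids the log flood described above is not confirmed in this thread.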

@atamon
Contributor Author

atamon commented Dec 12, 2018

Hmm yeah maybe the streaming API can support our current use-cases as well. Thanks for the clarifications! I'll close this issue as you gave me a few ideas to try out!

Cheers!

@atamon atamon closed this as completed Dec 12, 2018
@solsson

solsson commented Dec 12, 2018

I am likely going to get rid of it as a C++ land function when a higher order consumer is made in the future because of how many problems it causes and how hard it is to debug.

@webmakersteve That's very interesting roadmap information. I think it could benefit both new users and this project's community to have only one way of doing things.

I'm not sure I understood the C++ land precondition, but how about launching a 3.x branch with APIs removed already? It would make the forward path clear and let us evaluate and help document how it affects use cases.

@webmakersteve
Contributor

webmakersteve commented Dec 13, 2018

I do have a 3.x branch but it doesn't have those proposed changes just yet. Not much parallel development happens on this project since it's mostly me over here on this end, and of course all my wonderful contributors helping out otherwise 😄. But I'll take the suggestion to heart and try to get that going sooner than I originally planned to so the community can see the proposals!

Edit: Turns out that 3.x branch is only local. Will clean it up a bit and push it up.
