Refine fatal error configuration. #423

Open
wants to merge 1 commit into
base: melodic-devel

windelbouwman

This refines the selection of fatal and non-fatal errors.

There was no way to select BUSOFF as non fatal. My use case would be to select BUSOFF as non fatal and configure the CAN device with a restart-ms.

@mathias-luedtke
Member

> There was no way to select BUSOFF as non fatal.

It was written this way on purpose ;)

> My use case would be to select BUSOFF as non fatal and configure the CAN device with a restart-ms

Does this really work with this fix?
(I did not have a chance to test it)

@@ -244,10 +254,13 @@ class SocketCANInterface : public AsioDriver<boost::asio::posix::stream_descript
             input_.id = frame_.can_id & CAN_EFF_MASK;
             input_.is_error = 1;

-            if (frame_.can_id & fatal_error_mask_) {
+            if (frame_.can_id & error_mask_) {

Member

Not sure about this change..
This triggers the state callback on every error message and might be a little bit noisy in general.

Author

My idea was that error_mask_ is used to test for errors which must be logged. The fatal_error_mask_ is intended for errors which must put the driver in not ready mode (using setNotReady). Is this correct?

I like this idea: apart from the recovery strategy, users can configure which errors should be fatal. Another option would be to simply log all errors and only allow configuring which errors are fatal.

Here, "fatal" means that the driver is put into not ready mode.
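
To make the proposed split concrete, here is a minimal sketch, not the code in this PR. `ErrorAction` and `classifyError` are hypothetical names. The constants copy the values of `CAN_ERR_LOSTARB`, `CAN_ERR_ACK` and `CAN_ERR_BUSOFF` from `<linux/can/error.h>` so that the snippet builds without kernel headers. It also assumes that fatal takes precedence when a bit is set in both masks.

```cpp
#include <cstdint>

// Values mirror <linux/can/error.h>; redefined here so the sketch builds anywhere.
constexpr std::uint32_t kCanErrLostArb = 0x00000002U;  // CAN_ERR_LOSTARB
constexpr std::uint32_t kCanErrAck     = 0x00000020U;  // CAN_ERR_ACK
constexpr std::uint32_t kCanErrBusOff  = 0x00000040U;  // CAN_ERR_BUSOFF

// Hypothetical result type, not part of socketcan_interface.
enum class ErrorAction { Ignore, LogOnly, Fatal };

// Hypothetical helper: decide what to do with the error bits of a frame.
// Assumption: a bit present in both masks is treated as fatal.
inline ErrorAction classifyError(std::uint32_t can_id,
                                 std::uint32_t error_mask,
                                 std::uint32_t fatal_error_mask) {
    if (can_id & fatal_error_mask) {
        return ErrorAction::Fatal;    // the driver would call setNotReady() here
    }
    if (can_id & error_mask) {
        return ErrorAction::LogOnly;  // the driver would only log the error
    }
    return ErrorAction::Ignore;
}
```

With `error_mask = kCanErrAck | kCanErrBusOff` and `fatal_error_mask = kCanErrLostArb`, a BUSOFF frame would only be logged, which is the use case described in the PR description.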

Author

> This triggers the state callback on every error message and might be a little bit noisy in general

Errors should only happen every now and then (arbitration error, ack error). If there are many errors on the bus, it's probably okay to log them? Another strategy could be to rate limit the number of errors logged, and summarize them, but this would be more work.
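
The rate-limit-and-summarize strategy could look roughly like the sketch below. `ErrorLogLimiter` is a hypothetical class and nothing like it exists in this PR. It assumes one log line per `min_interval` is acceptable and that the caller supplies the current time, which keeps the class easy to test.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper, not part of socketcan_interface.
class ErrorLogLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit ErrorLogLimiter(Clock::duration min_interval)
        : min_interval_(min_interval) {}

    // Returns true if the error seen at `now` should be logged.
    // On true, `suppressed` receives the number of errors that were
    // swallowed since the previous log line, so they can be summarized.
    // On false, `suppressed` is left untouched.
    bool shouldLog(Clock::time_point now, std::size_t& suppressed) {
        if (has_logged_ && (now - last_log_) < min_interval_) {
            ++suppressed_count_;
            return false;
        }
        suppressed = suppressed_count_;
        suppressed_count_ = 0;
        last_log_ = now;
        has_logged_ = true;
        return true;
    }

private:
    Clock::duration min_interval_;
    Clock::time_point last_log_{};
    bool has_logged_ = false;
    std::size_t suppressed_count_ = 0;
};
```

A caller would invoke `shouldLog(ErrorLogLimiter::Clock::now(), n)` for each error frame and, when it returns true, log the current error together with "n similar errors suppressed".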

Author

Btw, the reason this change is important is that, when an error occurs, the driver every now and then hangs on "failed to send message".

Now that I think about it, in the current implementation all errors except the BUS_OFF error can be configured to be ignored, right? Setting the whole bunch of parameters to false will do the trick!
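
As an illustration of that behavior, the sketch below builds a mask from per-error boolean settings, with some bits that cannot be switched off. `buildMask`, its `always_on` parameter and the string keys are assumptions made for this example and do not reproduce the driver's actual settings code. The constants copy values from `<linux/can/error.h>`.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Values mirror <linux/can/error.h>; redefined here so the sketch builds anywhere.
constexpr std::uint32_t kCanErrLostArb = 0x00000002U;  // CAN_ERR_LOSTARB
constexpr std::uint32_t kCanErrAck     = 0x00000020U;  // CAN_ERR_ACK
constexpr std::uint32_t kCanErrBusOff  = 0x00000040U;  // CAN_ERR_BUSOFF

// Hypothetical helper: OR together the bit of every error whose setting is
// true. `always_on` models bits the user cannot disable; errors missing
// from `settings` are treated as disabled.
inline std::uint32_t buildMask(const std::map<std::string, bool>& settings,
                               std::uint32_t always_on) {
    static const std::map<std::string, std::uint32_t> kBits = {
        {"CAN_ERR_LOSTARB", kCanErrLostArb},
        {"CAN_ERR_ACK", kCanErrAck},
        {"CAN_ERR_BUSOFF", kCanErrBusOff},
    };
    std::uint32_t mask = always_on;
    for (const auto& entry : kBits) {
        const auto it = settings.find(entry.first);
        if (it != settings.end() && it->second) {
            mask |= entry.second;
        }
    }
    return mask;
}
```

Passing `kCanErrBusOff` as `always_on` models a mask in which BUS_OFF stays set even when every setting is false. Passing `0` makes BUS_OFF configurable like the other errors, which is what this PR is after.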

Member

> My idea was that error_mask_ is used to test for errors which must be logged.

Every error_mask entry will be reported to the error frames callback!

We could add a logging_mask for this purpose, which could default to error_mask.
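
A possible shape for such a setting, purely as a sketch: `MaskSettings` and its members are hypothetical names, and the only behavior taken from the comment above is that the logging mask falls back to `error_mask` when it is not set explicitly.

```cpp
#include <cstdint>

// Hypothetical settings holder, not part of socketcan_interface.
struct MaskSettings {
    std::uint32_t error_mask = 0;        // errors forwarded to the error frames callback
    std::uint32_t fatal_error_mask = 0;  // errors that put the driver into not ready mode
    bool has_logging_mask = false;       // true once the user configured logging_mask
    std::uint32_t logging_mask = 0;      // errors that should be logged

    // Falls back to error_mask when no explicit logging mask was configured.
    std::uint32_t effectiveLoggingMask() const {
        return has_logging_mask ? logging_mask : error_mask;
    }
};
```

The explicit `has_logging_mask` flag keeps "not configured" distinct from "configured as 0", so a user could still silence logging completely.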

Author

I'm not in favor of adding extra configuration. Could we move the logging inside the fatal error clause instead?

I guess the error frame callback is invoked by calling setInternalError?

@windelbouwman
Author

> It was written this way on purpose ;)

Why would the BUS_OFF error be a special case? I think it's fair to treat it as the other errors. What will happen with BUS_OFF, is that the output queue will fill up, and eventually the send asio call will fail / block.

@mathias-luedtke
Member

> What will happen with BUS_OFF, is that the output queue will fill up, and eventually the send asio call will fail / block.

In our use case, the output queue will fill up immediately and then the kernel driver would close the socket.

> Why would the BUS_OFF error be a special case? I think it's fair to treat it as the other errors

Agreed.
Again, I did not test this special case.
