Fix to auto-project the state forward on measurement timeout (#798) #819
base: ros2
Conversation
```diff
@@ -736,7 +736,7 @@ void RosFilter<T>::integrateMeasurements(const rclcpp::Time & current_time)

   // If we get a large delta, then continuously predict until
   if (last_update_delta >= filter_.getSensorTimeout()) {
-    predict_to_current_time = true;
+    predict_to_current_time_ = true;
```
So the existing logic is the same in ROS 1, and that works absolutely fine, even at very high frequencies.
The aim of this logic is:
- If the user has `predict_to_current_time` set to `true`, we always predict to the current time
- If there is a large time delay between measurements, then just for that cycle, we predict to the current time so that our state estimate doesn't time out

Your change breaks this logic: if the user's sensor data ever experiences a large gap, it will permanently set `predict_to_current_time_` to `true`.

Is there a reason you can't just set `predict_to_current_time` to `true` in your configuration?
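For reference, the setting the maintainer suggests lives in the node's parameter file. Here is a minimal sketch of such a ROS 2 parameter file; the node name `ekf_filter_node` and the numeric values are illustrative examples, not taken from this PR:

```yaml
# Example robot_localization parameter file (sketch); node name and
# values are illustrative, not from this PR.
ekf_filter_node:
  ros__parameters:
    frequency: 100.0               # filter rate discussed in this thread
    sensor_timeout: 0.01           # example timeout of one 100 Hz cycle
    predict_to_current_time: true  # the setting suggested above
```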
Yes, the aim of the logic is good, but the big issue is in the second bullet point:

> If there is a large time delay between measurements, then just for that cycle, we predict to the current time so that our state estimate doesn't time out

I completely agree with the aim of that, but from an implementation standpoint it is very problematic with data that has tiny random measurement delays while the filter runs at high frequency. I already have this problem. Continuously switching between predict-to-current-time and not on successive cycles introduces more delay and makes it harder to project the state forward and update the last measurement time; the filter slows down from 100 Hz to somewhere around 10 Hz, and in some cases gets completely stuck. Measurement delays can last half a cycle, one cycle, or two successive cycles (i.e. 0.005 s, 0.01 s, or 0.02 s when running around 100 Hz), and sometimes the delay is as tiny as 0.5 ms. All of these cause problems when the filter predicts to the current time for just one cycle and then waits to see whether the next cycle is also delayed before doing it again (see the illustrative sketch at the end of this comment).
Yes, in my proposed fix, once a measurement timeout is detected, predict-to-current-time is enabled permanently and the filter always keeps projecting the state forward, but this is not as problematic as it may seem. The filter still runs its predict/update cycles and integrates incoming measurements as it should, and it stays ready to keep projecting the state forward whenever measurements are delayed or absent for a while, rather than switching that behavior on and off each cycle based on measurements whose delays are random relative to the estimation cycle period.

Of course, this means the filter keeps producing estimates by projecting the state forward even across a very large gap between measurements, and over a very long timeout the estimate will drift while the covariances grow gradually without bound. That is exactly the expected behavior of a Kalman filter, and it is up to the user to decide, in their own interface and depending on the application, what to do in those situations. For example, in my case a parallel node runs alongside the package: it performs safety checks when there is a large timeout, decides what to do with the state estimation results, and smooths the severe jumps that occur when measurements return, using adaptive covariance algorithms and so on.

I made this fix in the pull request because I wanted the filter not to stall, lag, suddenly lock up, and stop producing any output when there are very tiny measurement delays, unable to recover to its normal working cycle; that can be very problematic for the whole software stack consuming the state estimation output.

For me, with this proposed fix everything runs robustly and reliably, with the desired results, in several demanding scenarios and with challenging data, all at 100 Hz and above.
> Is there a reason you can't just set predict_to_current_time to true in your configuration?

No, I have no issue at all setting this in my config initially; that is what I was doing before. It has the same effect as my fix, since both set the parameter permanently, with one difference: when set in the config, the filter always projects the state forward from startup, whether there is a timeout or not, while with my fix it is set permanently only once a timeout is detected.

In the end both have the same effect; it is a matter of safety and preference. I prefer that the filter do this automatically when a timeout is detected, rather than having to remember to set it in the config up front and risk serious problems during operation in various scenarios. Also, this parameter is not in the default config, so a user has to know about it and add it there themselves if they hit a similar issue.
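To make the timing argument concrete, here is a standalone illustrative sketch (not robot_localization code). It assumes a 100 Hz filter cycle, a sensor timeout of 1.5 cycles, and random measurement delays between 0.5 ms and 20 ms as described above, and it shows the per-cycle predict-to-current-time decision flipping back and forth:

```cpp
// Illustrative simulation of the per-cycle predict_to_current_time
// decision under measurement jitter; all values are assumptions taken
// from the discussion, not code from robot_localization.
#include <cstdio>
#include <random>

int main()
{
  const double cycle_period = 0.01;     // 100 Hz filter cycle
  const double sensor_timeout = 0.015;  // assumed timeout of 1.5 cycles

  std::mt19937 rng(42);
  std::uniform_real_distribution<double> delay(0.0005, 0.02);  // 0.5 ms - 20 ms

  double last_measurement_time = 0.0;
  double next_stamp = cycle_period;           // stamp of the next measurement
  double next_arrival = next_stamp + delay(rng);

  for (int cycle = 1; cycle <= 20; ++cycle) {
    const double now = cycle * cycle_period;

    // Integrate every measurement that has arrived by this cycle.
    while (next_arrival <= now) {
      last_measurement_time = next_stamp;
      next_stamp += cycle_period;
      next_arrival = next_stamp + delay(rng);
    }

    const double last_update_delta = now - last_measurement_time;
    // Per-cycle decision, mirroring the condition quoted in the diff above.
    const bool predict_to_current_time = last_update_delta >= sensor_timeout;
    std::printf("cycle %2d: last_update_delta=%.4f s, predict_to_current_time=%d\n",
      cycle, last_update_delta, predict_to_current_time);
  }
  return 0;
}
```

Depending on where each random delay falls relative to the timeout, the decision toggles from cycle to cycle, which is the switching behavior described above.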
> the filter slows down from 100 Hz to somewhere around 10 Hz, and in some cases gets completely stuck
Do you have a firm understanding of why that would happen? Do you have a bag that can replicate this behavior? Something tells me we're papering over the real issue with this change. I've run the ROS 1 version at 100 Hz with 5 IMUs reporting at the same rate, and have never had so much as a hiccup. We really need to provide data that can replicate the issue. Otherwise, I'm reluctant to accept that the filter would completely stop unless I have the exact code path that would lead to that condition.
> The filter still runs its predict/update cycles and integrates incoming measurements as it should, and it stays ready to keep projecting the state forward whenever measurements are delayed or absent for a while, rather than switching that behavior on and off each cycle based on measurements whose delays are random relative to the estimation cycle period.
No, this is a problem. There are many applications where we only want the filter to produce output when it receives a measurement, and not use the kinematic model to predict to the current timestamp.
The bottom line is that this changes the filter behavior for a very large number of users, just to suit your use case. If you want to add a parameter called `predict_to_current_time_latch` or something (along with the relevant documentation), that would be fine. Then you can change the logic above to something like:
```cpp
bool predict_to_current_time = predict_to_current_time_;
...
if (last_update_delta >= filter_.getSensorTimeout()) {
  // Always predict for this cycle so the state estimate doesn't time out
  predict_to_current_time = true;

  // Only make the behavior permanent if the user opted in
  if (predict_to_current_time_latch_) {
    predict_to_current_time_ = true;
  }
}
```
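A companion sketch of how that latch parameter might be declared; the parameter name, member variable, and default value are assumptions based on the suggestion above, following the usual rclcpp pattern rather than code from this PR:

```cpp
// Hypothetical declaration, e.g. in RosFilter<T>::loadParams(); the name,
// member, and default are assumptions from the suggestion above.
predict_to_current_time_latch_ =
  this->declare_parameter<bool>("predict_to_current_time_latch", false);
```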
This fix addresses issue #798, where the filter locks up at high frequency if a timeout is detected in the measurements.

Due to an issue in the declaration of the `predict_to_current_time` parameter, the filter was not able to automatically project the state forward and continue estimating, which caused it to lock up. This can happen at any frequency once a timeout is detected, but it is more likely at high frequency because of the short cycle time and the higher probability of hitting the timeout.

Now the filter runs reliably at any low or high frequency without issue, whether or not the `predict_to_current_time` parameter is set in the config.