This repository has been archived by the owner on May 12, 2021. It is now read-only.
A finding from #31.
A user created an update to remove instances from a job. This throws a NullPointerException, as mentioned in the issue above. The LoggingInterceptor swallows the exception, because the initial evaluation of the update runs within the user's RPC call (follow along the start(...) method if you are not convinced).
Although the above start command throws a NullPointerException, the update is still added to the MemJobUpdateStore, but it is never persisted to the log. The start(...) code still calls saveJobUpdate(...), which adds the update to the memory stores. However, the storage system in the scheduler is transactional: everything is written to the log at the end of the write. Because the NullPointerException is thrown before the write lock is exited, these operations never reach the log. As a result, the memory store no longer matches the log store.
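To make the failure mode concrete, here is a minimal sketch of how a memory store and a log can diverge when an exception escapes a write. It is not the scheduler's actual code: DivergenceSketch, its fields, and the write(...) helper are hypothetical stand-ins, and only the saveJobUpdate name is borrowed from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the scheduler's storage layer; not Aurora's real classes.
public class DivergenceSketch {
    final List<String> memStore = new ArrayList<>();          // plays the role of MemJobUpdateStore
    final List<String> log = new ArrayList<>();               // plays the role of the persisted log
    private final List<String> pending = new ArrayList<>();   // operations awaiting the log flush

    interface Work {
        void apply(DivergenceSketch storage);
    }

    // Applied to memory immediately, but only queued for the log.
    void saveJobUpdate(String update) {
        memStore.add(update);
        pending.add(update);
    }

    // The queued operations reach the log only if the work returns normally.
    synchronized void write(Work work) {
        pending.clear();
        work.apply(this);
        log.addAll(pending);
    }

    public static void main(String[] args) {
        DivergenceSketch storage = new DivergenceSketch();
        try {
            storage.write(s -> {
                s.saveJobUpdate("update-1");
                throw new NullPointerException("simulated failure in start(...)");
            });
        } catch (NullPointerException e) {
            // Swallowed here, standing in for an interceptor that only logs the error.
        }
        // prints "memory: [update-1], log: []"
        System.out.println("memory: " + storage.memStore + ", log: " + storage.log);
    }
}
```

The process keeps running with an update in memory that a restart would not replay from the log.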
I think that we should catch all unhandled exceptions within the write lock and immediately kill the scheduler. This would keep such errors from leaving the scheduler in a potentially inconsistent state and corrupting the log, which would prevent an easy rollback.
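A rough sketch of what this proposal could look like follows. The FailFastWrite class, its write(...) method, and the injected onFatal handler are all hypothetical names chosen for illustration, and the sketch assumes that any RuntimeException escaping the write should be treated as fatal.

```java
import java.util.function.Consumer;

// Hypothetical fail-fast wrapper around a write; not the scheduler's real API.
public class FailFastWrite {
    interface Work {
        void apply();
    }

    private final Object writeLock = new Object();
    private final Consumer<Throwable> onFatal;

    // The fatal handler is injected so that tests can observe it without exiting the JVM.
    FailFastWrite(Consumer<Throwable> onFatal) {
        this.onFatal = onFatal;
    }

    void write(Work work) {
        synchronized (writeLock) {
            try {
                work.apply();
                // A real implementation would flush the transaction to the log here.
            } catch (RuntimeException e) {
                // Memory may already differ from the log, so do not keep serving.
                onFatal.accept(e);
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        FailFastWrite storage = new FailFastWrite(t -> {
            System.err.println("Unhandled exception inside write lock: " + t);
            Runtime.getRuntime().halt(1); // terminate immediately, skipping shutdown hooks
        });
        storage.write(() -> {
            throw new NullPointerException("simulated failure in start(...)");
        });
    }
}
```

Halting rather than exiting normally is one possible choice here: it avoids running shutdown hooks that might try to persist the inconsistent in-memory state. On restart, the scheduler would rebuild its memory stores from the log, which never recorded the failed update.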