JobWorker.PollJobs throws System.ObjectDisposedException - Safe handle has been closed. #245

Open
terjeinnerdal opened this issue Apr 14, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@terjeinnerdal
Contributor

terjeinnerdal commented Apr 14, 2021

Describe the bug
After a network issue in one of our services, we got a System.ObjectDisposedException when the worker tried to poll for new jobs. JobWorker.IsOpen() still returns true, so we cannot detect that the worker is in an unrecoverable state. The message in the exception is "Job polling failed".

System.ObjectDisposedException: Safe handle has been closed.
Object name: 'SafeHandle'.
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
   at Grpc.Core.Internal.ChannelSafeHandle.CreateCall(CallSafeHandle parentCall, ContextPropagationFlags propagationMask, CompletionQueueSafeHandle cq, String method, String host, Timespec deadline, CallCredentialsSafeHandle credentials)
   at Grpc.Core.Internal.AsyncCall`2.CreateNativeCall(CompletionQueueSafeHandle cq)
   at Grpc.Core.Internal.AsyncCall`2.Initialize(CompletionQueueSafeHandle cq)
   at Grpc.Core.Internal.AsyncCall`2.StartServerStreamingCall(TRequest msg)
   at Grpc.Core.Calls.AsyncServerStreamingCall[TRequest,TResponse](CallInvocationDetails`2 call, TRequest req)
   at Grpc.Core.DefaultCallInvoker.AsyncServerStreamingCall[TRequest,TResponse](Method`2 method, String host, CallOptions options, TRequest request)
   at Grpc.Core.Interceptors.InterceptingCallInvoker.<AsyncServerStreamingCall>b__5_0[TRequest,TResponse](TRequest req, ClientInterceptorContext`2 ctx)
   at Grpc.Core.ClientBase.ClientBaseConfiguration.ClientBaseConfigurationInterceptor.AsyncServerStreamingCall[TRequest,TResponse](TRequest request, ClientInterceptorContext`2 context, AsyncServerStreamingCallContinuation`2 continuation)
   at Grpc.Core.Interceptors.InterceptingCallInvoker.AsyncServerStreamingCall[TRequest,TResponse](Method`2 method, String host, CallOptions options, TRequest request)
   at GatewayProtocol.Gateway.GatewayClient.ActivateJobs(ActivateJobsRequest request, CallOptions options)
   at GatewayProtocol.Gateway.GatewayClient.ActivateJobs(ActivateJobsRequest request, Metadata headers, Nullable`1 deadline, CancellationToken cancellationToken)
   at Zeebe.Client.Impl.Commands.JobActivator.SendActivateRequest(ActivateJobsRequest request, Nullable`1 requestTimeout, Nullable`1 cancellationToken)
   at Zeebe.Client.Impl.Commands.ActivateJobsCommand.Send(Nullable`1 timeout, Nullable`1 cancellationToken)
   at Zeebe.Client.Impl.Misc.TransientGrpcErrorRetryStrategy.DoWithRetry[TResult](Func`1 action)
   at Zeebe.Client.Impl.Commands.ActivateJobsCommand.SendWithRetry(Nullable`1 timespan, Nullable`1 cancellationToken)
   at Zeebe.Client.Impl.Worker.JobWorker.PollJobs(ITargetBlock`1 input, CancellationToken cancellationToken)
   at Zeebe.Client.Impl.Worker.JobWorker.<>c__DisplayClass16_0.<<Open>b__2>d.MoveNext()
   --- End of inner exception stack trace ---

To Reproduce
Steps to reproduce the behavior:

  1. Start a worker
  2. Start a workflow instance
  3. Cut the connection to the cloud and restore after a while

Expected behavior
Either the JobWorker handles this internally, or JobWorker.IsOpen() returns false so that a new worker can be registered.

Environment (please complete the following information):

  • Zeebe running in Camunda Cloud
  • OS: OpenShift, Linux, Alpine, .NET Core 3.1
  • zb-client 0.19.0

Additional context
The worker is registered like this:

_zeebeClient.NewWorker()
    .JobType(workerName + "." + jobType)
    .Handler(jobHandler)
    .Name(workerName)
    .MaxJobsActive(10)
    .PollInterval(TimeSpan.FromSeconds(1))
    .Timeout(TimeSpan.FromSeconds(10))
    .PollingTimeout(TimeSpan.FromSeconds(30))
    .Open();
terjeinnerdal added the bug label Apr 14, 2021
@ChrisKujawa
Collaborator

Thanks @terjeinnerdal for reporting. I'll try to take a look as soon as possible, but please understand that we are currently preparing 1.0 for Zeebe, so it might take some time.

@ChrisKujawa
Collaborator

Hey @terjeinnerdal, I'm wondering whether your ZeebeClient was disposed during the network outage. Could that be the case?
