Skip to content

Commit

Permalink
Update logging for failed async operations (radius-project#7370)
Browse files Browse the repository at this point in the history
# Description

Controllers currently have two ways of bubbling up errors to the async
worker:
1. By returning the error, which is then added to the result by the
async worker.
2. By directly returning a non-empty result with error message populated
and provisioning state set to "failed".

We have a few gaps in the logging today which results into false data or
no data, this PR aims to fix these gaps -
* Error is not logged if it is returned as a part of the result
(scenario 2 above).
* Operation is logged as successful if a direct error is returned
(scenario 1 above).
* Provisioning state is set to succeeded even if the controller returns
an error (scenario 1 above).

## Type of change

<!--

Please select **one** of the following options that describes your
change and delete the others. Clearly identifying the type of change you
are making will help us review your PR faster, and is used in authoring
release notes.

If you are making a bug fix or functionality change to Radius and do not
have an associated issue link please create one now.

-->

- This pull request fixes a bug in Radius and has an approved issue
(issue link required).

<!--

Please update the following to link the associated issue. This is
required for some kinds of changes (see above).

-->

Part of radius-project#7369

---------

Signed-off-by: karishma-chawla <[email protected]>
Co-authored-by: karishma-chawla <[email protected]>
  • Loading branch information
kachawla and kachawla authored Mar 27, 2024
1 parent f863351 commit 1fc0eaa
Showing 1 changed file with 7 additions and 11 deletions.
18 changes: 7 additions & 11 deletions pkg/armrpc/asyncoperation/worker/worker.go
Original file line number Diff line number Diff line change
Expand Up @@ -251,25 +251,21 @@ func (w *AsyncRequestProcessWorker) runOperation(ctx context.Context, message *q

logger.Info("Start processing operation.")
result, err := asyncCtrl.Run(asyncReqCtx, asyncReq)

code := ""
if result.Error != nil {
code = result.Error.Code
// Update the result if an error is returned from the controller.
// Check that the result is empty to ensure we don't override it, it shouldn't happen.
// Controller should always either return non-empty error or non-empty result, but not both.
if err != nil && result.Error == nil {
armErr := extractError(err)
result.SetFailed(armErr, false)
}

logger.Info("Operation returned", "success", result.Error == nil, "code", code, "provisioningState", result.ProvisioningState(), "err", err)
logger.Info("Operation returned", "success", result.Error == nil, "provisioningState", result.ProvisioningState(), "err", result.Error)

// There are two cases when asyncReqCtx is canceled.
// 1. When the operation is timed out, w.completeOperation will be called in L186
// 2. When parent context is canceled or done, we need to requeue the operation to reprocess the request.
// Such cases should not call w.completeOperation.
if !errors.Is(asyncReqCtx.Err(), context.Canceled) {
if err != nil {
armErr := extractError(err)
result.SetFailed(armErr, false)
logger.Error(err, "Operation Failed")
}

w.completeOperation(ctx, message, result, asyncCtrl.StorageClient())
}
trace.SetAsyncResultStatus(result, span)
Expand Down

0 comments on commit 1fc0eaa

Please sign in to comment.