-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TEP-0050] Implement PipelineTask OnError #7422
[TEP-0050] Implement PipelineTask OnError #7422
Conversation
Skipping CI for Draft Pull Request. |
/test all |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
0a34d79
to
98fd1ec
Compare
/retest |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
/assign |
98fd1ec
to
df36337
Compare
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
test/ignore_task_error_test.go
Outdated
steps: | ||
- name: failing-step | ||
image: busybox | ||
script: exit 1; echo -n 123 | tee $(results.result1.path)' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we are missing one quote ' here.
I'm not sure why this doesn't affect the test being run.
Curious how it went through in CI, coz when I create this PipelineRun locally this'd fail runtime validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing quote added. Interestingly, I tried the same yamlfile and test locally. I verified the resources and test are created with no problem 🤔
updateCompletedTaskRunStatus(logger, trs, pod) | ||
onError, ok := tr.Annotations[v1.PipelineTaskOnErrorAnnotation] | ||
if ok { | ||
updateCompletedTaskRunStatus(logger, trs, pod, v1.PipelineTaskOnErrorType(onError)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you help check my understanding here:
I saw this func called at taskrun reconciler, but what if the TaskRun fails at earlier stage i.e. pvc creation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JeromeJu. This is a good questions.
TEP-0050 is designed with requirement that the task would be considered "successful" for the purposes of determining the status of the pipeline
(i.e. the TaskRun
itself should still be failed, but with reason FailureIgnored
). Looking at the usecases in TEP-0050, it is mainly used to ignore pod error itself.
I think your question is that for the validation/reconciler errors (like the pvc creation
), should we also mark the reason as FailureIgnored
?
I think it might not be a good idea make the validation/reconciler errors with reason FailureIgnored
:
- The validation/reconciler errors is likely from the Pipeline/Task Author, it shouldn't be ignored, but should be raised and fixed,
FailureIgnored
error reason could be confusing. - the error reason
FailureIgnored
becomes too generic if we use it across the TaskRun reconciler code. Instead, I'd prefer to keep the scope ofFailureIgnored
reason to the pod execution time.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @QuanZhang-William for the offline conversation and completing the 2nd half of my question!
And yes, I do agree that in actual use cases, only the container(Step) failures in the pod, which indicates users' own test failures, shall be ignored but not the "preparation" of those steps i.e. pvc creation and validations. Thanks for the explanation here.
So I think the current implementation is the way to go. One small ask is maybe we would can add this explanation/conversation somewhere in the docString or maybe link to the TEP050 for this design choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense as long as we document that FailedIgnored
is ignoring a failure when the actual task execution fails but not otherwise (validation failure, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SG. I have added the docstring. PTAL
cc @tektoncd/core-collaborators @tektoncd/core-maintainers @jerop @pritidesai . Please take a look if you are interested! |
Overall, this looks good to me! Thanks @QuanZhang-William Once you address the remaining open comments, I can approve. |
df36337
to
34ef7fd
Compare
Thanks @dibyom. I've addressed the comments. PTAL! |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
Part of [tektoncd#7165][tektoncd#7165]. In [TEP-0050][tep-0050], we proposed to add an `OnError` API field under `PipelineTask` to configure error handling strategy. This commits leverages `PipelineTask.OnError` API field introduced in the previous PR, implement the error handling strategy, update related docs and tests. /kind feature [tep-0050]: https://github.com/tektoncd/community/blob/main/teps/0050-ignore-task-failures.md [tektoncd#7165]: tektoncd#7165
34ef7fd
to
a4f7be7
Compare
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @QuanZhang-William
One nit on release note, might be helpful to also include how users could use with the OnError feature i.e. "users can now set OnError to 'continue' to ignore failure from previous task".
@@ -278,11 +280,11 @@ func (state PipelineRunState) getNextTasks(candidateTasks sets.String) []*Resolv | |||
} | |||
|
|||
// IsStopping returns true if the PipelineRun won't be scheduling any new Task because | |||
// at least one task already failed or was cancelled in the specified dag | |||
// at least one task already failed (with onError: stopAndFail) or was cancelled in the specified dag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "stopAndFail" is the default here this sounds like users have to set OnError?
Will leave this up to your decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any other feedback on this before merge? cc @vdemeester @jerop @pritidesai @afrittoli
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dibyom, JeromeJu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @dibyom. This PR has been opened for more than a week. Was wondering if we can merge? 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Changes
Part of #7165. In TEP-0050, we proposed to add an
OnError
API field underPipelineTask
to configure error handling strategy.This commits leverages
PipelineTask.OnError
API field introduced in the previous PR, implement the error handling strategy, update related docs and tests./kind feature
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
/kind <type>
. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tepRelease Notes