Regression: fix results with out of order tasks #7169

afrittoli · 2023-10-03T08:53:28Z

Changes

The pipeline run reconciler builds a pipeline run state on every run, which resolves task references, expands results and processes matrix fan-outs.

The current process is incremental in a single loop, where each new PipelineTask resolution depends on the state of PipelineTasks resolved before. This is problematic because tasks are not necessarily defined in the pipeline in order of execution (which is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal as possible. The smallest solution available is to implement some sorting in the list of tasks, so that the incremental state can work correctly.

This PR splits the process into two runs: one for tasks that have already been started (and possibly completed), and a second one that includes all remaining tasks. The first group of tasks does not need matrix fan-outs (they have already been processed) or result resolution, so its state can be safely built incrementally.

The second group is executed starting from the state of the first group. Any task that is a candidate for execution in this reconcile cycle must have its results resolved through the state of the first group.
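To illustrate the two-pass idea described above, here is a minimal sketch in Go. It is not the reconciler code from this PR: `PipelineTask`, `splitByStarted`, `resolveGroup` and `resolveTwoPass` are hypothetical, simplified stand-ins, and the "state" is reduced to a set of task names. It only aims to show why a single loop in definition order can fail when a consumer is defined before its producer, and why resolving already-started tasks first avoids that.

```go
package main

import "fmt"

// PipelineTask is a simplified, hypothetical stand-in for a pipeline task.
// Consumes lists the names of tasks whose results this task references.
type PipelineTask struct {
	Name     string
	Consumes []string
}

// splitByStarted partitions tasks into those that already have at least one
// run (started) and all the others, keeping definition order in each group.
func splitByStarted(tasks []PipelineTask, started map[string]bool) (ran, pending []PipelineTask) {
	for _, t := range tasks {
		if started[t.Name] {
			ran = append(ran, t)
		} else {
			pending = append(pending, t)
		}
	}
	return ran, pending
}

// resolveGroup adds tasks to state in the given order. When checkResults is
// true, it also reports every task that consumes a result from a task that is
// not in state yet at the time the consumer is processed.
func resolveGroup(state map[string]bool, tasks []PipelineTask, checkResults bool) (unresolved []string) {
	for _, t := range tasks {
		if checkResults {
			for _, dep := range t.Consumes {
				if !state[dep] {
					unresolved = append(unresolved, t.Name)
					break
				}
			}
		}
		state[t.Name] = true
	}
	return unresolved
}

// resolveTwoPass builds the state for started tasks first (no result
// resolution needed), then resolves the remaining tasks on top of it.
func resolveTwoPass(tasks []PipelineTask, started map[string]bool) []string {
	ran, pending := splitByStarted(tasks, started)
	state := map[string]bool{}
	resolveGroup(state, ran, false)
	return resolveGroup(state, pending, true)
}

func main() {
	// The consumer is defined before the producer it depends on.
	tasks := []PipelineTask{
		{Name: "consumer", Consumes: []string{"producer"}},
		{Name: "producer"},
	}
	started := map[string]bool{"producer": true}

	single := resolveGroup(map[string]bool{}, tasks, true)
	fmt.Println("single loop unresolved:", single)
	fmt.Println("two passes unresolved:", resolveTwoPass(tasks, started))
}
```

In this sketch the single loop reports `consumer` as unresolved because `producer` has not been visited yet, while the two-pass variant resolves it against the state of the first group.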

Testing with the current code arrangement is a bit challenging, as we ignore result resolution errors in the code, which is ok only in some cases:

- a result resolution failure caused by a task not being found or a result not being defined is permanent and should not be ignored
- a result resolution failure caused by a result not being available yet is (possibly) ephemeral and should not cause a failure

Currently one function checks for all these conditions and returns one error, so it's not possible to safely distinguish them. This will require some refactoring to be fixed in a follow up patch.

For now, a reconcile unit test can test the fix.

Fixes: #7103

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
Has Tests included if any functionality added or changed
Follows the commit message standard
Meets the Tekton contributor standards (including functionality, content, code)
Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

Fix regression where a different order of task definition may cause result resolution to break

/kind bug

tekton-robot · 2023-10-03T09:00:37Z

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/reconciler/pipelinerun/pipelinerun.go	92.7%	92.4%	-0.3

tekton-robot · 2023-10-03T09:01:42Z

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/reconciler/pipelinerun/pipelinerun.go	92.7%	92.4%	-0.3

tekton-robot · 2023-10-03T09:02:43Z

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/reconciler/pipelinerun/pipelinerun.go	92.7%	92.4%	-0.3

tekton-robot · 2023-10-03T12:33:09Z

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/reconciler/pipelinerun/pipelinerun.go	92.7%	92.4%	-0.3

tekton-robot · 2023-10-03T12:35:30Z

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/reconciler/pipelinerun/pipelinerun.go	92.7%	92.4%	-0.3

EmmaMunley

lgtm

tekton-robot · 2023-10-03T13:48:50Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: EmmaMunley, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [vdemeester]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Yongxuanzhang

/lgtm
Thanks!!

Yongxuanzhang · 2023-10-03T14:54:24Z

pkg/reconciler/pipelinerun/pipelinerun.go

@@ -536,7 +536,46 @@ func (c *Reconciler) reconcile(ctx context.Context, pr *v1.PipelineRun, getPipel
 	if len(pipelineSpec.Finally) > 0 {
 		tasks = append(tasks, pipelineSpec.Finally...)
 	}
-	pipelineRunState, err := c.resolvePipelineState(ctx, tasks, pipelineMeta.ObjectMeta, pr)
+
+	// We spit tasks in two lists:


Oh, nice, thanks, I will fix in a follow up

Yongxuanzhang · 2023-10-03T14:54:32Z

pkg/reconciler/pipelinerun/pipelinerun.go

+	// a PipelineTask has at least one TaskRun associated, then all its TaskRuns have been
+	// created already.
+	// The second group takes as input the partial state built in the first iteration and finally
+	// the two results are collated


nit: collected

I actually meant collated

Oh sorry I misunderstood here. 😄

afrittoli · 2023-10-04T15:33:31Z

/cherry-pick release-v0.50.x

tekton-robot · 2023-10-04T15:34:15Z

@afrittoli: new pull request created: #7173

In response to this:

/cherry-pick release-v0.50.x

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

afrittoli · 2023-10-04T15:34:15Z

/cherry-pick release-v0.52.x

tekton-robot · 2023-10-04T15:35:07Z

@afrittoli: new pull request created: #7174

In response to this:

/cherry-pick release-v0.52.x

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot requested review from lbernick and wlynch October 3, 2023 08:53

afrittoli force-pushed the 7103 branch from 648c803 to a801d03 Compare October 3, 2023 08:54

tekton-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 3, 2023

afrittoli force-pushed the 7103 branch from a801d03 to dd2135a Compare October 3, 2023 12:27

tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 3, 2023

afrittoli changed the title ~~[WIP] Regression: fix results with out of order tasks~~ Regression: fix results with out of order tasks Oct 3, 2023

tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 3, 2023

vdemeester approved these changes Oct 3, 2023

tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 3, 2023

EmmaMunley approved these changes Oct 3, 2023

Yongxuanzhang reviewed Oct 3, 2023

tekton-robot assigned Yongxuanzhang Oct 3, 2023

tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 3, 2023

tekton-robot merged commit cbabe7f into tektoncd:main Oct 3, 2023
2 checks passed

Tomcli mentioned this pull request Oct 3, 2023

Regression on parameter inputs when switching pipelinerun spec's task order #7155

Closed

afrittoli mentioned this pull request Oct 4, 2023

Typo in comment #7172

Open

afrittoli added the needs-cherry-pick Indicates a PR needs to be cherry-pick to a release branch label Oct 4, 2023

tekton-robot mentioned this pull request Oct 4, 2023

[release-v0.50.x] Regression: fix results with out of order tasks #7173

Merged

tekton-robot mentioned this pull request Oct 4, 2023

[release-v0.52.x] Regression: fix results with out of order tasks #7174

Merged
