Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

task results not templated if task order is reversed #7103

Closed
gerrnot opened this issue Sep 12, 2023 · 4 comments · Fixed by #7169
Closed

task results not templated if task order is reversed #7103

gerrnot opened this issue Sep 12, 2023 · 4 comments · Fixed by #7169
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. regression Indicates an issue or a PR is associated to the regression in the project

Comments

@gerrnot
Copy link

gerrnot commented Sep 12, 2023

Expected Behavior

In a kind: "Pipeline", where you define the tasks (list), the order should not matter because the order in tekton is normally defined via the runAfter attribute or in case of a resource dependencies via a syntax like $(tasks.<task-name>.results.<result-name>).

Actual Behavior

However, it can be demonstrated, in the latest release of tekton pipelines (v0.51.0) that the ordering in the task list matters.
That is, tasks might behave in very awkward ways, depending on how they treat receiving an actual result string vs an untemplated string like $(tasks.<task-name>.results.<result-name>).

Steps to Reproduce the Problem

Unpack the attached tar.gz (contains only yaml files)

  1. k apply -f task-exe.yaml
  2. k apply -f pipeline-order-wrong.yaml
  3. k create -f pipelinerun.yaml
  4. The resulting pipelinerun will fail, reproducing the error.

Additional Info

  • Kubernetes version:

    Output of kubectl version:

Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.8", GitCommit:"395f0a2fdc940aeb9ab88849e8fa4321decbf6e1", GitTreeState:"clean", BuildDate:"2023-08-24T00:43:07Z", GoVersion:"go1.20.7", Compiler:"gc", Platform:"linux/amd64"}
  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

v0.51.0

cd-pipeline-dev-reduced.tar.gz

@gerrnot gerrnot added the kind/bug Categorizes issue or PR as related to a bug. label Sep 12, 2023
@adelmoradian
Copy link
Contributor

I also bumped into this issue after upgrading to v0.51.0

Here is an example that showcases this issue:

This following example runs task3 after task1 and task2 however the output is

$(tasks.task1.results.foo)
$(tasks.task2.results.bar)
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: pr-echo-
spec:
  pipelineSpec:
    tasks:
      - name: task1
        taskSpec:
          results:
            - name: foo
          steps:
            - name: step1
              env:
                - name: RESULT_PATH
                  value: $(results.foo.path)
              image: ubuntu
              script: |
                #!/usr/bin/env bash
                echo "foo-value" > "$RESULT_PATH"

      - name: task3
        runAfter:
          - task1
          - task2
        params:
          - name: foo-value
            value: $(tasks.task1.results.foo)
          - name: bar-value
            value: $(tasks.task2.results.bar)
        taskSpec:
          steps:
            - name: step1
              env:
                - name: FOO_VALUE
                  value: $(params.foo-value)
                - name: BAR_VALUE
                  value: $(params.bar-value)
              image: ubuntu
              script: |
                #!/usr/bin/env bash
                echo "$FOO_VALUE"
                echo "$BAR_VALUE"

      - name: task2
        taskSpec:
          results:
            - name: bar
          steps:
            - name: step1
              env:
                - name: RESULT_PATH
                  value: $(results.bar.path)
              image: ubuntu
              script: |
                #!/usr/bin/env bash
                echo "bar-value" > "$RESULT_PATH"

If i put task3 after task2 however, the output is correct

foo-value

bar-value

@Tomcli
Copy link
Contributor

Tomcli commented Sep 27, 2023

This issue also happened in the latest LTS version 0.50.1 but it was working fine in the previous LTS version 0.47.4. I also have an example to reproduce this issue at #7155.

@afrittoli afrittoli added the regression Indicates an issue or a PR is associated to the regression in the project label Sep 29, 2023
@afrittoli
Copy link
Member

Thanks @Tomcli @adelmoradian and @gerrnot for reporting this. I verified that the issue was introduced in v0.50.0 - in v0.49.0 it still worked fine. I will now try and track down the exact source.

@afrittoli afrittoli self-assigned this Sep 29, 2023
@afrittoli afrittoli added this to the Pipelines v0.53 LTS milestone Sep 29, 2023
@afrittoli
Copy link
Member

I believe this was introduced in #6792.

What happens in the logic is today:

  • the pipeline run controller builds the status of the pipeline run in resolvePipelineState
  • the function loops over all pipeline tasks and incrementally builds a state, using ResolvePipelineTask
  • ResolvePipelineTask relies on the state to resolve result references. The state it's passed though is partial and it contains only tasks processed in the loop that far. As long as tasks are defined in the order of execution this ~works

This need to be changed so that a state without result resolution is built first, and result references are resolved then.
Unfortunately it's not possible to simply revert the original PR, as that would remove a feature, so the fix to this regression will be an additive.

afrittoli added a commit to afrittoli/pipeline that referenced this issue Oct 3, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test (TBD) may be able to test the fix.

Fixes: tektoncd#7103

Signed-off-by: Andrea Frittoli <[email protected]>
afrittoli added a commit to afrittoli/pipeline that referenced this issue Oct 3, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test (TBD) may be able to test the fix.

Fixes: tektoncd#7103

Signed-off-by: Andrea Frittoli <[email protected]>
afrittoli added a commit to afrittoli/pipeline that referenced this issue Oct 3, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test can test the fix.

Fixes: tektoncd#7103

Signed-off-by: Andrea Frittoli <[email protected]>
tekton-robot pushed a commit that referenced this issue Oct 3, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test can test the fix.

Fixes: #7103

Signed-off-by: Andrea Frittoli <[email protected]>
tekton-robot pushed a commit to tekton-robot/pipeline that referenced this issue Oct 4, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test can test the fix.

Fixes: tektoncd#7103

Signed-off-by: Andrea Frittoli <[email protected]>
tekton-robot pushed a commit to tekton-robot/pipeline that referenced this issue Oct 4, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test can test the fix.

Fixes: tektoncd#7103

Signed-off-by: Andrea Frittoli <[email protected]>
tekton-robot pushed a commit that referenced this issue Oct 4, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test can test the fix.

Fixes: #7103

Signed-off-by: Andrea Frittoli <[email protected]>
tekton-robot pushed a commit that referenced this issue Oct 5, 2023
The pipeline run reconciler builds a pipeline run state on every
run, which resolves task references, expands result and processes
matrix fan outs.

The current process is incremental in a single loop, where each
new PipelineTask resolution depends on the state of PipelineTasks
resolved before. This is problematic because tasks are not
necessarily defined in the pipeline in order of execution (which
is undefined, given that pipelines are DAGs).

Since this PR is a fix to a regression, it aims to be as minimal
as possible. The smallest solution available is to implement some
sorting in the list of tasks, so that the incremental state
can work correctly.

This PR splits the process into two runs, one for tasks that have
been already started (and possibly completed), and a second one
that includes all remaining tasks. The first group of task does
not need matrix fan outs (they have already been processed) or
result resolution, so its state can be safely build incrementally.

The second group is executed starting from the state of the second
group. Any task that is a candidate for execution in this this
reconcile cycle must have its results resolved through the state
of the first group.

Testing with the current code arrangement is a bit challenging,
as we ignore result resolution errors in the code, which is ok
only in some cases:
- result resolution due to task not found or result not defined
  is permanent and should not be ignored
- result resolution due to a result not being available yet is
  ephemeral (possibly) and should not cause a failure

Currently one function checks for all these conditions and
returns one error, so it's not possible to safely distinguish
them. This will require some refactoring to be fixed in a follow
up patch.

For now, a reconcile unit test can test the fix.

Fixes: #7103

Signed-off-by: Andrea Frittoli <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. regression Indicates an issue or a PR is associated to the regression in the project
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants