fix(STONEBLD-2265): increase chains timeout since chains controller is overwhelmed #1091

Merged: 1 commit merged on Mar 21, 2024

dheerajodha
Member

@dheerajodha dheerajodha commented Mar 20, 2024

Description

  • Because the build-definitions PR checks run on the production cluster, the chains controller is frequently overwhelmed by the heavy load.
  • This causes tests to time out while waiting for the chains controller to attest artifacts.
  • Lately we have been hitting the worst case more often, which makes it difficult to merge PRs in the build-definitions repo.
  • This PR increases the timeout from 30 minutes to 90 minutes.
  • We should reduce this timeout as soon as the situation improves, for example by moving the checks to the staging cluster, which has comparatively less load.
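
For readers unfamiliar with this kind of wait, here is a minimal Go sketch of a poll-until-timeout loop like the one whose budget is being raised. It is not the code changed by this PR: `attestationTimeout`, `waitForAttestation`, and `timesOut` are hypothetical names, and the check function is a stand-in for "has the chains controller attested the artifact yet?".

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// attestationTimeout is a hypothetical constant name. The PR raises the
// real value from 30 minutes to 90 minutes.
const attestationTimeout = 90 * time.Minute

// waitForAttestation calls check immediately and then once per interval
// until it reports true, returns an error, or the timeout elapses.
func waitForAttestation(timeout, interval time.Duration, check func() (bool, error)) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		done, err := check()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			// Surfaces as "context deadline exceeded" once the budget is spent.
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// timesOut reports whether a wait with the given budget ends in a deadline
// error when the check never succeeds (an overwhelmed controller).
func timesOut(timeout time.Duration) bool {
	err := waitForAttestation(timeout, 5*time.Millisecond, func() (bool, error) {
		return false, nil
	})
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	// Stand-in check that succeeds on the third poll.
	polls := 0
	err := waitForAttestation(time.Second, 10*time.Millisecond, func() (bool, error) {
		polls++
		return polls >= 3, nil
	})
	fmt.Println("attested:", err == nil, "after polls:", polls)
	fmt.Println("timed out with a tiny budget:", timesOut(20*time.Millisecond))
	fmt.Println("configured timeout:", attestationTimeout)
}
```

Raising the timeout only lengthens how long the loop is willing to wait. It does not reduce the load on the controller, which is why the description treats the change as temporary.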

Issue ticket number and link

STONEBLD-2265

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added meaningful description with JIRA/GitHub issue key(if applicable), for example HASSuiteDescribe("STONE-123456789 devfile source")
  • I have updated labels (if needed)

* Because the build-definitions PR checks run on the
  production cluster, the chains controller is frequently
  overwhelmed by the heavy load.
* This causes test timeouts while waiting for the chains
  controller to attest artifacts.
* Lately we have been hitting the worst case more often,
  which makes it difficult to merge PRs in the build-def
  repo.
* This PR increases the timeout from 30 mins to 90 mins.
* We should reduce this timeout as soon as the situation
  improves; one option is to shift the checks to the
  staging cluster, which has comparatively less load.

Signed-off-by: Dheeraj<[email protected]>
@dheerajodha dheerajodha force-pushed the fix/extend-chains-timeout branch from 6b444f8 to 0fd94e0 Compare March 20, 2024 13:54
Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

dheerajodha added a commit to dheerajodha/build-definitions that referenced this pull request Mar 20, 2024
* This change increases the default timeout (60mins) of the
  PR-type PLR and the 'e2e-tests' Task to 90 mins.
* This is a temporary change, please see:
  konflux-ci/e2e-tests#1091
* Since the chains controller is overwhelmed, we had to
  increase the test timeout value to 90mins.
* So this current PR increases the timeout values for PLR
  and a Task to the same timeout value used by test, 90m.

Signed-off-by: Dheeraj<[email protected]>
dheerajodha added a commit to dheerajodha/build-definitions that referenced this pull request Mar 20, 2024
* This change increases the default timeout (60mins) of the
  PR-type PLR and the 'e2e-tests' Task to 90 mins.
* This is a temporary change, please see:
  konflux-ci/e2e-tests#1091
* Since the chains controller is overwhelmed, we had to
  increase the test timeout value to 90mins.
* So this current PR increases the timeout values for PLR
  and a Task to the same timeout value used by test, 90m.

Signed-off-by: Dheeraj<[email protected]>
@dheerajodha dheerajodha changed the title fix: increase chains timeout since chains controller is overwhelmed fix(STONEBLD-2265): increase chains timeout since chains controller is overwhelmed Mar 20, 2024
@tisutisu
Contributor

/lgtm
/approve

openshift-ci bot commented Mar 20, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MartinBasti, tisutisu


@MartinBasti
Contributor

/retest-required

openshift-ci bot commented Mar 20, 2024

🚨 Error occurred while running the E2E tests, list of failed Spec(s):

  • ➡️ [failed] [It] [jvm-build-service-suite JVM Build Service E2E tests] when the Component with s2i-java component is created a PipelineRun is triggered [jvm-build, HACBS]
Expected success, but got an error:
    <context.deadlineExceededError>: 
    context deadline exceeded
    {}

@dheerajodha: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                    | Commit  | Details | Required | Rerun command
ci/prow/redhat-appstudio-e2e | 0fd94e0 | link    | true     | /test redhat-appstudio-e2e

@psturc
Member

psturc commented Mar 20, 2024

The last CI run failed on a known issue.

I think we can force-merge this PR, wdyt @MartinBasti @dheerajodha ?

dheerajodha added a commit to dheerajodha/build-definitions that referenced this pull request Mar 21, 2024
* This change increases the default timeout (60mins) of the
  PR-type PLR and the 'e2e-tests' Task to 2hrs.
* This is a temporary change, please see:
  konflux-ci/e2e-tests#1091
* Since the chains controller is overwhelmed, we had to
  increase the test timeout value to 90mins.
* So this current PR increases the timeout values for the
  PLR and the Task to 2hrs, to give chains as well as the
  other tests enough time to finish.

Signed-off-by: Dheeraj<[email protected]>
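
The arithmetic behind these values can be sketched in Go as follows. This is an illustrative check only, not code from either repository. `headroom` is a hypothetical helper, and the durations are the ones quoted in the commit messages above (60 min default, 90 min, 2 hrs).

```go
package main

import (
	"fmt"
	"time"
)

// headroom returns how much of the outer timeout is left once the inner
// wait has used its full budget. A negative value means the outer timeout
// fires before the inner wait can finish.
func headroom(outer, inner time.Duration) time.Duration {
	return outer - inner
}

func main() {
	// Test-level wait for the chains controller after konflux-ci/e2e-tests#1091.
	chainsWait := 90 * time.Minute

	// Candidate PLR/Task timeouts: the old default, the first proposal,
	// and the value finally chosen.
	for _, plr := range []time.Duration{60 * time.Minute, 90 * time.Minute, 2 * time.Hour} {
		fmt.Printf("PLR timeout %v -> headroom %v\n", plr, headroom(plr, chainsWait))
	}
}
```

With the 60 min default, the PipelineRun would be cancelled before a 90 min chains wait could complete. A 90 min PLR timeout leaves nothing for the other tests, which is why the later commits settle on 2hrs.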
@MartinBasti
Contributor

fine by me to merge it

@psturc psturc merged commit 791b26e into konflux-ci:main Mar 21, 2024
14 of 16 checks passed
chmeliik pushed a commit to dheerajodha/build-definitions that referenced this pull request Mar 21, 2024
* This change increases the default timeout (60mins) of the
  PR-type PLR and the 'e2e-tests' Task to 2hrs.
* This is a temporary change, please see:
  konflux-ci/e2e-tests#1091
* Since the chains controller is overwhelmed, we had to
  increase the test timeout value to 90mins.
* So this current PR increases the timeout values for the
  PLR and the Task to 2hrs, to give chains as well as the
  other tests enough time to finish.

Signed-off-by: Dheeraj<[email protected]>
mmorhun pushed a commit to konflux-ci/build-definitions that referenced this pull request Mar 21, 2024
… PLR & e2e-tests Task to 2hrs (#893)

* fix(STONEBLD-2265): increase the default timeout to 90mins

* This change increases the default timeout (60mins) of the
  PR-type PLR and the 'e2e-tests' Task to 2hrs.
* This is a temporary change, please see:
  konflux-ci/e2e-tests#1091
* Since the chains controller is overwhelmed, we had to
  increase the test timeout value to 90mins.
* So this current PR increases the timeout values for the
  PLR and the Task to 2hrs, to give chains as well as the
  other tests enough time to finish.

Signed-off-by: Dheeraj<[email protected]>

* update .tekton/tasks/e2e-test.yaml

---------

Signed-off-by: Dheeraj<[email protected]>
Co-authored-by: rh-tap-build-team[bot] <127938674+rh-tap-build-team[bot]@users.noreply.github.com>
4 participants