To extend the existing wait_for states with timeout parameter #34

Open
wants to merge 1 commit into
base: master
from

Conversation

@tiwuu tiwuu commented May 31, 2018

Extend the two existing wait_for states with support for a parametric timeout. The timeout value is taken from reclass pillar data.
wait_for_deployed.sls
wait_for_ready.sls

Signed-off-by: ting wu [email protected]

@tiwuu tiwuu changed the title from "To extend the existing two wait_for states with support" to "To extend the existing wait_for states with timeout parameter" on May 31, 2018
@epcim
Member

epcim commented May 31, 2018

Thank you.

Can you please:

  • rebase once "Add kitchen tests to formula" (#33) is merged to master (will happen within 4h max)
  • update README examples
  • add timeout option to at least one test/pillar

@epcim
Member

epcim commented Jun 4, 2018

#33 was merged; please rebase. Thanks.

@tiwuu
Author

tiwuu commented Jun 4, 2018

Thanks, epcim. I will let you know when I finish.

Extend the wait_for states with a timeout parameter.
The timeout value is taken from reclass pillar data if
defined. Otherwise, the states use the default value.

wait_for_deployed.sls
wait_for_ready.sls

Signed-off-by: ting wu <[email protected]>
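
The "pillar value if defined, otherwise default" rule in the commit message can be sketched as below. This is only an illustration: the pillar path `maas -> region -> timeout -> <state>` and the function name `resolve_timeout` are assumptions, not taken from the formula.

```python
def resolve_timeout(pillar, state, default):
    """Return the timeout (seconds) for `state`, or `default` if unset.

    Assumed (hypothetical) pillar layout:
        maas: {region: {timeout: {ready: 900, deployed: 7200}}}
    """
    node = pillar
    # Walk the nested dict; any missing level means "not defined".
    for key in ("maas", "region", "timeout", state):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    # An explicit null in the pillar is also treated as "not defined".
    return default if node is None else int(node)


pillar = {"maas": {"region": {"timeout": {"deployed": 7200}}}}
deployed_timeout = resolve_timeout(pillar, "deployed", 3600)  # from pillar
ready_timeout = resolve_timeout(pillar, "ready", 900)         # falls back
```

Walking the keys one level at a time keeps the fallback safe when an intermediate key is missing or is not a mapping.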
@tiwuu
Author

tiwuu commented Jun 4, 2018

Hi, epcim
I pushed the changes, which include:

  1. rebase on top of "curtin: arm64: Fix missing newline after j2 parse" (#31) and "Add kitchen tests to formula" (#33)
  2. modify README
  3. modify tests/pillar/maas_region.sls to add the timeout parameter

alexandruavadanii added a commit to alexandruavadanii/salt-formula-maas that referenced this pull request Sep 23, 2018
Extend the wait_for states with a timeout parameter.
The timeout value is taken from reclass pillar data if
defined. Otherwise, the states use the default value.

Based on Ting's PR [1], slightly refactored.

[1] salt-formulas#34

Signed-off-by: ting wu <[email protected]>
Signed-off-by: Alexandru Avadanii <[email protected]>
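
For readers unfamiliar with the pattern, a wait state with a timeout boils down to polling until a deadline. The sketch below is a generic illustration, not the code from maas.py; `wait_for_status`, `get_status` and `poll_interval` are hypothetical names.

```python
import time


def wait_for_status(get_status, req_status, timeout, poll_interval=5.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll `get_status()` until it returns `req_status` or `timeout` expires.

    Returns True on success and False on timeout. `clock` and `sleep`
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if get_status() == req_status:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)


# Example: the status source reports 'Ready' on the third poll.
statuses = iter(["Commissioning", "Commissioning", "Ready"])
reached = wait_for_status(lambda: next(statuses), "Ready",
                          timeout=10, poll_interval=0)
```

Using a monotonic clock avoids false timeouts when the wall clock is adjusted during a long deployment.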
alexandruavadanii added a commit to alexandruavadanii/salt-formula-maas that referenced this pull request Nov 8, 2018
1. maas.py: Extend wait_for states with timeout param

Extend the wait_for states with a timeout parameter.
The timeout value is taken from reclass pillar data if
defined. Otherwise, the states use the default value.
Based on Ting's PR [1], slightly refactored.

2. maas.py: wait_for_*: Add attempts arg

Introduce a new parameter that allows a maximum number of automatic
recovery attempts for the common failures w/ machine operations.
If not present in pillar data, it defaults to 0 (OFF).

Common error states, possible cause and automatic recovery pattern:
* New
  - usually indicates issues with BMC connectivity (no network route),
    but on rare occasions it happens due to the MaaS API being flaky;
  - fix: delete the machine, (re)process machine definitions;
* Failed commissioning
  - various causes, usually a simple retry works;
  - fix: delete the machine, (re)process machine definitions;
* Failed testing
  - incompatible hardware, missing drivers etc.
  - usually consistent and board-specific;
  - fix: override failed testing
* Allocated
  - on rare occasions nodes get stuck in this state instead of 'Deploy';
  - fix: mark-broken, mark-fixed; if it failed at least once before,
    perform a fio test (fixes another unrelated spurious issue with
    encrypted disks from previous deployments); (re)deploy machines;
* Failed deployment
  - various causes, usually a simple retry works;
  - fix: same as for nodes stuck in 'Allocated';

[1] salt-formulas#34

Change-Id: Ifb7dd9f8fcfbbed557e47d8fdffb1f963604fb15
Signed-off-by: ting wu <[email protected]>
Signed-off-by: Alexandru Avadanii <[email protected]>
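
The recovery patterns listed in the commit message above can be read as a lookup table keyed by machine status, gated by the `attempts` limit. The sketch below is one possible reading under stated assumptions: `RECOVERY_ACTIONS`, `plan_recovery` and the action labels are hypothetical, and the labels are plain descriptions rather than MaaS API or CLI calls.

```python
# Status -> ordered recovery steps, transcribed from the commit message.
_REPROCESS = ["delete machine", "reprocess machine definitions"]
_REDEPLOY = ["mark-broken", "mark-fixed", "redeploy machines"]

RECOVERY_ACTIONS = {
    "New": _REPROCESS,
    "Failed commissioning": _REPROCESS,
    "Failed testing": ["override failed testing"],
    "Allocated": _REDEPLOY,
    "Failed deployment": _REDEPLOY,  # same fix as for 'Allocated'
}


def plan_recovery(status, attempts_used, max_attempts=0, failed_before=False):
    """Return the recovery steps for `status`, or [] if none apply.

    `max_attempts` defaults to 0, meaning automatic recovery is OFF.
    """
    if max_attempts <= 0 or attempts_used >= max_attempts:
        return []
    actions = list(RECOVERY_ACTIONS.get(status, []))
    if failed_before and status in ("Allocated", "Failed deployment"):
        # The fio test runs after mark-fixed and before the redeploy.
        actions.insert(2, "fio test")
    return actions


steps = plan_recovery("Allocated", attempts_used=1, max_attempts=2,
                      failed_before=True)
```

Statuses that are not in the table yield an empty plan, so healthy or unknown states are left untouched.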
alexandruavadanii added a commit to alexandruavadanii/salt-formula-maas that referenced this pull request Dec 13, 2018
1. maas.py: Extend wait_for states with timeout param

Extend the wait_for states with a timeout parameter.
The timeout value is taken from reclass pillar data if
defined. Otherwise, the states use the default value.
Based on Ting's PR [1], slightly refactored.

2. maas.py: wait_for_*: Add attempts arg

Introduce a new parameter that allows a maximum number of automatic
recovery attempts for the common failures w/ machine operations.
If not present in pillar data, it defaults to 0 (OFF).

Common error states, possible cause and automatic recovery pattern:
* New
  - usually indicates issues with BMC connectivity (no network route),
    but on rare occasions it happens due to the MaaS API being flaky;
  - fix: delete the machine, (re)process machine definitions;
* Failed commissioning
  - various causes, usually a simple retry works;
  - fix: delete the machine, (re)process machine definitions;
* Failed testing
  - incompatible hardware, missing drivers etc.
  - usually consistent and board-specific;
  - fix: override failed testing
* Allocated
  - on rare occasions nodes get stuck in this state instead of 'Deploy';
  - fix: mark-broken, mark-fixed; if it failed at least once before,
    perform a fio test (fixes another unrelated spurious issue with
    encrypted disks from previous deployments); (re)deploy machines;
* Failed deployment
  - various causes, usually a simple retry works;
  - fix: same as for nodes stuck in 'Allocated';

[1] salt-formulas#34

Change-Id: Ifb7dd9f8fcfbbed557e47d8fdffb1f963604fb15
Signed-off-by: ting wu <[email protected]>
Signed-off-by: Alexandru Avadanii <[email protected]>
alexandruavadanii added a commit to alexandruavadanii/salt-formula-maas that referenced this pull request Dec 14, 2018
1. maas.py: Extend wait_for states with timeout param

Extend the wait_for states with a timeout parameter.
The timeout value is taken from reclass pillar data if
defined. Otherwise, the states use the default value.
Based on Ting's PR [1], slightly refactored.

2. maas.py: Extend `req_status` support to multiple values

Previously, req_status could be one of the MaaS status strings, e.g.
'Ready'. Extend matching to '|'-separated statuses (e.g.
'Ready|Deployed') to allow idempotency in MaaS machine commissioning
and deployment cycles.

Also provide a `maas.machines.wait_for_ready_or_deployed` sls.

3. maas.py: wait_for_*: Add attempts arg

Introduce a new parameter that allows a maximum number of automatic
recovery attempts for the common failures w/ machine operations.
If not present in pillar data, it defaults to 0 (OFF).

Common error states, possible cause and automatic recovery pattern:
* New
  - usually indicates issues with BMC connectivity (no network route),
    but on rare occasions it happens due to the MaaS API being flaky;
  - fix: delete the machine, (re)process machine definitions;
* Failed commissioning
  - various causes, usually a simple retry works;
  - fix: delete the machine, (re)process machine definitions;
* Failed testing
  - incompatible hardware, missing drivers etc.
  - usually consistent and board-specific;
  - fix: override failed testing
* Allocated
  - on rare occasions nodes get stuck in this state instead of 'Deploy';
  - fix: mark-broken, mark-fixed; if it failed at least once before,
    perform a fio test (fixes another unrelated spurious issue with
    encrypted disks from previous deployments); (re)deploy machines;
* Failed deployment
  - various causes, usually a simple retry works;
  - fix: same as for nodes stuck in 'Allocated';

[1] salt-formulas#34

Change-Id: Ifb7dd9f8fcfbbed557e47d8fdffb1f963604fb15
Signed-off-by: ting wu <[email protected]>
Signed-off-by: Alexandru Avadanii <[email protected]>
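
The '|'-separated `req_status` matching described in item 2 of the commit message above could look roughly like this. It is a sketch only: `status_matches` is a hypothetical helper, and exact (not substring) comparison is an assumption about the intended behavior.

```python
def status_matches(status, req_status):
    """Return True if `status` equals any '|'-separated entry of `req_status`.

    A plain single value such as 'Ready' keeps working unchanged.
    """
    accepted = {part.strip() for part in req_status.split("|") if part.strip()}
    return status in accepted


single = status_matches("Ready", "Ready")
either = status_matches("Deployed", "Ready|Deployed")
in_progress = status_matches("Deploying", "Ready|Deployed")
```

Accepting either 'Ready' or 'Deployed' is what makes a re-run idempotent: a machine that already went through the cycle is treated as done instead of timing out.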
3 participants