
should we only reload the rules if it's safe to do so? #34

Open

nathwill opened this issue Nov 26, 2014 · 8 comments

Comments

@nathwill
Contributor

Seen this a few times: for whatever reason an exception is thrown that causes the Chef run to abort before all of the iptables_ng_chain or iptables_ng_rule resources have been processed, but after the notifications to ruby_block[create_rules] and ruby_block[restart_iptables] are already queued up; Chef still triggers the notified actions during the exit.

The frustrating outcome of the above is that an incomplete set of iptables rules is applied, often to the effect of a totally empty INPUT chain having its policy set to DROP, with all the joy that entails. Best case, this happens during testing and you're just unable to log in to the test instance to inspect; worst case, it happens during the initial converge of a new instance and requires destroying and recreating the entire VM, or popping onto an IPMI console if you're working with physical gear.
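
For illustration, a rough repro of the sequence (hypothetical recipe; the chain/rule syntax follows the README, and the cookbook's providers wire up the delayed notifications internally):

```ruby
# Hypothetical recipe reproducing the lockout. The failing execute stands in
# for whatever resource raises mid-run.
iptables_ng_chain 'INPUT' do
  policy 'DROP [0:0]' # default-deny; a delayed reload notification is queued
end

execute 'some-resource-that-fails' do
  command 'false' # aborts the converge here...
end

iptables_ng_rule 'ssh' do
  rule '-p tcp --dport 22 -j ACCEPT' # ...so this never lands, yet Chef still
end                                  # runs the queued reload on exit
```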

I'd be happy to send over a PR if this seems like something worth trying to avoid, but wanted to get some input on what you'd prefer to see before getting started, as there are a few options that come to mind:

  • skip create and/or restart if the run had an exception
  • skip create and/or restart if the INPUT chain on any of the managed tables has a DROP policy and no other rules, as this is guaranteed to be a lock-out situation (a rough sketch of such a check follows this list)
  • consider this Chef's problem for running delayed actions after an error and pursue a fix upstream
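
A rough sketch of the check from the second option (the /etc/iptables.d layout here is an assumption for illustration, not the cookbook's documented paths):

```ruby
# Hypothetical helper: treat "INPUT policy is DROP and no rules rendered"
# as a guaranteed lockout and skip the reload.
def lockout_risk?(table = 'filter')
  chain_dir   = ::File.join('/etc/iptables.d', table, 'INPUT')
  policy_file = ::File.join(chain_dir, 'default')
  policy      = ::File.exist?(policy_file) ? ::File.read(policy_file) : ''
  rule_files  = Dir.glob(::File.join(chain_dir, '*')) - [policy_file]
  policy.include?('DROP') && rule_files.empty?
end
```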

thoughts?

@chr4
Member

chr4 commented Nov 27, 2014

You are right, this behaviour of Chef can lead to lockouts.
The "Chef failed, but I'm still running service restarts" behaviour was introduced to Chef a while ago.
I'm not sure there's a (not too dirty) workaround for this at the cookbook level. Maybe someone from Chef (like @jtimberman) can comment on this?

@nathwill
Contributor Author

So now that Chef 12.5 is out, what do you think about hooking into the event_handler DSL when it's available? Triggering the reload on the converge_complete event would prevent the reload from happening if there were converge failures.
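
Roughly what I have in mind (sketch only; the handler body is a stand-in for the cookbook's existing reload logic):

```ruby
# Requires Chef >= 12.5. :converge_complete only fires after a successful
# run, so an aborted converge never triggers the reload.
Chef.event_handler do
  on :converge_complete do
    Chef::Log.info('converge succeeded; applying iptables rules')
    # invoke the same logic ruby_block[create_rules] and
    # ruby_block[restart_iptables] run today
  end
end
```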

I'd be happy to put a PR together for review if you're interested :)

@chr4
Member

chr4 commented Nov 16, 2015

I'm in favor of this approach, and I'd welcome a pull-request!

If possible, we should try to implement it in a way that is backwards compatible, as it will take a while until most people have upgraded to Chef 12.5. As iptables is a fairly widespread tool, I think respecting a conservative upgrade pattern is a must.

As this approach (if it works) would considerably increase the stability of this cookbook, I'm willing to consider releasing it as a new major version in case backwards compatibility is not easily achieved or turns out to be impossible.

@chr4
Member

chr4 commented Nov 28, 2015

I tested the hook in another cookbook and it works quite well. I'm not sure how to keep the code compatible, though; it won't run on Chef < 12.5.

@nathwill
Contributor Author

This is what we're doing for the systemd cookbook; it wouldn't solve the "only reload if error-free" problem for pre-12.5 Chef, but it would be identical to the current functionality, with the enhancement available on installations running Chef 12.5 and later.
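
Sketched out, something like this (illustrative; the guard and log lines are placeholders):

```ruby
# Prefer the event handler where the running Chef supports it; otherwise
# leave the existing delayed-notification behaviour untouched.
if Chef.respond_to?(:event_handler) # true on Chef >= 12.5
  Chef.event_handler do
    on :converge_complete do
      Chef::Log.info('converge succeeded; reloading iptables rules')
      # same reload logic the delayed ruby_blocks run today
    end
  end
else
  # legacy path: rule resources keep notifying ruby_block[restart_iptables],
  # identical to current behaviour (the lockout risk remains on old Chef)
  Chef::Log.debug('Chef < 12.5: using delayed-notification reload')
end
```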

There's probably some way to check for errors pre-Chef 11 too, but I'm not sure exactly how at the moment.

@chr4
Member

chr4 commented Nov 30, 2015

This looks like a decent solution! The legacy code doesn't have to be better than it already is; I just don't want to break running (legacy) systems.

Is your offer to create a pull request still on? I'd love to review it! :)

@nathwill
Contributor Author

Yeah, for sure. Sorry for the delay, it's been a bit hectic with the holidays and this slipped off my radar for a while. I've added it back to my todo list, so expect to see something in the next few days :)

@chr4
Member

chr4 commented Nov 30, 2015

Looking forward to it!
