
should we only reload the rules if it's safe to do so? #34

Open

nathwill opened this issue Nov 26, 2014 · 8 comments

Comments

@nathwill
Contributor

Seen this a few times: for whatever reason an exception is thrown that causes the Chef run to abort before all of the iptables_ng_chain or iptables_ng_rule resources have been processed, but after the notifications to ruby_block[create_rules] and ruby_block[restart_iptables] are already queued up; Chef still triggers the notified actions during the exit.

The frustrating outcome of the above is that an incomplete set of iptables rules is applied, often to the effect of a totally empty INPUT chain having its policy set to DROP, with all the joy that entails. Best case, this happens during testing and you're just unable to log in to the test instance to inspect; worst case, it happens during the initial converge of a new instance and requires destroying and recreating the entire VM, or popping onto an IPMI console if you're working with physical gear.
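
For illustration, a rough repro of the sequence (hypothetical recipe; the chain/rule syntax follows the README, and the cookbook's providers wire up the delayed notifications internally):

```ruby
# Hypothetical recipe reproducing the lockout. The failing execute stands in
# for whatever resource raises mid-run.
iptables_ng_chain 'INPUT' do
  policy 'DROP [0:0]' # default-deny; a delayed reload notification is queued
end

execute 'some-resource-that-fails' do
  command 'false' # aborts the converge here...
end

iptables_ng_rule 'ssh' do
  rule '-p tcp --dport 22 -j ACCEPT' # ...so this never lands, yet Chef still
end                                  # runs the queued reload on exit
```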

I'd be happy to send over a PR if this seems like something worth trying to avoid, but wanted to get some input on what you'd prefer to see before getting started, as there are a few options that come to mind:

  • skip create and/or restart if the run had an exception
  • skip create and/or restart if the INPUT chain on any of the managed tables has a DROP policy and no other rules, as this is guaranteed to be a lock-out situation (a rough sketch of such a check follows this list)
  • consider this Chef's problem for running delayed actions after an error and pursue a fix upstream
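
A rough sketch of the check from the second option (the /etc/iptables.d layout here is an assumption for illustration, not the cookbook's documented paths):

```ruby
# Hypothetical helper: treat "INPUT policy is DROP and no rules rendered"
# as a guaranteed lockout and skip the reload.
def lockout_risk?(table = 'filter')
  chain_dir   = ::File.join('/etc/iptables.d', table, 'INPUT')
  policy_file = ::File.join(chain_dir, 'default')
  policy      = ::File.exist?(policy_file) ? ::File.read(policy_file) : ''
  rule_files  = Dir.glob(::File.join(chain_dir, '*')) - [policy_file]
  policy.include?('DROP') && rule_files.empty?
end
```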

thoughts?

@chr4
Member

chr4 commented Nov 27, 2014

You are right, this behaviour of Chef can lead to lockouts.
The "Chef failed, but I'm still running service restarts" behaviour was introduced to Chef a while ago.
I'm not sure there's a (not too dirty) workaround for this at the cookbook level. Maybe someone from Chef (like @jtimberman) can comment on this?

@nathwill
Contributor Author

So now that Chef 12.5 is out, what do you think about hooking into the event_handler DSL when it's available? Triggering the reload on the converge_complete event would prevent the reload from happening if there were converge failures.
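
Roughly what I have in mind (sketch only; the handler body is a stand-in for the cookbook's existing reload logic):

```ruby
# Requires Chef >= 12.5. :converge_complete only fires after a successful
# run, so an aborted converge never triggers the reload.
Chef.event_handler do
  on :converge_complete do
    Chef::Log.info('converge succeeded; applying iptables rules')
    # invoke the same logic ruby_block[create_rules] and
    # ruby_block[restart_iptables] run today
  end
end
```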

I'd be happy to put a PR together for review if you're interested :)

@chr4
Member

chr4 commented Nov 16, 2015

I'm in favor of this approach, and I'd welcome a pull-request!

If possible, we should try to implement it in a way that is backwards compatible, as it will take a while until most people have upgraded to Chef 12.5. As iptables is a fairly widespread tool, I think respecting a conservative upgrade pattern is a must.

As this approach (if it works) would considerably increase the stability of this cookbook, I'm willing to consider releasing it as a new major version in case backwards compatibility is not easily achieved or turns out to be impossible.

@chr4
Member

chr4 commented Nov 28, 2015

I tested the hook in another cookbook and it works quite well. I'm not sure how to keep the code compatible, though; it won't run on Chef < 12.5.

@nathwill
Contributor Author

This is what we're doing for the systemd cookbook; it wouldn't solve the "only reload if error-free" problem for pre-12.5 Chef, but it would be identical to the current functionality, with the enhancement available on installations running Chef 12.5 and later.
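
Sketched out, something like this (illustrative; the guard and log lines are placeholders):

```ruby
# Prefer the event handler where the running Chef supports it; otherwise
# leave the existing delayed-notification behaviour untouched.
if Chef.respond_to?(:event_handler) # true on Chef >= 12.5
  Chef.event_handler do
    on :converge_complete do
      Chef::Log.info('converge succeeded; reloading iptables rules')
      # same reload logic the delayed ruby_blocks run today
    end
  end
else
  # legacy path: rule resources keep notifying ruby_block[restart_iptables],
  # identical to current behaviour (the lockout risk remains on old Chef)
  Chef::Log.debug('Chef < 12.5: using delayed-notification reload')
end
```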

There's probably some way to check for errors pre-Chef 11 too, but I'm not sure exactly how at the moment.

@chr4
Member

chr4 commented Nov 30, 2015

This looks like a decent solution! The legacy code doesn't have to be better than it already is; I just don't want to break running (legacy) systems.

Is your offer to create a pull request still on? I'd love to review it! :)

@nathwill
Contributor Author

Yeah, for sure. Sorry for the delay, it's been a bit hectic with the holidays and this slipped off my radar for a while. I've added it back to my todo list, so expect to see something in the next few days :)

@chr4
Member

chr4 commented Nov 30, 2015

Looking forward to it!
