Add support for zones #32

Open
Copis opened this issue May 12, 2020 · 10 comments
Labels
enhancement New feature or request

Comments

@Copis

Copis commented May 12, 2020

Is your feature request related to a problem? Please describe.
We have a master zone and some satellite zones behind a VPN or firewall. In those cases the master cannot receive traps.

Describe the solution you'd like
It would be great to be able to receive these SNMP traps on one satellite endpoint and send the status to the master.

Describe alternatives you've considered
Forwarding SNMP traps from the satellite to the master.

@patrickpr patrickpr self-assigned this May 13, 2020
@patrickpr patrickpr added the enhancement New feature or request label May 13, 2020
@patrickpr
Owner

Hi,

It's a good feature; I will start working on it for the next version.

Are you able to test this? (My lab environment does not include a master/satellite setup.)

@robdevops
Contributor

I am about to do a multi-zone build and will be able to test this in the coming days/weeks.

@Copis
Author

Copis commented Jun 9, 2020

I can test this scenario in my development environment with one master and one satellite, but I think it would be better to test in an HA environment with two masters and two satellites, if possible.

@patrickpr
Owner

Update: I'm currently building the test environment for this.

@patrickpr
Owner

@Copis: the satellite architecture is a work in progress.

Test environment: two masters in HA and two satellites in HA.

Traps can be received by:

  • the master (if there is an HA master pair, using a VRRP (keepalived) IP)
  • a satellite (if there is an HA satellite pair, using VRRP too).

The satellite receives and processes traps using the configuration provided by the masters, and:

  • updates the database using a simple API provided by the trapdirector module on the masters.
  • sends passive service check results to the satellites (or to the master, this isn't decided yet).

For now, there is no zone for trap rules: they are global.

I assume:

  1. Satellites can have access to the master (and HA master) on:
  • the Icinga API port (5665 by default)
  • the Icingaweb2 HTTP port (443)
    (satellites will use a specific Icingaweb2 user)

  2. Master and HA master both have access to the trapdirector database.

  3. Latency between master(s) and satellite(s) is low (< 500 ms).

I'm open to comments and suggestions!
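
A minimal sketch of what the satellite-to-master path could look like for the passive check results, assuming the Icinga 2 API on port 5665 as above and its process-check-result action; the API URL, credentials, CA file, host and service names are placeholders, and the trapdirector database API on the masters is not shown:

```python
# Hypothetical sketch: a satellite-side trap handler pushing a passive check
# result to an Icinga 2 API endpoint (port 5665, as assumed above).
# URL, credentials, CA file and the service name are placeholders.
import requests

def send_check_result(api_url, host, service, exit_status, output,
                      auth=("trapdirector-api", "secret"),
                      ca_file="/etc/trapdirector/master-ca.pem"):
    body = {
        "type": "Service",
        "filter": 'host.name=="{}" && service.name=="{}"'.format(host, service),
        "exit_status": exit_status,   # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
        "plugin_output": output,
    }
    try:
        r = requests.post(
            api_url + "/v1/actions/process-check-result",
            json=body,
            auth=auth,
            headers={"Accept": "application/json"},
            verify=ca_file,
            timeout=2,                # latency assumed to be low (< 500 ms)
        )
        return r.ok
    except requests.RequestException:
        return False
```

For example, send_check_result("https://master1.example.com:5665", "switch01", "snmp-traps", 2, "linkDown received") would turn a received trap into a CRITICAL state on a (hypothetical) snmp-traps service.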

@Copis
Author

Copis commented Sep 1, 2020

One of the problems that I see is that some scenarios cannot have VRRP, for example active-passive or active-active data centers (CPD) with no extended VLANs. In those cases there is no possible implementation.

@patrickpr patrickpr mentioned this issue Sep 1, 2020
@patrickpr
Owner

Opened a topic here to talk about it : https://community.icinga.com/t/trapdirector-ha-feature/5439

@p4k8
Contributor

p4k8 commented Sep 3, 2020

So here are some thoughts about it:

  1. As long as all instances of trapdirector talk to the same DB, it shouldn't matter how many there are.
  2. Traps can be forwarded from any node where they can be received to any snmptrapd on a trapdirector node. This enables chaining them through firewalls to the nodes where they can be processed properly.
  3. When trapdirector processes a trap, it sends the result to the API of a satellite/master. Why not both, in a configurable order? So if you send the result to a satellite and you don't like the return, or it's unreachable, you resend it to the master or another satellite.
  4. In this scenario you'd have to worry about deduplication of traps if you choose to do HA by trying to send traps to all existing trapdirector instances which don't know about each other but share the DB. Maybe there's even some cheap way to discard duplicates which is better than a DB lookup over the last 5 seconds' worth of traps to see if it was already processed by fellow trapdirectors (a sketch of such a check follows below).
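
On point 4, a minimal sketch of the kind of cheap duplicate check it hints at, assuming instances that already share the DB can also share a small digest table, so the check becomes one indexed insert rather than a scan of the last few seconds of traps; the table name, key fields and 5-second window are assumptions, and sqlite3 only stands in for the real database:

```python
# Hypothetical sketch: cross-instance duplicate detection through the shared
# trapdirector DB. A digest-keyed table turns "was this trap already
# processed?" into a single indexed insert. Names and the window are
# assumptions; sqlite3 stands in for the real database.
import hashlib
import sqlite3
import time

def setup(conn):
    conn.execute("CREATE TABLE IF NOT EXISTS trap_digests ("
                 "digest TEXT PRIMARY KEY, seen_at INTEGER)")

def is_duplicate(conn, source_ip, oid, varbinds, window_seconds=5):
    digest = hashlib.sha1(
        "{}|{}|{}".format(source_ip, oid, varbinds).encode()
    ).hexdigest()
    now = int(time.time())
    # forget digests that fell out of the dedup window
    conn.execute("DELETE FROM trap_digests WHERE seen_at < ?",
                 (now - window_seconds,))
    try:
        # the PRIMARY KEY makes this insert fail when another instance
        # (or this one) already recorded the same trap inside the window
        conn.execute("INSERT INTO trap_digests (digest, seen_at) VALUES (?, ?)",
                     (digest, now))
        conn.commit()
        return False
    except sqlite3.IntegrityError:
        conn.rollback()
        return True
```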

@patrickpr
Owner

1. As long as all instances of trapdirector talk to the same DB, it shouldn't matter how many there are.

Correct, but a DB connection may be impossible from distant sites.

2. Traps can be forwarded from any node where they _can_ be received to any snmptrapd on a trapdirector node. This enables chaining them through firewalls to the nodes where they can be processed properly.

Some kind of trap routing? Not very easy to implement!

3. When trapdirector processes a trap, it sends the result to the API of a satellite/master. Why not both, in a configurable order? So if you send the result to a satellite and you don't like the return, or it's unreachable, you resend it to the master or another satellite.

Yes: satellite then master, or master only (maybe set this by zones?); a sketch of that ordering follows below.

4. In this scenario you'd have to worry about deduplication of traps if you choose to do HA by trying to send traps to all existing trapdirector instances which don't know about each other but share the DB. Maybe there's even some cheap way to discard duplicates which is better than a DB lookup over the last 5 seconds' worth of traps to see if it was already processed by fellow trapdirectors.

There is a special 'waiting' status in the DB that was implemented for this kind of thing.
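
A minimal sketch of the "satellite then master, or master only" ordering, assuming each zone simply carries an ordered list of result endpoints; zone names and URLs are placeholders, and send_result is any callable such as the send_check_result sketch earlier in the thread:

```python
# Hypothetical sketch: per-zone ordered list of result endpoints, tried in
# sequence until one accepts the check result. Zone names and URLs are
# placeholders.
SEND_ORDER = {
    "branch-office": [                          # satellite zone: satellites first
        "https://satellite1.example.com:5665",
        "https://satellite2.example.com:5665",
        "https://master1.example.com:5665",     # then fall back to the masters
        "https://master2.example.com:5665",
    ],
    "master": [                                 # master zone: masters only
        "https://master1.example.com:5665",
        "https://master2.example.com:5665",
    ],
}

def deliver_result(zone, send_result):
    """Try each endpoint configured for the zone in order; stop at the first success."""
    for endpoint in SEND_ORDER.get(zone, SEND_ORDER["master"]):
        if send_result(endpoint):
            return endpoint
    return None
```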

@p4k8
Contributor

p4k8 commented Sep 4, 2020

A DB connection may be impossible from distant sites

So that's why it might be a sound idea not to run any trapdirectors on distant sites. Like:
DB <--> trapdirector <-- snmptrapd on trapdirector host <-- firewalls/networks/whatever <-- snmptrapd with forward directive on remote site
"HA" in this part is achieved by forwarding traps from the remote host to several trapdirector destinations simultaneously, and then each of the trapdirectors would have a list of API endpoints to send the check result to.
So that would mean getting a trap at least once, and at most as many times as there are snmptrapd forward destinations. That's solved by deduplicating, I guess.

Some kind of trap routing

More like just adding forward default <address> to snmptrapd.conf, pointing at the snmptrapd on the proper trapdirector node.
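
For illustration, a remote-site snmptrapd.conf along those lines might look like the sketch below; the community string and addresses are placeholders, and the two forward lines give the "several destinations" fan-out mentioned above:

```
# /etc/snmp/snmptrapd.conf on the remote site (sketch; addresses are placeholders)
authCommunity log,execute,net public
# relay every received trap to the snmptrapd on each trapdirector node
forward default udp:198.51.100.10:162
forward default udp:198.51.100.11:162
```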

maybe set this by zones

Not sure if it actually has to be zone-aware to work properly as long as the endpoint addresses are listed in the correct order.
