check_logfiles on /var/log/messages resulting in frequent socket timeouts #59

ChristopherP1221 · 2020-10-19T19:01:48Z

Hello,

I'm looking for some guidance as this issue has been plaguing me for a little while now and I'm almost positive it's related to something I'm doing inefficiently.

I am using the "check_logfiles" plugin against my syslog at /var/log/messages. I wanted the granularity of defining different properties and thresholds for different patterns, so I use a separate .cfg file and a separate Nagios service check for each set of patterns. These service checks have been producing many socket timeouts. The timeouts are not constant, but they occur intermittently all day long, on different hosts.

Note that other, unrelated checks do not exhibit the same "socket timeout" behavior.

Here are the config files in question:

check_logfiles_messages_qla_critical.cfg
@searches = (
  {
    tag => 'critical qla',
    logfile => '/var/log/messages',
    criticalpatterns => 'Abort command issued nexus',
    options => "criticalthreshold=15",
  },
);

check_logfiles_messages_qla_warning.cfg
@searches = (
  {
    tag => 'warning qla',
    logfile => '/var/log/messages',
    warningpatterns => ['QUEUE FULL detected', 'FCPort state transitioned from'],
    options => "warningthreshold=8",
  },
);

Other examples that seem to run just fine (no intermittent socket timeouts)...
@searches = (
  {
    tag => 'lpfc',
    logfile => '/var/log/messages',
    criticalpatterns => 'kernel: lpfc',
  },
);

Below is how the Nagios command is issued; sudoers has already been configured. I recently added the --rununique flag to see if that would help, but it hasn't. Any help, guidance, or insight into what this plugin is doing that I might be overlooking would be extremely helpful! For example, I know that a temporary index file gets created. Is it possible that several of these index files are being created and conflicting with each other, or somehow confusing the script?

/usr/bin/sudo /usr/lib64/nagios/plugins/check_logfiles --rununique -f /etc/nagios/plugins/check_logfiles_messages_qla_critical.cfg
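
One way to narrow this down might be to time the command above when it is run by hand and compare the duration with whatever timeout the monitoring side enforces. The sketch below is only an illustration: the helper name `timed_run` and the 10-second default are assumptions, not values taken from this setup, and it does not use any check_logfiles internals.

```python
import subprocess
import sys
import time


def timed_run(command, timeout_seconds=10.0):
    """Run a command and report how long it took.

    Returns (exit_code, elapsed_seconds, timed_out). exit_code is None
    when the command was killed for exceeding timeout_seconds.
    The 10-second default is an assumed value, not one from the thread.
    """
    start = time.perf_counter()
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return None, time.perf_counter() - start, True
    return completed.returncode, time.perf_counter() - start, False


if __name__ == "__main__":
    # Pass the full plugin command line as arguments to this script.
    command = sys.argv[1:]
    if not command:
        sys.exit("usage: timed_run.py COMMAND [ARGS...]")
    code, elapsed, timed_out = timed_run(command)
    state = "timed out" if timed_out else f"exit code {code}"
    print(f"{elapsed:.2f}s  {state}")
```

Running it repeatedly on an affected host, with the sudo command line above as arguments, could show whether the slow runs line up with the intermittent timeouts.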

lausser · 2020-10-19T19:08:16Z

Socket timeout might be a dns problem, which is out of the scope of this plugin. As I don’t see the error message, I can only speculate. Check_logfiles finds out the hostname of the server it is running on. And this is the only place where I can imagine sockets to be involved. You might consider running the nscd.
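
To test the DNS theory, a rough probe like the following could be run on an affected host. It is a sketch using only the Python standard library, not the plugin's own lookup code; the function names `time_call` and `probe_hostname_resolution` are made up for illustration, and how check_logfiles actually resolves the hostname is not shown in this thread.

```python
import socket
import time


def time_call(label, func, *args):
    """Call func(*args) and return (label, result_or_error, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        result = func(*args)
    except OSError as exc:  # also covers socket.gaierror and socket.herror
        result = f"error: {exc}"
    return label, result, time.perf_counter() - start


def probe_hostname_resolution(rounds=5):
    """Time local hostname and FQDN lookups a few times in a row."""
    timings = []
    for _ in range(rounds):
        name = socket.gethostname()
        timings.append(time_call("gethostname", socket.gethostname))
        # getfqdn may query the resolver, so this is the call that would
        # stall if name resolution is slow or intermittently failing.
        timings.append(time_call("getfqdn", socket.getfqdn, name))
    return timings


if __name__ == "__main__":
    for label, result, elapsed in probe_hostname_resolution():
        print(f"{label:12s} {elapsed * 1000:8.1f} ms  {result}")
```

If the `getfqdn` rows occasionally take seconds rather than milliseconds, that would be consistent with a resolver problem, and a caching daemon such as nscd might help.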
