criticalthreshold not respected? #81
Comments
Potential solution
The plan to solve the bug involves verifying and correcting the parsing and handling of the

What is causing this bug?
The bug is likely caused by either incorrect parsing of the

Code
To address this issue, we need to:

Step 1: Verify Parsing
Ensure that the

# check_logfiles.pl
# Add debugging statement after parsing command-line options
print STDERR "Parsed criticalthreshold: $commandline{criticalthreshold}\n" if exists $commandline{criticalthreshold};
# Ensure criticalthreshold is included in the options passed to Nagios::CheckLogfiles
if (my $cl = Nagios::CheckLogfiles->new({
    ...
    options => join(',', grep { $_ }
        ...
        $commandline{criticalthreshold} ? "criticalthreshold=".$commandline{criticalthreshold} : undef,
        ...
    ),
    ...
})) {
    ...
}

Step 2: Ensure Correct Application
Review and correct the logic within the

# Nagios/CheckLogfiles.pm
# Add debugging statement to trace the application of criticalthreshold
sub check_thresholds {
    my ($self, $count) = @_;
    print STDERR "Checking thresholds with count: $count and criticalthreshold: $self->{criticalthreshold}\n";
    if ($count >= $self->{criticalthreshold}) {
        return 'CRITICAL';
    } elsif ($count >= $self->{warningthreshold}) {
        return 'WARNING';
    } else {
        return 'OK';
    }
}

Step 3: Add Detailed Debugging Statements
Add more detailed debugging statements to trace the internal states and threshold counts more precisely.

# Nagios/CheckLogfiles.pm
# Add debugging statements around threshold checks
sub analyze_logfile {
    my ($self, $logfile) = @_;
    my $count = 0;
    while (my $line = <$logfile>) {
        if ($line =~ /$self->{criticalpattern}/) {
            $count++;
        }
    }
    print STDERR "Total critical pattern matches: $count\n";
    return $self->check_thresholds($count);
}

How to replicate the bug
To replicate the bug, follow these steps:

Example configuration file (

$seekfilesdir = '/var/tmp/check_logfiles';
$protocolsdir = '/var/tmp/check_logfiles';
$scriptpath = '/usr/lib64/nagios/plugins';
@searches = (
    {
        tag => 'icinga2_client_handshake_errors',
        logfile => '/var/log/icinga2/icinga2.log',
        criticalpatterns => [
            'Client TLS handshake failed'
        ],
        options => 'noprotocol,nosticky,nosavethresholdcount,nosavestate,criticalthreshold=10,warningthreshold=5,maxage=15m',
    }
);

Command to run the plugin:

'/usr/bin/sudo' '/usr/lib64/nagios/plugins/check_logfiles' '--config' '/etc/nagios/logfile_icinga.cfg' '--tag' 'icinga2_client_handshake_errors'

By following these steps, you should be able to observe the bug and verify that the solution correctly addresses the issue.

Files used for this task: Changes on t/007threshold.t
Analysis Report for
The threshold option works correctly. The log file actually contains a large number of occurrences of the patterns.
Is there a way to make the output show the actual number of errors (34) instead of 3?
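The 34-versus-3 figures would be consistent with the plugin reporting one critical event for every criticalthreshold matching lines rather than the raw match count. Below is a minimal sketch of that arithmetic, assuming this counting model; the helper name reported_criticals is hypothetical and is not part of check_logfiles.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of check_logfiles): number of critical events
# reported if only every $threshold-th matching line counts. This counting
# model is an assumption inferred from the 34 -> 3 figures in this thread.
sub reported_criticals {
    my ($matches, $threshold) = @_;
    # Without a usable threshold, every match is reported.
    return $matches if !$threshold || $threshold < 1;
    return int($matches / $threshold);
}

printf "%d matches, threshold %d -> %d reported\n", 34, 10, reported_criticals(34, 10);
printf "%d matches, threshold %d -> %d reported\n", 9, 10, reported_criticals(9, 10);
```

Under this assumption, a run with fewer than 10 matching lines would report no critical events at all, which is the behaviour the original report expected.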
Seen a weird problem today where check_logfiles correctly identifies error patterns in a log file. The config file sets multiple options, including criticalthreshold=10, yet the plugin reports a CRITICAL status when finding a number of error lines below the threshold.
Config file:
Command line usage would be:
'/usr/bin/sudo' '/usr/lib64/nagios/plugins/check_logfiles' '--config' '/etc/nagios/logfile_icinga.cfg' '--tag' 'icinga2_client_handshake_errors'
The Icinga2 alert history shows that the status of this service check switches to critical already after finding just a single error line within the run.
To my understanding this should only be the case if 10 or more error lines were found for this run? Or am I misunderstanding something or potentially breaking things with one of the other options?
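To rule out a typo or an unexpected interaction among the options, it can help to split the options string and inspect each entry on its own. The following is a minimal standalone sketch; parse_options is a hypothetical helper written for illustration and does not reflect how check_logfiles parses options internally.

```perl
use strict;
use warnings;

# Hypothetical helper: split a comma-separated options string into a hash.
# "key=value" entries keep their value; bare flags are stored as 1.
sub parse_options {
    my ($string) = @_;
    my %opts;
    for my $item (split /\s*,\s*/, $string) {
        next if $item eq '';
        my ($key, $value) = split /=/, $item, 2;
        $opts{$key} = defined $value ? $value : 1;
    }
    return \%opts;
}

my $opts = parse_options(
    'noprotocol,nosticky,nosavethresholdcount,nosavestate,'
  . 'criticalthreshold=10,warningthreshold=5,maxage=15m'
);

# Print each option so a missing or misspelled threshold is easy to spot.
printf "%-22s %s\n", $_, $opts->{$_} for sort keys %$opts;
```

If criticalthreshold and warningthreshold appear here with the expected values, the option string itself is well-formed and the question narrows to how the thresholds are applied.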