nagios-nrpe-server-4 is creating incorrect log spam

Bug #1888184 reported by JP Vossen
38
This bug affects 7 people
Affects Status Importance Assigned to Milestone
nagios-nrpe (Debian)
New
Undecided
Unassigned
nagios-nrpe (Ubuntu)
Incomplete
Undecided
Unassigned

Bug Description

Server: Debian 10 (buster), nagios-nrpe-plugin-3.2.1-2
Node: Linux Mint 20 (Ulyana), using Ubuntu Focal, nagios-nrpe-server-4.0.0-2ubuntu1

Icinga1 server on Debian monitoring a bunch of LinuxMint-19.3, using protocol v3, and working just fine since last year. I just built 2x new install Mint-20.0 nodes and on *both* of them, for every single `check_nrpe` from the server to the node, the node spams its logs with:
* Jul 19 22:02:56 node nrpe[13152]: Error: (use_ssl == true): Request packet version was invalid!
* Jul 19 22:02:56 node nrpe[13152]: Could not read request from client 192.168.1.11, bailing out...
* Jul 19 22:02:56 node nrpe[13152]: INFO: SSL Socket Shutdown.

And the server spams its logs with:
* Jul 19 22:02:05 server check_nrpe: Remote 192.168.1.120 does not support Version 3 Packets

Yet despite all of that noise, the checks actually work just fine.

Expected: the checks using nrpe-v3 should Just Work with no incorrect log spam
What happens: the nrpe-v3 checks work but spam the logs on both sides

Nice-to-have: any clues on how to cut down the log spam until the fix is released.
Using `/etc/nagios/nrpe.d/node.cfg` to set `debug=1` or `debug=0` works to turn debugs logs on and off, but that has no effect on the rest of the spam.

Server side sanity checks:
```
$ /usr/lib/nagios/plugins/check_nrpe -H node-mint20.0 -n
CHECK_NRPE: Receive header underflow - only -1 bytes received (4 expected).

$ /usr/lib/nagios/plugins/check_nrpe -H node-mint20.0 -2 # (NO LOG SPAM)
NRPE v4.0.0

$ /usr/lib/nagios/plugins/check_nrpe -H node-mint20.0 # (v3 STILL WORKS, but with log spam)
NRPE v4.0.0

$ /usr/lib/nagios/plugins/check_nrpe -H node-mint19.3
NRPE v3.2.1
```

SERVER side test run, with deliberate errors to bracket the logs:
```
[root@drake:T1:L1:C4975:J0:2020-07-20_01:02:04_EDT]
/root# /usr/lib/nagios/plugins/check_nrpe -H node -c check_load_start
NRPE: Command 'check_load_start' not defined

[root@drake:T1:L1:C4976:J0:2020-07-20_01:02:07_EDT]
/root# /usr/lib/nagios/plugins/check_nrpe -H node -c check_load
OK - load average: 0.33, 0.14, 0.05|load1=0.330;3.000;5.000;0; load5=0.140;2.000;4.000;0; load15=0.050;2.000;4.000;0;

[root@drake:T1:L1:C4976:J0:2020-07-20_01:02:12_EDT]
/root# /usr/lib/nagios/plugins/check_nrpe -H node -c check_end
NRPE: Command 'check_end' not defined
```

NODE side debug log, notes the errors, note the LOG SPAM, note the check still runs:
```
Jul 20 01:01:52 node systemd[1]: nagios-nrpe-server.service: Succeeded.
Jul 20 01:01:52 node nrpe[18424]: Added command[check_ntp_peer]=/usr/lib/nagios/plugins/check_ntp_peer -H localhost
Jul 20 01:01:52 node nrpe[18424]: Added command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 325 -c 360
Jul 20 01:01:52 node nrpe[18424]: Added command[check_disks]=/usr/lib/nagios/plugins/check_disk -w 15% -c 8% -l -X tmpfs -X udev -X usbfs -X fuse -X fuse.sshfs -X fuse.gvfsd-fuse -X fuse.gvfs-fuse-daemon -X squashfs -x /run/docker/netns/default
Jul 20 01:01:52 node nrpe[18424]: Added command[check_load]=/usr/lib/nagios/plugins/check_load -w 3,2,2 -c 5,4,4
Jul 20 01:01:52 node nrpe[18424]: INFO: SSL/TLS initialized. All network traffic will be encrypted.
Jul 20 01:01:52 node nrpe[18424]: Starting up daemon
Jul 20 01:01:52 node nrpe[18424]: SETUP_WAIT_CONN FOR: IPv4 address: 0.0.0.0 ((null))
Jul 20 01:01:52 node nrpe[18424]: Server listening on 0.0.0.0 port 5666.
Jul 20 01:01:52 node nrpe[18424]: SETUP_WAIT_CONN FOR: IPv4 address: :: ((null))
Jul 20 01:01:52 node nrpe[18424]: Server listening on :: port 5666.
Jul 20 01:01:52 node nrpe[18424]: Listening for connections on port 5666
Jul 20 01:01:52 node nrpe[18424]: Allowing connections from: 127.0.0.1,192.168.1.11
Jul 20 01:02:07 node nrpe[19664]: CONN_CHECK_PEER: checking if host is allowed: 192.168.1.11 port 12975
Jul 20 01:02:07 node nrpe[19664]: Connection from 192.168.1.11 port 12975
Jul 20 01:02:07 node nrpe[19664]: is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<
Jul 20 01:02:07 node nrpe[19664]: message repeated 2 times: [ is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<]
Jul 20 01:02:07 node nrpe[19664]: is_an_allowed_host (AF_INET): host is in allowed host list!
Jul 20 01:02:07 node nrpe[19664]: Host address is in allowed_hosts
Jul 20 01:02:07 node nrpe[19664]: Error: (use_ssl == true): Request packet version was invalid!
Jul 20 01:02:07 node nrpe[19664]: Could not read request from client 192.168.1.11, bailing out...
Jul 20 01:02:07 node nrpe[19664]: INFO: SSL Socket Shutdown.
Jul 20 01:02:07 node nrpe[19664]: Connection from 192.168.1.11 closed.
Jul 20 01:02:07 node nrpe[19671]: CONN_CHECK_PEER: checking if host is allowed: 192.168.1.11 port 13487
Jul 20 01:02:07 node nrpe[19671]: Connection from 192.168.1.11 port 13487
Jul 20 01:02:07 node nrpe[19671]: is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<
Jul 20 01:02:07 node nrpe[19671]: message repeated 2 times: [ is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<]
Jul 20 01:02:07 node nrpe[19671]: is_an_allowed_host (AF_INET): host is in allowed host list!
Jul 20 01:02:07 node nrpe[19671]: Host address is in allowed_hosts
Jul 20 01:02:07 node nrpe[19671]: Host 192.168.1.11 is asking for command 'check_load_start' to be run...
Jul 20 01:02:07 node nrpe[19671]: NRPE: Command 'check_load_start' not defined
Jul 20 01:02:07 node nrpe[19671]: Return Code: 3, Output: NRPE: Command 'check_load_start' not defined
Jul 20 01:02:07 node nrpe[19671]: Connection from 192.168.1.11 closed.
Jul 20 01:02:11 node nrpe[19673]: CONN_CHECK_PEER: checking if host is allowed: 192.168.1.11 port 14511
Jul 20 01:02:11 node nrpe[19673]: Connection from 192.168.1.11 port 14511
Jul 20 01:02:11 node nrpe[19673]: is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<
Jul 20 01:02:11 node nrpe[19673]: message repeated 2 times: [ is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<]
Jul 20 01:02:11 node nrpe[19673]: is_an_allowed_host (AF_INET): host is in allowed host list!
Jul 20 01:02:11 node nrpe[19673]: Host address is in allowed_hosts
Jul 20 01:02:12 node nrpe[19673]: Error: (use_ssl == true): Request packet version was invalid!
Jul 20 01:02:12 node nrpe[19673]: Could not read request from client 192.168.1.11, bailing out...
Jul 20 01:02:12 node nrpe[19673]: INFO: SSL Socket Shutdown.
Jul 20 01:02:12 node nrpe[19673]: Connection from 192.168.1.11 closed.
Jul 20 01:02:12 node nrpe[19675]: CONN_CHECK_PEER: checking if host is allowed: 192.168.1.11 port 15023
Jul 20 01:02:12 node nrpe[19675]: Connection from 192.168.1.11 port 15023
Jul 20 01:02:12 node nrpe[19675]: is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<
Jul 20 01:02:12 node nrpe[19675]: message repeated 2 times: [ is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<]
Jul 20 01:02:12 node nrpe[19675]: is_an_allowed_host (AF_INET): host is in allowed host list!
Jul 20 01:02:12 node nrpe[19675]: Host address is in allowed_hosts
Jul 20 01:02:12 node nrpe[19675]: Host 192.168.1.11 is asking for command 'check_load' to be run...
Jul 20 01:02:12 node nrpe[19675]: Running command: /usr/lib/nagios/plugins/check_load -w 3,2,2 -c 5,4,4
Jul 20 01:02:12 node nrpe[19676]: WARNING: my_system() seteuid(0): Operation not permitted
Jul 20 01:02:12 node nrpe[19675]: Command completed with return code 0 and output: OK - load average: 0.33, 0.14, 0.05|load1=0.330;3.000;5.000;0; load5=0.140;2.000;4.000;0; load15=0.050;2.000;4.000;0;
Jul 20 01:02:12 node nrpe[19675]: Return Code: 0, Output: OK - load average: 0.33, 0.14, 0.05|load1=0.330;3.000;5.000;0; load5=0.140;2.000;4.000;0; load15=0.050;2.000;4.000;0;
Jul 20 01:02:12 node nrpe[19675]: Connection from 192.168.1.11 closed.
Jul 20 01:02:15 node nrpe[19684]: CONN_CHECK_PEER: checking if host is allowed: 192.168.1.11 port 16047
Jul 20 01:02:15 node nrpe[19684]: Connection from 192.168.1.11 port 16047
Jul 20 01:02:15 node nrpe[19684]: is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<
Jul 20 01:02:15 node nrpe[19684]: message repeated 2 times: [ is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<]
Jul 20 01:02:15 node nrpe[19684]: is_an_allowed_host (AF_INET): host is in allowed host list!
Jul 20 01:02:15 node nrpe[19684]: Host address is in allowed_hosts
Jul 20 01:02:15 node nrpe[19684]: Error: (use_ssl == true): Request packet version was invalid!
Jul 20 01:02:15 node nrpe[19684]: Could not read request from client 192.168.1.11, bailing out...
Jul 20 01:02:15 node nrpe[19684]: INFO: SSL Socket Shutdown.
Jul 20 01:02:15 node nrpe[19684]: Connection from 192.168.1.11 closed.
Jul 20 01:02:15 node nrpe[19686]: CONN_CHECK_PEER: checking if host is allowed: 192.168.1.11 port 16559
Jul 20 01:02:15 node nrpe[19686]: Connection from 192.168.1.11 port 16559
Jul 20 01:02:15 node nrpe[19686]: is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<
Jul 20 01:02:15 node nrpe[19686]: message repeated 2 times: [ is_an_allowed_host (AF_INET): is host >192.168.1.11< an allowed host >192.168.1.11<]
Jul 20 01:02:15 node nrpe[19686]: is_an_allowed_host (AF_INET): host is in allowed host list!
Jul 20 01:02:15 node nrpe[19686]: Host address is in allowed_hosts
Jul 20 01:02:15 node nrpe[19686]: Host 192.168.1.11 is asking for command 'check_end' to be run...
Jul 20 01:02:15 node nrpe[19686]: NRPE: Command 'check_end' not defined
Jul 20 01:02:15 node nrpe[19686]: Return Code: 3, Output: NRPE: Command 'check_end' not defined
Jul 20 01:02:15 node nrpe[19686]: Connection from 192.168.1.11 closed.
```

Tags: bseng-29
Revision history for this message
Lucas Kanashiro (lucaskanashiro) wrote :

Thanks for taking the time to report this bug and trying to make Ubuntu better.

I've seen some people facing issues regarding the packet version compatibility, you have version 3 and 4 in your setup. I tried to make a quick research about any compatibility issue between version 3 and 4 but I did not find something useful. I found a recent post on the nagios support forum but I do not have permission to access it, from its short description I got that "agents are falling back to packet version 2 when query an agent in version 4 by a plugin in version 3". This seems to match your case and it might be an upstream issue. It'd be great if you could engage in a discussion about it with upstream. In case you do that please link it here.

Since you described it is still working, you can try to follow one of the solutions described in this article to silence those errors for now:

https://support.nagios.com/kb/article/nrpe-remote-does-not-support-version-3-packets-786.html

Changed in nagios-nrpe (Ubuntu):
status: New → Triaged
Revision history for this message
JP Vossen (jp-jpsdomain) wrote :

Thanks for the clues! I think I'd seen that URL in my research but the `rsyslog` filter had escaped me. I'll see about pestering upstream when time permits, any hints on where or who to pester appreciated.

For the record, for Mint-20 or Ubuntu Focal using stock `rsyslog` this prevented the log spam but isn't a real fix:
```
sudo vi /etc/rsyslog.d/10-nrpe-log-spam.conf
sudo systemctl restart rsyslog
```

File contents to "fix" NODE spam:
```
# /etc/rsyslog.d/10-nrpe-log-spam.conf
# 2020-07-26 Sun: Suppress useless NRPE log spam
# See https://bugs.launchpad.net/ubuntu/+source/nagios-nrpe/+bug/1888184
:msg, contains, "Request packet version was invalid" stop
:msg, contains, "Could not read request from client 192.168.1.11, bailing out" stop
:msg, contains, "INFO: SSL Socket Shutdown" stop
```

File contents to "fix" Server spam:
```
# /etc/rsyslog.d/10-nrpe-log-spam.conf
# 2020-07-26 Sun: Suppress useless NRPE log spam
# See https://bugs.launchpad.net/ubuntu/+source/nagios-nrpe/+bug/1888184
:msg, contains, "does not support Version 3 Packets" stop
```

Revision history for this message
Ken Marsh (kmarsh) wrote :

This is a Focal v4 check_nrpe querying bionic NRPE v3 and should work without spamming logs. I am seeing in syslog:

check_nrpe - - - Remote 10.1.2.3 does not support version 3/4 packets

However checks are working.

Revision history for this message
Christian Clifford (cmakaloclifford) wrote :

We have similar issue but with the same version on both sides.

The checks actually works (it seems), but generates a lot of these errors shown below.

### On a server being checked ###

# Installed packages:
nagios-nrpe-server/focal,focal,now 4.0.0-2ubuntu1 amd64 [installed]

# Output from syslog:
2021-09-30 15:50:22 INFO: SSL Socket Shutdown.
2021-09-30 15:50:22 Could not read request from client *IP*, bailing out...
2021-09-30 15:50:22 Error: (use_ssl == true): Request packet version was invalid!

### On the monitor server ###

# Installed packages:
nagios-nrpe-plugin/focal,now 4.0.0-2ubuntu1 amd64 [installed]
nagios-nrpe-server/focal,now 4.0.0-2ubuntu1 amd64 [installed]

# Output from syslog:
2021-09-30 15:50:22 Remote *IP* does not support version 3/4 packets

no longer affects: nagios-nrpe (Debian)
Revision history for this message
Bill McGonigle (bill-launchpad-net) wrote :

> I found a recent post on the nagios support forum but I do not have permission to access it, from its short description I got that "agents are falling back to packet version 2 when query an agent in version 4 by a plugin in version 3"

Googlebot can see it. From there:

"The way the NRPE plugin negotiates the SSL version of the agent does generate a lot of messages in the remote server but if the checks does work, you can disable the message by changing the following option in the nrpe.cfg file to a 1. That should stop all logging for the agent."

It's quite dissatisfying but it seems to say that the plugin's *version probing logic* causes invalid error messages on the agent, and if you don't like that you should have rsyslog filter out the invalid error messages. Not how I would structure the protocol but elsewhere it's said that commercial nagios is moving to HTTP and deprecating nrpe protocol so it's unlikely to be fixed.

I wiresharked this for a bit and it does appear that *no* cleartext traffic is transiting the link. e.g.:

07:14:24.538748 IP 192.168.1.246.39798 > 192.168.1.43.5666: Flags [P.], seq 836:1901, ack 826, win 506, options [nop,nop,TS val 10310092 ecr 4249413257], length 1065
E..].g@.?.i.
.
.
..+.v."
.&.e1.L....W......
..Q..H......$..H ..h...&}...hk.!%./...z..#"..Gq.g...z!Bx..j].m<.0w'C.s..A..P&WFTci.9.. |.......E6.d......?..d.5..5...MU..:0.v./...\S.....&.X.?);

So I believe it is not falling back to v2 and is actually using v3 with ssl. One suggestion says to add '-2' to the check_nrpe command but that seems like really bad advice to me (at least for anywhere where network security is a consideration).

my config:

ii nagios-nrpe-plugin 3.2.1-2 amd64 Nagios Remote Plugin Executor Plugin
ii nagios-nrpe-server 4.0.3-1 amd64 Nagios Remote Plugin Executor Server

So it appears that upstream is saying to do a workaround and it's wontfix on their end. Disappointing but not surprising.

Alvaro Uria (aluria)
tags: added: bseng-29
Revision history for this message
Christian Ehrhardt  (paelzer) wrote (last edit ):

Hi,
coming around cleaning older cases ...

Thanks everyone for the discussion!
On one hand I'm glad we managed to collect a bunch of log suppression and configuration workarounds. On the other hand it is sad that this seems to be mostly just "as it is" with no great way out (e.g. a fix in a later version one could try to backport).

After Bill outlined - out of the upstream Forum - that it is a known issue with likely no fix I'd say until there is any further insight how to address this in a package update there isn't much one can act on.

The bug will stay around and be helpful for anyone searching for the error message, but I'll set the task to incomplete until there is any change to the overall situation that allows to act and improve it.

Changed in nagios-nrpe (Ubuntu):
status: Triaged → Incomplete
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.