JP Vossen via plug on 7 Jun 2024 13:07:42 -0700



[PLUG] root "pkill: killing pid * failed: Operation not permitted"


What could cause "pkill: killing pid * failed: Operation not permitted" *when run by root*?

After patching and reboots the other day, I started getting daily Anacron emails from Logrotate on most (but not all) of our 50+ VMs saying:
```
/etc/cron.daily/logrotate:
pkill: killing pid NNN failed: Operation not permitted
```

The culprit is the (quite horrible, but mandatory) Crowdstrike `falcon-agent` service, running from the stock vendor RPM that has not changed since April, and we've had patching reboots since then.

The really confusing thing is that *most* of them are doing this, but *not* all, and I can't find any differences!  The 50+ VMs are a mix of (quite horrible, but mandatory) Oracle Linux 7.9 (EoL soon, thus migrating) and 8.10, but the problem doesn't follow the distro.  Also, a few of the ones that complained on Wed did not complain on Thu, so they "fixed" themselves?

When I manually run the relevant line from `/etc/logrotate.d/falcon-sensor` *as root*, it either silently works or fails with the error above, matching whether or not that VM sends the Anacron email.  So the problem isn't the command itself, `/usr/bin/pkill -HUP falcon-sensor`, and it is definitely running as root.  It just...sometimes works and sometimes doesn't.
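Since the service has more than one related process (`falcond` plus `falcon-sensor`), one thing I've been doing is sending SIGHUP to each matched PID individually, bypassing pkill, to see exactly which PID refuses the signal.  A rough sketch (`hup_each` is just a throwaway name of mine):

```shell
# Sketch: send SIGHUP to every process exactly matching a name, one PID
# at a time, and report per-PID success/failure.  Run as root on an
# affected VM with: hup_each falcon-sensor
hup_each() {
    for pid in $(pgrep -x "$1" || true); do
        if kill -HUP "$pid" 2>/dev/null; then
            echo "pid $pid: SIGHUP delivered"
        else
            echo "pid $pid: SIGHUP refused"
        fi
    done
}
hup_each falcon-sensor
```

If only one of the two processes refuses the signal, that at least narrows down which one the kernel is protecting.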

The `falcon-sensor` process itself is running as root, as is the parent `falcond`.  Restarting via `systemctl restart falcon-sensor` doesn't help, and neither does a stop then a start.  Every VM has the same `falcon-sensor-7.01.0-15604.el7.x86_64` or `falcon-sensor-7.01.0-15604.el8.x86_64`, and both vendor RPMs have identical *stock* `/etc/logrotate.d/falcon-sensor` and `/usr/lib/systemd/system/falcon-sensor.service` files.

`/usr/bin/pkill` is also the same on working and broken servers, and SELinux is disabled.  They all have the same kernel (current for either OEL-7 or 8) and it is *not* UEK (the Oracle Unusable Enterprise Kernel that always ends in tears).  They all have plenty of free disk space.
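For what it's worth, EPERM from kill(2) while running as UID 0 with SELinux disabled suggests something kernel-side is refusing the signal; a security/anti-tamper module hooking the LSM layer is one suspect, given what the agent does.  So one thing worth diff-ing between a working and a broken VM is the signal- and security-related /proc state of the target PID.  A sketch (`proc_sig_state` is just an ad-hoc name; it defaults to the current shell's PID so it runs anywhere):

```shell
# Sketch: dump the kernel-side state of a PID that affects signal
# delivery, for diff-ing between a working and a broken VM.
proc_sig_state() {
    pid="${1:-$$}"
    # UIDs, effective capabilities, seccomp mode, and signal masks
    grep -E '^(Name|State|Uid|Gid|CapEff|Seccomp|SigBlk|SigIgn|SigCgt)' \
        "/proc/$pid/status"
    # LSM security label, if any module is active
    cat "/proc/$pid/attr/current" 2>/dev/null || true
}
proc_sig_state
```

On an affected VM, something like `proc_sig_state "$(pgrep -xo falcon-sensor)"` and the same on a working host, then diff the two.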

This is over-simplified (and skips some related nodes), but I have 5 groups of 8 identical "agent" VMs, and in 4 of the groups 7 VMs fail and 1 works.  The 5th group had only 3 bad out of 8, and then even those "fixed" themselves.

Clues?  Thanks,
JP
--  -------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug