Re: [PATCH] audit: add backlog high water mark metric
From: Paul Moore
Date: Thu Apr 16 2026 - 16:53:09 EST

On Thu, Apr 16, 2026 at 4:33 PM Steve Grubb <sgrubb@xxxxxxxxxx> wrote:
> On Wednesday, April 15, 2026 11:21:52 AM Eastern Daylight Time Paul Moore
> wrote:
> > On Wed, Apr 15, 2026 at 11:19 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > > On Tue, Apr 14, 2026 at 11:45 PM Steve Grubb <sgrubb@xxxxxxxxxx> wrote:
> > > > On Friday, April 10, 2026 5:34:08 PM Eastern Daylight Time Paul Moore
> wrote:
> > > > > On Mon, Mar 23, 2026 at 11:07 AM Ricardo Robaina
> > > > > <rrobaina@xxxxxxxxxx>
> > > >
> > > > wrote:
> > > ...
> > >
> > > > ... compliance-driven systems that must use a finite backlog limit for
> > > > memory safety but cannot tolerate dropped events ...
> > >
> > > You must pick one of those two requirements, or at the very least
> > > prioritize them; it is simply impossible to both limit the backlog
> > > queue and require zero dropped events.
> >
> > To be perfectly honest, it's also impossible to require zero dropped
> > events. Even in the most extreme configurations where the admin
> > decides to panic the system, that only happens once the system reaches
> > the point where it is dropping events. We try *really* hard to not
> > drop events, but it is always going to be a possibility.
>
> You're helping make the point. Those administrators have decided reliable
> auditing is more important than system availability.

Users prioritizing reliable auditing over system availability should
not run with a backlog limit. It's that simple.

Regardless, I'm still not convinced this maximum backlog stat alone
will solve any meaningful problems. If your audit log is predictable
enough that this metric has value, it should be possible to either
capture the backlog size during periods of high audit load or simply
run the system through that load and verify it doesn't crash and go to
hell. If your audit log isn't predictable, capturing a maximum
backlog size doesn't really mean anything since it is still a snapshot
of one instance of the system and there is always the possibility of
the system exceeding it.

--
paul-moore.com