Re: [PATCH] Documentation: coding-style: don't encourage WARN*()

From: Jason Gunthorpe
Date: Thu Apr 18 2024 - 11:57:42 EST


On Mon, Apr 15, 2024 at 09:26:40AM -0700, Kees Cook wrote:
> On Mon, Apr 15, 2024 at 10:35:21AM +0200, Greg KH wrote:
> > On Mon, Apr 15, 2024 at 01:07:41AM -0700, Christoph Hellwig wrote:
> > > No, this advice is wronger than wrong. If you set panic_on_warn you
> > > get to keep the pieces.
> > >
> >
> > But don't add new WARN() calls please, just properly clean up and handle
> > the error. And any WARN() that userspace can trigger ends up triggering
> > syzbot reports which also is a major pain, even if you don't have
> > panic_on_warn enabled.
>
> Here's what was more recently written on WARN:
>
> https://docs.kernel.org/process/deprecated.html#bug-and-bug-on
>
> Specifically:
>
> - never use BUG*()
> - WARN*() should only be used for "expected to be unreachable" situations
>
> This, then, maps correctly to panic_on_warn: System owners may have set
> the panic_on_warn sysctl, to make sure their systems do not continue
> running in the face of "unreachable" conditions.
>
> As in, userspace should _never_ be able to reach a WARN(). If it can,
> either the logic leading to it needs to be fixed, or the WARN() needs to
> be changed to a pr_warn().

Exactly! No doubt there are mistakes in the tree, but we already
document this, a few lines above the text this patch touches:

Do not WARN lightly
*******************

WARN*() is intended for unexpected, this-should-never-happen situations.
WARN*() macros are not to be used for anything that is expected to happen
during normal operation. These are not pre- or post-condition asserts, for
example. Again: WARN*() must not be used for a condition that is expected
to trigger easily, for example, by user space actions. pr_warn_once() is a
possible alternative, if you need to notify the user of a problem.

Usages following that advice should be left alone and more should be
added. Invariant checks that indicate the kernel is malfunctioning are
desirable things to have!
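
To make the distinction concrete, here is a minimal userspace sketch,
not kernel code. MOCK_WARN_ON(), struct demo_ring and
demo_ring_reserve() are names made up for illustration. The mock only
prints and counts, while the real WARN_ON() also dumps a backtrace and
honours panic_on_warn. The request length stands in for something user
space controls, so it gets a plain error return. The head/tail
relationship is an invariant that only our own code maintains, so that
is where the WARN belongs:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for WARN_ON(): report, count, return cond. */
static int mock_warn_count;

static bool mock_warn_on(bool cond, const char *expr, const char *file, int line)
{
	if (cond) {
		mock_warn_count++;
		fprintf(stderr, "WARNING: %s:%d: %s\n", file, line, expr);
	}
	return cond;
}

#define MOCK_WARN_ON(cond) mock_warn_on(!!(cond), #cond, __FILE__, __LINE__)

/* Hypothetical ring buffer bookkeeping, for illustration only. */
struct demo_ring {
	size_t head;	/* total bytes ever reserved */
	size_t tail;	/* total bytes ever consumed */
	size_t size;	/* capacity in bytes */
};

/* 'len' is assumed to come from user space; 'r' is owned by our code. */
static int demo_ring_reserve(struct demo_ring *r, size_t len)
{
	size_t used = r->head - r->tail;

	/*
	 * Invariant: we never reserve more than the capacity, so this can
	 * only fire if our own bookkeeping is broken. That is a bug, and a
	 * WARN is the right tool.
	 */
	if (MOCK_WARN_ON(used > r->size))
		return -EIO;

	/* User-triggerable condition: return an error, do not WARN. */
	if (len > r->size - used)
		return -ENOSPC;

	r->head += len;
	return 0;
}
```

An oversized request just gets -ENOSPC and no warning, so a fuzzer
cannot trip it. Only corrupted bookkeeping reaches the warning path.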

Yes, by all means tell people to follow the above rules! But that
isn't a ban on WARN and shouldn't be communicated as "don't add new
WARN() calls please".

Let's all keep in mind that fuzzing reports are incredibly valuable to
make the kernel more secure and robust. We actually want *more*
invariants that indicate bugs for the fuzzers to trip up on!

As above, a correctly used WARN should indicate a definite bug if it
triggers.

I'd guess about 30-40% of the syzkaller-found bugs I've dealt with are
from a correct use of WARN_ON, not oops/KASAN/etc. I wonder what a
datamine on the whole syzkaller database would indicate.

pr_warn/etc don't trigger fuzzer faults, and don't give a debugging
backtrace.

I also find it strange that we want panic_on_warn to exist, and people
want to use it, while also saying that the WARN() calls that actually
make it do something valuable are forbidden :(

Jason