Re: [PATCH] Documentation: coding-style: don't encourage WARN*()
From: Eric Biggers
Date: Thu Apr 18 2024 - 12:14:48 EST
On Sun, Apr 14, 2024 at 12:08:50PM -0500, Alex Elder wrote:
> diff --git a/Documentation/process/coding-style.rst b/Documentation/process/coding-style.rst
> index 9c7cf73473943..bce43b01721cb 100644
> --- a/Documentation/process/coding-style.rst
> +++ b/Documentation/process/coding-style.rst
> @@ -1235,17 +1235,18 @@ example. Again: WARN*() must not be used for a condition that is expected
> to trigger easily, for example, by user space actions. pr_warn_once() is a
> possible alternative, if you need to notify the user of a problem.
>
> -Do not worry about panic_on_warn users
> -**************************************
> +The panic_on_warn kernel option
> +********************************
>
> -A few more words about panic_on_warn: Remember that ``panic_on_warn`` is an
> -available kernel option, and that many users set this option. This is why
> -there is a "Do not WARN lightly" writeup, above. However, the existence of
> -panic_on_warn users is not a valid reason to avoid the judicious use
> -WARN*(). That is because, whoever enables panic_on_warn has explicitly
> -asked the kernel to crash if a WARN*() fires, and such users must be
> -prepared to deal with the consequences of a system that is somewhat more
> -likely to crash.
> +Note that ``panic_on_warn`` is an available kernel option. If it is enabled,
> +a WARN*() call whose condition holds leads to a kernel panic. Many users
> +(including Android and many cloud providers) set this option, and this is
> +why there is a "Do not WARN lightly" writeup, above.
> +
> +The existence of this option is not a valid reason to avoid the judicious
> +use of warnings. There are other options: ``dev_warn*()`` and ``pr_warn*()``
> +issue warnings but do **not** cause the kernel to crash. Use these if you
> +want to prevent such panics.
>
Nacked-by: Eric Biggers <ebiggers@xxxxxxxxxx>
WARN*() are for recoverable assertions, i.e. situations where the condition
being true can only happen due to a kernel bug but where they can be recovered
from (unlike BUG*() which are for unrecoverable situations). The people who use
panic_on_warn *want* the kernel to crash when such a situation happens so that
the underlying issue can be discovered and fixed. That's the whole point.
Also, it's not true that "Android" sets this option. It might be the case that
some individual Android OEMs have decided to use it for some reason (they do
have the ability to customize their kernel command line, after all). It's
certainly not used by default or even recommended.
- Eric