Re: [PATCH] Documentation: coding-style: don't encourage WARN*()

From: John Hubbard
Date: Thu Apr 18 2024 - 18:34:01 EST


On 4/18/24 9:14 AM, Eric Biggers wrote:
On Sun, Apr 14, 2024 at 12:08:50PM -0500, Alex Elder wrote:
diff --git a/Documentation/process/coding-style.rst b/Documentation/process/coding-style.rst
index 9c7cf73473943..bce43b01721cb 100644
--- a/Documentation/process/coding-style.rst
+++ b/Documentation/process/coding-style.rst
@@ -1235,17 +1235,18 @@ example. Again: WARN*() must not be used for a condition that is expected
to trigger easily, for example, by user space actions. pr_warn_once() is a
possible alternative, if you need to notify the user of a problem.
-Do not worry about panic_on_warn users

This is still[1] good advice. panic_on_warn users have made a
trade-off that works for them, but that should not mean that
all of the valid cases for WARN*() suddenly disappear. In fact,
without the WARN*() calls, panic_on_warn is a no-op, as someone
else has already pointed out.


-**************************************
+The panic_on_warn kernel option
+********************************
-A few more words about panic_on_warn: Remember that ``panic_on_warn`` is an
-available kernel option, and that many users set this option. This is why
-there is a "Do not WARN lightly" writeup, above. However, the existence of
-panic_on_warn users is not a valid reason to avoid the judicious use
-WARN*(). That is because, whoever enables panic_on_warn has explicitly
-asked the kernel to crash if a WARN*() fires, and such users must be
-prepared to deal with the consequences of a system that is somewhat more
-likely to crash.
+Note that ``panic_on_warn`` is an available kernel option. If it is enabled,
+a WARN*() call whose condition holds leads to a kernel panic. Many users
+(including Android and many cloud providers) set this option, and this is
+why there is a "Do not WARN lightly" writeup, above.
+
+The existence of this option is not a valid reason to avoid the judicious
+use of warnings. There are other options: ``dev_warn*()`` and ``pr_warn*()``
+issue warnings but do **not** cause the kernel to crash. Use these if you
+want to prevent such panics.

Nacked-by: Eric Biggers <ebiggers@xxxxxxxxxx>

Yes. I agree with this NAK.


WARN*() are for recoverable assertions, i.e. situations where the condition
being true can only happen due to a kernel bug but where they can be recovered
from (unlike BUG*() which are for unrecoverable situations). The people who use
panic_on_warn *want* the kernel to crash when such a situation happens so that
the underlying issue can be discovered and fixed. That's the whole point.

Also, it's not true that "Android" sets this option. It might be the case that
some individual Android OEMs have decided to use it for some reason (they do
have the ability to customize their kernel command line, after all). It's
certainly not used by default or even recommended.

- Eric


[1] https://lore.kernel.org/lkml/0db131cf-013e-6f0e-c90b-5c1e840d869c@xxxxxxxxxx/T/#md6836102150ac1e6265569aad55a692e3af75f34

thanks,
--
John Hubbard
NVIDIA