[PATCH] include/asm-generic/bug.h: clarify valid uses of WARN()

From: Dmitry Vyukov
Date: Wed Jun 20 2018 - 06:37:30 EST


From: Dmitry Vyukov <dvyukov@xxxxxxxxxx>

Explicitly state that WARN*() should be used only for recoverable
kernel issues/bugs and that it should not be used for any kind of
invalid external inputs or transient conditions.

Motivation: it is very useful to be able to tell whether a particular
kernel splat indicates a kernel bug or merely an invalid user-space
program. In the former case one wants to notify kernel developers; in
the latter case notifying them is just noise. Even a kernel developer
may not know what to do with a WARNING in an unfamiliar subsystem. This
is especially critical for automated testing systems that may use
panic_on_warn and mail kernel developers.

A clear separation also serves as additional documentation:
is this a condition that must never occur because of
checks/logic elsewhere, or is it simply a check for invalid inputs
or unfortunate conditions?

Use of pr_err() for user messages also leads to better error messages.
"Something is wrong in file foo on line X" is not a particularly useful
message for an end user. pr_err() forces developers to write more
meaningful error messages for users.

As of now we are almost there. We are doing systematic kernel testing
with panic_on_warn and are not seeing massive amounts of false positives.
But every now and then another WARN on ENOMEM or invalid inputs pops up
and leads to a lengthy argument each time. The goal of this change
is to officially document the rules.

Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
---
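Note (not part of the commit message): to make the intended split
concrete, here is a minimal user-space sketch. Everything in it is an
assumption made for illustration: struct chan, chan_set_mtu() and
chan_put() are hypothetical, and WARN_ON()/pr_err() are simplified
stand-ins that only count calls and print to stderr, not the kernel
implementations. It shows an invalid external input handled with a
descriptive pr_err(), and a broken internal invariant handled with a
recoverable WARN_ON().

```c
#include <errno.h>
#include <stdio.h>

/* Counters so the two kinds of report can be told apart (illustration only). */
static int warn_count;	/* reports that would mean "kernel bug" */
static int err_count;	/* plain error messages about bad input */

/* Simplified stand-in for the kernel's WARN_ON(); an assumption, not the real macro. */
static int warn_on(int cond, const char *file, int line)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	}
	return cond;
}
#define WARN_ON(cond) warn_on(!!(cond), __FILE__, __LINE__)

/* Simplified stand-in for the kernel's pr_err(); also an assumption. */
#define pr_err(...) (err_count++, fprintf(stderr, __VA_ARGS__))

/* Hypothetical object used only for this sketch. */
struct chan {
	unsigned int refs;
	long mtu;
};

/*
 * 'mtu' is an external input (think: a system call argument), so a bad
 * value is not a kernel bug. Report it with a meaningful message and
 * return an error; do not WARN.
 */
static int chan_set_mtu(struct chan *c, long mtu)
{
	if (mtu < 68 || mtu > 65535) {
		pr_err("chan: invalid mtu %ld (must be 68..65535)\n", mtu);
		return -EINVAL;
	}
	c->mtu = mtu;
	return 0;
}

/*
 * 'refs' is maintained only by this code, so seeing zero here means the
 * code itself is wrong. That is a recoverable kernel issue: warn, refuse
 * to underflow, and carry on.
 */
static int chan_put(struct chan *c)
{
	if (WARN_ON(c->refs == 0))
		return -EINVAL;
	c->refs--;
	return 0;
}
```

With panic_on_warn semantics in mind, only the second chan_put() on an
object holding a single reference should bring the system down; the
invalid mtu should never do so.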
include/asm-generic/bug.h | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index a7613e1b0c87..20561a60db9c 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -75,9 +75,19 @@ struct bug_entry {

/*
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
- * significant issues that need prompt attention if they should ever
- * appear at runtime. Use the versions with printk format strings
- * to provide better diagnostics.
+ * significant kernel issues that need prompt attention if they should ever
+ * appear at runtime.
+ *
+ * Do not use these macros when checking for invalid external inputs
+ * (e.g. invalid system call arguments, or invalid data coming from
+ * network/devices), or for transient conditions like ENOMEM or EAGAIN.
+ * These macros should be used for recoverable kernel issues only.
+ * For invalid external inputs, transient conditions, etc., use
+ * pr_err[_once/_ratelimited]() followed by dump_stack(), if necessary.
+ * Do not manually include "BUG"/"WARNING" in format strings, so that these
+ * conditions remain distinguishable from kernel issues.
+ *
+ * Use the versions with printk format strings to provide better diagnostics.
*/
#ifndef __WARN_TAINT
extern __printf(3, 4)
--
2.18.0.rc1.244.gcf134e6275-goog