[PATCH v2 0/2] Introduce the pkill_on_warn parameter

From: Alexander Popov
Date: Wed Oct 27 2021 - 19:32:36 EST

Hello! This is v2 of the pkill_on_warn patch series.
Changes from v1 and tricks for testing are described below.

Rationale
=========

Currently, the Linux kernel provides two types of reaction to kernel
warnings:
1. Do nothing (by default),
2. Call panic() if panic_on_warn is set. That's a very strong reaction,
so panic_on_warn is usually disabled on production systems.

From a safety point of view, the Linux kernel lacks a middle way of
handling kernel warnings:
- The kernel should stop the activity that provokes a warning,
- But the kernel should avoid complete denial of service.

From a security point of view, kernel warning messages provide a lot of
useful information for attackers. Many GNU/Linux distributions allow
unprivileged users to read the kernel log, so attackers can use the
information leaked by kernel warnings in vulnerability exploits. See these
examples:
https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

Let's introduce the pkill_on_warn sysctl.
If this parameter is set, the kernel kills all threads in the process that
provoked a kernel warning. This behavior is reasonable from the safety point
of view described above. It is also useful for kernel security hardening
because the system kills an exploit process that hits a kernel warning.
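
Before running the tests, it can help to check whether the parameter is set.
Below is a minimal userspace sketch, not part of the series. It assumes the
sysctl is exposed at /proc/sys/kernel/pkill_on_warn (the usual procfs
location for a kernel.* sysctl); read_sysctl_int() and PKILL_ON_WARN_PATH
are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed procfs path for the kernel.pkill_on_warn sysctl. */
#define PKILL_ON_WARN_PATH "/proc/sys/kernel/pkill_on_warn"

/*
 * Read a single integer from a sysctl-style file.
 * Returns 0 and stores the value in *out on success, -1 on any failure
 * (for example when the running kernel does not carry this series).
 */
static int read_sysctl_int(const char *path, int *out)
{
	FILE *f;
	int ok;

	assert(path != NULL && out != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* The file is expected to hold one decimal number and a newline. */
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);

	return ok ? 0 : -1;
}
```

Calling read_sysctl_int(PKILL_ON_WARN_PATH, &value) on a patched kernel
should yield 0 or 1 in value. A return of -1 most likely means that the
sysctl does not exist on the running kernel.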

Moreover, bugs usually don't come alone, and a kernel warning may be
followed by memory corruption or other bad effects. So pkill_on_warn allows
the kernel to stop the process when the first signs of wrong behavior
are detected.

Changes from v1
===============

1) Introduce do_pkill_on_warn() and call it in all warning handling paths.

2) Do refactoring without functional changes in a separate patch.

3) Avoid killing init and kthreads.

4) Use do_send_sig_info() instead of do_group_exit().

5) Introduce sysctl instead of using core_param().
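
To illustrate items 3 and 4, here is a userspace model of the decision that
the new warning action makes. It is only a sketch under stated assumptions:
struct task_model and model_should_pkill() are hypothetical names, and the
real do_pkill_on_warn() in patch 2/2 may check more conditions than the two
exceptions listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task properties the check looks at. */
struct task_model {
	int tgid;		/* thread group (process) id */
	bool is_kthread;	/* models a kernel thread */
	bool is_init;		/* models the global init process */
};

/*
 * Returns true when the model would kill the whole thread group of the
 * task that hit the warning, false when the warning is only reported.
 */
static bool model_should_pkill(bool pkill_on_warn, const struct task_model *t)
{
	assert(t != NULL);

	/* Parameter not set: keep the default "do nothing" reaction. */
	if (!pkill_on_warn)
		return false;

	/* Item 3 of the changelog: never kill init or kthreads. */
	if (t->is_kthread || t->is_init)
		return false;

	return true;
}
```

In this model only an ordinary userspace process is killed, and only when
the parameter is set. A kthread or init hitting the same warning is left
running.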

Tricks for testing
==================

1) This patch series was tested on x86_64 using CONFIG_LKDTM.
The kernel kills a process that performs this:
echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT

2) The warn_slowpath_fmt() path was tested using this trick, which hides the
x86 __WARN_FLAGS() definition so that the generic fallback is used:
diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 84b87538a15d..3106c203ebb6 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -73,7 +73,7 @@ do { \
  * were to trigger, we'd rather wreck the machine in an attempt to get the
  * message out than not know about it.
  */
-#define __WARN_FLAGS(flags) \
+#define ___WARN_FLAGS(flags) \
 do { \
 	instrumentation_begin(); \
 	_BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags)); \

3) Testing pkill_on_warn with kthreads was done using this trick:
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bce848e50512..13c56f472681 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2133,6 +2133,8 @@ static int __noreturn rcu_gp_kthread(void *unused)
 		WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANUP);
 		rcu_gp_cleanup();
 		WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANED);
+
+		WARN_ONCE(1, "hello from kthread\n");
 	}
 }

4) Changing drivers/misc/lkdtm/bugs.c:lkdtm_WARNING() allowed me
to test all warning flavours:
- WARN_ON()
- WARN()
- WARN_TAINT()
- WARN_ON_ONCE()
- WARN_ONCE()
- WARN_TAINT_ONCE()

Thanks!

Alexander Popov (2):
  bug: do refactoring allowing to add a warning handling action
  sysctl: introduce kernel.pkill_on_warn

 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++
 include/asm-generic/bug.h                   | 37 +++++++++++++++------
 include/linux/panic.h                       |  3 ++
 kernel/panic.c                              | 22 +++++++++++-
 kernel/sysctl.c                             |  9 +++++
 lib/bug.c                                   | 22 ++++++++----
 6 files changed, 90 insertions(+), 17 deletions(-)

--
2.31.1
