[PATCH] core_pattern: add CPU specifier

From: Oleksandr Natalenko
Date: Sat Sep 03 2022 - 02:43:43 EST


Statistically, in a large deployment, recurring segfaults may indicate a CPU issue.

Currently, it is not possible to find out which CPU a segfault happened on.
There have been at least two attempts to improve segfault logging in this regard,
but they do not help once the logs have rotated.

Hence, let's make it possible to permanently record the CPU
the task ran on by using a new core_pattern specifier.

Suggested-by: Renaud Métrich <rmetrich@xxxxxxxxxx>
Signed-off-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxx>
---
Documentation/admin-guide/sysctl/kernel.rst | 1 +
fs/coredump.c | 5 +++++
include/linux/coredump.h | 1 +
3 files changed, 7 insertions(+)
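
Note (illustration only, not part of the change): with this patch applied,
a pattern such as "core.%p.cpu%C" written to /proc/sys/kernel/core_pattern
should yield core file names that carry both the PID and the CPU number.
The userspace sketch below mimics that expansion for a small subset of
specifiers (%%, %p, %C). The function name expand_core_pattern() and its
signature are hypothetical and exist only for this example; the kernel
does the real work in format_corename() via cn_printf().

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace illustration only (hypothetical helper, not kernel code):
 * expand a core_pattern-like template into out[].
 * Handles %% (literal '%'), %p (PID) and %C (CPU). Any other specifier
 * is dropped, following the documented "%<OTHER> both are dropped" rule,
 * and a lone trailing '%' is dropped as well.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int expand_core_pattern(const char *pat, int pid, int cpu,
                               char *out, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    out[0] = '\0';

    while (*pat) {
        size_t room = size - used;
        int n;

        if (*pat != '%') {
            /* Ordinary character: copy it through unchanged. */
            n = snprintf(out + used, room, "%c", *pat++);
        } else {
            pat++;                      /* skip the '%' */
            switch (*pat) {
            case '\0':                  /* trailing '%': dropped */
                return 0;
            case '%':
                n = snprintf(out + used, room, "%%");
                break;
            case 'p':
                n = snprintf(out + used, room, "%d", pid);
                break;
            case 'C':                   /* the specifier added here */
                n = snprintf(out + used, room, "%d", cpu);
                break;
            default:                    /* unknown specifier: dropped */
                n = 0;
                break;
            }
            pat++;                      /* skip the specifier letter */
        }

        /* snprintf() reports truncation by returning >= room. */
        if (n < 0 || (size_t)n >= room)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

For example, expanding "core.%p.cpu%C" with pid 1234 and cpu 7 in this
sketch produces "core.1234.cpu7", so the CPU stays attached to the core
file even after the logs have rotated.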

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 835c8844bba48..b566fff04946b 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -169,6 +169,7 @@ core_pattern
%f executable filename
%E executable path
%c maximum size of core file by resource limit RLIMIT_CORE
+ %C CPU the task ran on
%<OTHER> both are dropped
======== ==========================================

diff --git a/fs/coredump.c b/fs/coredump.c
index a8661874ac5b6..166d1f84a9b17 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -325,6 +325,10 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
err = cn_printf(cn, "%lu",
rlimit(RLIMIT_CORE));
break;
+ /* CPU the task ran on */
+ case 'C':
+ err = cn_printf(cn, "%d", cprm->cpu);
+ break;
default:
break;
}
@@ -535,6 +539,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
*/
.mm_flags = mm->flags,
.vma_meta = NULL,
+ .cpu = raw_smp_processor_id(),
};

audit_core_dumps(siginfo->si_signo);
diff --git a/include/linux/coredump.h b/include/linux/coredump.h
index 08a1d3e7e46d0..191dcf5af6cb9 100644
--- a/include/linux/coredump.h
+++ b/include/linux/coredump.h
@@ -22,6 +22,7 @@ struct coredump_params {
struct file *file;
unsigned long limit;
unsigned long mm_flags;
+ int cpu;
loff_t written;
loff_t pos;
loff_t to_skip;
--
2.37.2