Re: [PATCH] sysrq: add optional logging of caller info on /proc/sysrq-trigger write
From: Xiang Gao
Date: Mon Apr 20 2026 - 08:11:11 EST
On Sat, Apr 18, 2026 at 09:20:14AM +0200, Greg Kroah-Hartman wrote:
> We really do not want or like #ifdef in .c files, and for stuff like
> this, it is not needed at all.
[...]
> Module parameters are really not the way for stuff like this. And why
> would such a "tiny" option need a config option at all? If you don't
> use/need it, it's only a single bool being used?
[...]
> n is always the default, no need to add it again.
Agreed on all of these. The #ifdef, the module_param, the Kconfig
entry and the 'default n' all go away -- the feature will be
unconditional.
Before respinning I would like to align on the approach, because
your comments on the kernel log and the ancestry walk touch the
core of what this patch should actually be. Let me describe the
real use case first, since I do not think I made it clear in v1.
Use case
- Userspace triggers are typically several processes deep:
some service -> library -> system("echo c > /proc/sysrq-trigger")
-> sh. The immediate writer is "sh", the actually responsible
component is several levels up.
- The sysrq character we care about is almost always 'c', 'b' or
'o' -- the machine panics, reboots, or powers off as a direct
result of the write.
- *After* the device has come back up, we need to identify which
userspace component issued the write. Otherwise there is no way
to debug why devices in the field are rebooting themselves.
> The kernel log is not there for doing audits and the like, so is this
> just a debug option?
I agree that security audit belongs in the audit subsystem, not in
printk. What this patch is trying to do is post-mortem diagnostic
for crash/reboot triage, and audit_log() cannot deliver on that:
1. audit_log() is asynchronous -- the record is queued, then
delivered via kauditd -> netlink to a userspace subscriber.
For sysrq c/b/o the panic/reboot happens before kauditd gets
to run, so the record is never delivered.
2. In our deployment environment there is no userspace auditd
persisting records to disk, and audit records are not
preserved across reboot. The kernel ring buffer, via pstore /
ramoops, is the only channel whose contents survive the
reboot and are readable on the next boot. pr_info() is the
only mechanism that delivers into that persistent path.
I would frame this not as audit but as post-mortem diagnostic,
retitle the commit accordingly ("sysrq: log triggering task info
for post-mortem diagnostics") and drop the audit framing from the
changelog entirely.
> This might cause problems for when the system is hung and sysrq is the
> only way to reboot the box. Have you tried it in that situation?
Fair -- the while-loop walking 5 levels has to go. The reason it
was there in v1 is the "sh from system()/popen()" case above: the
writer's comm is "sh", and the information we actually want is one
or more levels up the parent chain.
On the hung-system concern specifically: /proc/sysrq-trigger only
runs when userspace is alive enough to complete a write(), so the
keyboard sysrq path via handle_sysrq() is not affected by this
change -- my patch only touches write_sysrq_trigger(). Within that
function, rcu_read_lock() / rcu_dereference() take no locks,
allocate nothing, and are bounded-constant work, so on paper they
should not introduce a new hang. I have not yet reproduced a
partially-hung system to verify this empirically; I will do so
before v2 and report back. Please correct me if I am misreading
the concern.
That said, two options I can see, would like your steer before I
respin:
(a) One bounded rcu_dereference() of current->real_parent.
Logs current's pid/comm plus ppid + parent comm. One
level, no loop. Covers the direct case
app -> system("echo c > /proc/sysrq-trigger") -> sh
where sh's parent is the app.
(b) Two unrolled rcu_dereference()s -- current->real_parent
and that task's real_parent. Still no loop, just two
fixed dereferences. Covers the common indirect case
service -> library -> system(...) -> sh
where sh's parent is the thin library wrapper and the
grandparent is the originating service, which is what we
actually need to identify in the field.
In both cases the RCU read is a fixed number of dereferences,
not iteration over a data structure, so the total work is
bounded and constant.
I lean toward (b): one level is not enough to reach the
originating service in the indirect pattern this patch was
written for. If two dereferences in the sysrq path are still
too much, I will fall back to (a).
Which direction would you accept? I will respin based on your
answer.
Thanks,
Xiang