[PATCH 0/2] acct: don't allow access to internal filesystems

From: Christian Brauner
Date: Tue Feb 11 2025 - 12:16:45 EST


In [1] it was reported that the acct(2) system call can be used to
trigger a NULL deref in cases where it is set to write to a file that
triggers an internal lookup.

This can happen, e.g., when pointing acct(2) at /sys/power/resume. At the
point where the write to this file happens the calling task has
already exited and called exit_fs(), but an internal lookup might be
triggered through lookup_bdev(). This may trigger a NULL deref
when accessing current->fs.
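
The ordering problem can be modeled in userspace. This is only a minimal
sketch, not kernel code: struct fs_model, struct task_model,
model_exit_fs() and model_lookup() are hypothetical stand-ins for
fs_struct, task_struct, exit_fs() and a lookup that consults current->fs.
The model returns an error at the point where an unguarded kernel path
would dereference NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's fs_struct and task_struct. */
struct fs_model {
	const char *root;	/* models the task's root directory */
};

struct task_model {
	struct fs_model *fs;	/* models current->fs */
};

/* Models exit_fs(): the exiting task drops its filesystem context. */
static void model_exit_fs(struct task_model *t)
{
	assert(t != NULL);
	t->fs = NULL;
}

/*
 * Models a lookup that needs the task's filesystem context.
 * Returns 0 on success and -1 where an unguarded lookup would
 * dereference a NULL fs pointer.
 */
static int model_lookup(const struct task_model *t)
{
	assert(t != NULL);
	if (t->fs == NULL)
		return -1;
	return t->fs->root[0] == '/' ? 0 : -1;
}
```

In this model a lookup succeeds while the task still has its fs context
and fails once model_exit_fs() has run, which mirrors the late write
described above.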

This series does two things:

- Reorganize the code so that the final write happens from the
workqueue but with the caller's credentials. This preserves the
(strange) permission model and has almost no regression risk.

- Block access to kernel internal filesystems as well as procfs and
sysfs in the first place.
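
As a rough userspace analogue of the second point, a caller can check
whether a path lives on procfs or sysfs before handing it to acct(2).
This is a sketch only and not the check the series adds, which operates
on the file's superblock inside kernel/acct.c. The MODEL_* names and both
helper functions are hypothetical; the magic values are the ones defined
in include/uapi/linux/magic.h. Kernel internal filesystems that are not
reachable through a path are outside the scope of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/vfs.h>

/* Same values as PROC_SUPER_MAGIC and SYSFS_MAGIC in linux/magic.h. */
#define MODEL_PROC_SUPER_MAGIC	0x9fa0L
#define MODEL_SYSFS_MAGIC	0x62656572L

/* True if the filesystem type is one this sketch refuses to write to. */
static bool model_fs_type_blocked(long f_type)
{
	return f_type == MODEL_PROC_SUPER_MAGIC ||
	       f_type == MODEL_SYSFS_MAGIC;
}

/*
 * Returns 1 if path is on procfs or sysfs, 0 if it is not,
 * and -1 if statfs(2) fails (errno is left set by statfs).
 */
static int model_path_blocked(const char *path)
{
	struct statfs sfs;

	assert(path != NULL);
	if (statfs(path, &sfs) != 0)
		return -1;
	return model_fs_type_blocked((long)sfs.f_type) ? 1 : 0;
}
```

On a system with sysfs mounted at /sys, model_path_blocked() would be
expected to reject /sys/power/resume, the path from the report, while
accepting a file on a regular disk filesystem.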

This API should cease to exist, IMHO.

Link: https://lore.kernel.org/r/20250127091811.3183623-1-quzicheng@xxxxxxxxxx [1]

Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
---
Christian Brauner (2):
      acct: perform last write from workqueue
      acct: block access to kernel internal filesystems

 kernel/acct.c | 134 ++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 84 insertions(+), 50 deletions(-)
---
base-commit: af69e27b3c8240f7889b6c457d71084458984d8e
change-id: 20250211-work-acct-a6d8e92a5fe0