Re: [PATCH 2/2] module: add support to avoid duplicates early on load

From: Linus Torvalds
Date: Thu May 25 2023 - 17:13:35 EST


On Thu, May 25, 2023 at 11:45 AM Lucas De Marchi
<lucas.demarchi@xxxxxxxxx> wrote:
>
> Are you willig to merge (a possibly improved version of) your patch
> or the userspace change is still something that would be desired?

I think a user space change should still be something that people
should look at, particularly as the kernel side patch I'm willing to
accept doesn't catch the "completely serial" cases, only the "trying
to load at the same time that the same module is literally busy being
loaded".

But I've cleaned up my patch a bit, and while the cleaned-up version
is rather larger as a patch (mainly because of just also re-organizing
the finit_module() code to do all the 'struct file' prep), I'm
actually pretty happy with this attached patch conceptually.

In this form, it actually "makes sense" to me, rather than being just
clearly a workaround. Also, unlike the previous patch, this doesn't
actually make any changes to the basic kernel_read_file() set of
functions, it's all done by the module loading code itself.

Luis, would you mind testing this version on your load? It still won't
actually handle the purely serial case, so there *will* be those
spurious double module reads from different CPU's just doing the
things serially, but the exclusive file access region has been
extended to not just cover the actual file content reading, but to
cover the whole "turn it into a a real module" part too.

Also, this does *not* update some of the comments in the module
loading. I changed finit_module to use "kernel_read_file()" instead of
"kernel_read_file_from_fd()", since it actually now has to look up the
file descriptor anyway. But the comments still talk about that
"from_fd" thing.

Anyway, this is back to "ENTIRELY UNTESTED" territory, in that I've
compiled this, but haven't booted it. The changes look obvious, but
hey, mistakes happen.

And the commit message is just a place-holder. Obviously. I won't sign
off on this or write more of a commit message until it has had some
real testing.

Linus
From e3223bfbdd5455f0b11337a591fad3a1816b9d09 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Thu, 25 May 2023 09:32:25 -0700
Subject: [PATCH] Attempt at workaround for horrible udev module loading
behavior

---
include/linux/fs.h | 6 ++++
kernel/module/main.c | 72 ++++++++++++++++++++++++++++++--------------
2 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..722b42a77d51 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2566,6 +2566,12 @@ static inline int deny_write_access(struct file *file)
struct inode *inode = file_inode(file);
return atomic_dec_unless_positive(&inode->i_writecount) ? 0 : -ETXTBSY;
}
+static inline int exclusive_deny_write_access(struct file *file)
+{
+ int old = 0;
+ struct inode *inode = file_inode(file);
+ return atomic_try_cmpxchg(&inode->i_writecount, &old, -1) ? 0 : -ETXTBSY;
+}
static inline void put_write_access(struct inode * inode)
{
atomic_dec(&inode->i_writecount);
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 044aa2c9e3cb..b4c7e925fdb0 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -3057,11 +3057,53 @@ SYSCALL_DEFINE3(init_module, void __user *, umod,
return load_module(&info, uargs, 0);
}

-SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags)
+static int file_init_module(struct file *file, const char __user * uargs, int flags)
{
struct load_info info = { };
void *buf = NULL;
int len;
+
+ len = kernel_read_file(file, 0, &buf, INT_MAX, NULL,
+ READING_MODULE);
+ if (len < 0) {
+ mod_stat_inc(&failed_kreads);
+ mod_stat_add_long(len, &invalid_kread_bytes);
+ return len;
+ }
+
+ if (flags & MODULE_INIT_COMPRESSED_FILE) {
+ int err = module_decompress(&info, buf, len);
+ vfree(buf); /* compressed data is no longer needed */
+ if (err) {
+ mod_stat_inc(&failed_decompress);
+ mod_stat_add_long(len, &invalid_decompress_bytes);
+ return err;
+ }
+ } else {
+ info.hdr = buf;
+ info.len = len;
+ }
+
+ return load_module(&info, uargs, flags);
+}
+
+/*
+ * kernel_read_file() will already deny write access, but module
+ * loading wants _exclusive_ access to the file, so we do that
+ * here, along with basic sanity checks.
+ */
+static int prepare_file_for_module_load(struct file *file)
+{
+ if (!file || !(file->f_mode & FMODE_READ))
+ return -EBADF;
+ if (!S_ISREG(file_inode(file)->i_mode))
+ return -EINVAL;
+ return exclusive_deny_write_access(file);
+}
+
+SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags)
+{
+ struct fd f;
int err;

err = may_init_module();
@@ -3075,28 +3117,14 @@ SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags)
|MODULE_INIT_COMPRESSED_FILE))
return -EINVAL;

- len = kernel_read_file_from_fd(fd, 0, &buf, INT_MAX, NULL,
- READING_MODULE);
- if (len < 0) {
- mod_stat_inc(&failed_kreads);
- mod_stat_add_long(len, &invalid_kread_bytes);
- return len;
+ f = fdget(fd);
+ err = prepare_file_for_module_load(f.file);
+ if (!err) {
+ err = file_init_module(f.file, uargs, flags);
+ allow_write_access(f.file);
}
-
- if (flags & MODULE_INIT_COMPRESSED_FILE) {
- err = module_decompress(&info, buf, len);
- vfree(buf); /* compressed data is no longer needed */
- if (err) {
- mod_stat_inc(&failed_decompress);
- mod_stat_add_long(len, &invalid_decompress_bytes);
- return err;
- }
- } else {
- info.hdr = buf;
- info.len = len;
- }
-
- return load_module(&info, uargs, flags);
+ fdput(f);
+ return err;
}

/* Keep in sync with MODULE_FLAGS_BUF_SIZE !!! */
--
2.41.0.rc1.4.gb4dae75050