Re: [PATCH] module: print module name on refcount error
From: Luis Chamberlain
Date: Fri Jun 30 2023 - 19:05:43 EST
On Mon, Jun 26, 2023 at 12:32:52PM +0200, Jean Delvare wrote:
> If module_put() triggers a refcount error, include the culprit
> module name in the warning message, to ease further investigation of
> the issue.
>
> Signed-off-by: Jean Delvare <jdelvare@xxxxxxx>
> Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> ---
> kernel/module/main.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> --- linux-6.3.orig/kernel/module/main.c
> +++ linux-6.3/kernel/module/main.c
> @@ -850,7 +850,9 @@ void module_put(struct module *module)
>  	if (module) {
>  		preempt_disable();
>  		ret = atomic_dec_if_positive(&module->refcnt);
> -		WARN_ON(ret < 0); /* Failed to put refcount */
> +		WARN(ret < 0,
> +		     KERN_WARNING "Failed to put refcount for module %s\n",
> +		     module->name);
>  		trace_module_put(module, _RET_IP_);
>  		preempt_enable();
>  	}
>

struct module ends up being dynamically allocated. We first read the ELF
passed by userspace, and we end up allocating space for struct module
when processing the ELF section ".gnu.linkonce.this_module". We cache
that section's index in info->index.mod, and we finally copy the module
into the allocated space with move_module().

In linux-next this code is much clearer now.
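
To make the allocation path described above concrete, here is a rough
userspace sketch. Everything prefixed toy_ is hypothetical and heavily
simplified. It is not the kernel's load_info or struct module, only an
illustration of looking up ".gnu.linkonce.this_module" by name, caching
its index, and copying the contents into freshly allocated memory:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the kernel's real types. */
struct toy_module {
	char name[56];
	int refcnt;
};

struct toy_section {
	const char *name;	/* section name, e.g. ".text" */
	const void *data;	/* section contents inside the ELF image */
	size_t size;
};

struct toy_load_info {
	const struct toy_section *sechdrs;
	size_t num_sections;
	struct {
		size_t mod;	/* cached index of ".gnu.linkonce.this_module" */
	} index;
};

/* Return the index of the named section, or 0 if it is absent.
 * Index 0 is the reserved null section in ELF, so it is skipped. */
static size_t toy_find_sec(const struct toy_load_info *info, const char *name)
{
	assert(info && info->sechdrs);

	for (size_t i = 1; i < info->num_sections; i++)
		if (strcmp(info->sechdrs[i].name, name) == 0)
			return i;
	return 0;
}

/* Cache the section index, then copy the module struct out of the
 * ELF image into its own heap allocation. */
static struct toy_module *toy_layout_and_move(struct toy_load_info *info)
{
	const struct toy_section *sec;
	struct toy_module *mod;

	info->index.mod = toy_find_sec(info, ".gnu.linkonce.this_module");
	if (!info->index.mod)
		return NULL;

	sec = &info->sechdrs[info->index.mod];
	if (sec->size != sizeof(*mod))
		return NULL;

	mod = malloc(sizeof(*mod));
	if (!mod)
		return NULL;
	memcpy(mod, sec->data, sizeof(*mod));
	return mod;	/* caller frees; mod->name dies with the allocation */
}
```

Since the copy lives in allocated memory, any pointer into it (the name
included) is only valid until that memory is freed.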

What prevents us from racing to free the module and thus invalidating
the name?

For instance, userspace could hammer the delete_module() system call and
so have tons of threads racing in try_stop_module(). Eventually one of
them could win, and free_module() would kick into gear.

What prevents code from racing the free with a random module_put()
called by some other piece of code?
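
To show where the window would be, here is a userspace approximation of
the patched path using C11 atomics. The demo_ names are made up for
illustration, and demo_dec_if_positive() only mimics what I understand
atomic_dec_if_positive() to do (decrement unless the result would go
negative, and return the old value minus one either way). Treat it as a
sketch, not as the kernel implementation:

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for struct module; only the fields discussed. */
struct demo_module {
	atomic_int refcnt;
	char name[56];
};

/* Decrement *v only if the result stays >= 0.
 * Returns the old value minus one whether or not it decremented. */
static int demo_dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	while (old > 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			break;
	}
	return old - 1;
}

/* Same shape as the patched module_put(): returns 0 on success and
 * -1 on an unbalanced put. The read of mod->name happens after the
 * decrement, so it is only safe if nothing freed *mod in between. */
static int demo_module_put(struct demo_module *mod)
{
	int ret;

	if (!mod)
		return 0;

	ret = demo_dec_if_positive(&mod->refcnt);
	if (ret < 0) {
		fprintf(stderr, "Failed to put refcount for module %s\n",
			mod->name);
		return -1;
	}
	return 0;
}
```

In this single-threaded form nothing can go wrong. The concern is only
about another thread freeing the object between the decrement and the
read of the name.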

I realize this may imply that even the existing code is racy.

Luis