Re: [RFC PATCH] kernel/module: add a safer implementation of try_module_get()

From: Luis Chamberlain
Date: Fri Feb 02 2024 - 13:31:59 EST


On Thu, Feb 01, 2024 at 03:27:54PM +0100, Marco Pagani wrote:
>
> On 2024-01-30 21:47, Luis Chamberlain wrote:
> >
> > It very much sounds like there is a desire to have this but without a
> > user, there is no justification.
>
> I was working on a set of patches to fix an issue in the fpga subsystem
> when I came across your commit 557aafac1153 ("kernel/module: add
> documentation for try_module_get()") that made me realize we also had a
> safety problem.
>
> To solve this problem for the fpga manager, we had to add a mutex to
> ensure the low-level module still exists before calling
> try_module_get(). However, having a safer version of try_module_get()
> would have simplified the code and made it more robust against changes.
>
> https://lore.kernel.org/linux-fpga/20240111160242.149265-1-marpagan@xxxxxxxxxx/
>
> I suspect there may be other cases where try_module_get() is
> inadvertently called without ensuring that the module still exists
> that may benefit from a safer implementation.

Maybe so; however, I'm not yet sure this is safe from deadlocks.
Please work on a series of simple selftest modules which demonstrate
its use, and a simple bash selftest loader script which verifies this
won't break. Consider that third-party modules may also race with
this, as may other users that don't use this new API.

> >> +bool try_module_get_safe(struct module *module)
> >> +{
> >> + struct module *mod;
> >> + bool ret = true;
> >> +
> >> + if (!module)
> >> + goto out;
> >> +
> >> + mutex_lock(&module_mutex);
> >
> > If a user comes around then this should be mutex_lock_interruptible(),
> > and add might_sleep()
>
> Would it be okay to return false if it gets interrupted, or should I
> change the return type to int to propagate -EINTR? My concern with
> changing the signature is that it would be less straightforward to
> use the function in place of try_module_get().

Since we want a safe mechanism, we might as well not offer a simple
drop-in replacement but a more robust one, so that users handle the
return value properly.
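
To make the return-value point concrete, here is a rough userspace
model. It is not kernel code and not the patch itself. Every model_*
name is hypothetical: a pthread mutex stands in for module_mutex, a
fixed array stands in for the module list, and the model_signal_pending
flag stands in for the pending signal that would make
mutex_lock_interruptible() return -EINTR. The pointer lookup is only an
assumption about how "the module still exists" would be verified.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_MAX 4

/* Hypothetical stand-in for struct module. */
struct model_module {
	int live;	/* 1 while loaded and not going away */
	int refcnt;
};

/* Stand-ins for the module list and module_mutex. */
struct model_module *model_list[MODEL_MAX];
pthread_mutex_t model_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Set to 1 to simulate a signal arriving while waiting for the lock. */
int model_signal_pending;

/* Models an interruptible lock: 0 on success, -EINTR if "interrupted". */
int model_lock_interruptible(void)
{
	if (model_signal_pending)
		return -EINTR;
	pthread_mutex_lock(&model_mutex);
	return 0;
}

/* Adds a module to the list: 0 on success, -ENOSPC if the list is full. */
int model_register(struct model_module *m)
{
	int i, ret = -ENOSPC;

	pthread_mutex_lock(&model_mutex);
	for (i = 0; i < MODEL_MAX; i++) {
		if (!model_list[i]) {
			model_list[i] = m;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&model_mutex);
	return ret;
}

/*
 * Returns 0 on success (NULL succeeds, as in the quoted patch),
 * -EINTR if interrupted, -ENOENT if the pointer is not a live,
 * registered module. The pointer is dereferenced only after it has
 * been found in the list.
 */
int model_try_get_safe(struct model_module *module)
{
	int i, ret;

	if (!module)
		return 0;

	ret = model_lock_interruptible();
	if (ret)
		return ret;

	ret = -ENOENT;
	for (i = 0; i < MODEL_MAX; i++) {
		if (model_list[i] == module && module->live) {
			module->refcnt++;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&model_mutex);
	return ret;
}

/* Drops a reference taken by model_try_get_safe(). */
void model_put(struct model_module *m)
{
	if (!m)
		return;
	pthread_mutex_lock(&model_mutex);
	assert(m->refcnt > 0);
	m->refcnt--;
	pthread_mutex_unlock(&model_mutex);
}
```

With an int return, a caller cannot treat the result as a plain
boolean; it has to tell "interrupted, maybe retry" (-EINTR) apart from
"module is gone" (-ENOENT), which is the kind of care a bool return
would hide.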

Luis