Re: [RFC PATCH] fpga: remove module reference counting from core components
From: Xu Yilun
Date: Sat Nov 18 2023 - 07:01:08 EST
On Fri, Nov 17, 2023 at 05:06:16PM -0500, Greg Kroah-Hartman wrote:
> On Fri, Nov 17, 2023 at 10:58:59PM +0100, Marco Pagani wrote:
> >
> >
> > On 2023-11-14 07:53, Xu Yilun wrote:
> > > On Fri, Nov 10, 2023 at 11:58:37PM +0100, Marco Pagani wrote:
> > >>
> > >>
> > >> On 2023-11-08 16:52, Xu Yilun wrote:
> > >>> On Fri, Nov 03, 2023 at 09:31:02PM +0100, Marco Pagani wrote:
> > >>>>
> > >>>>
> > >>>> On 2023-10-30 09:32, Xu Yilun wrote:
> > >>>>> On Fri, Oct 27, 2023 at 05:29:27PM +0200, Marco Pagani wrote:
> > >>>>>> Remove unnecessary module reference counting from the core components
> > >>>>>> of the subsystem. Low-level driver modules cannot be removed before
> > >>>>>> core modules since they use their exported symbols.
> > >>>>>
> > >>>>> Could you help show the code for this conclusion?
> > >>>>>
> > >>>>> This is different from what I remember, a module cannot be removed when
> > >>>>> its exported symbols are being used by other modules. IOW, the core
> > >>>>> modules cannot be removed when there exist related low-level driver
> > >>>>> modules. But the low-level driver modules could be removed freely
> > >>>>> without other protecting mechanism.
> > >>>>>
> > >>>>
> > >>>> My understanding was that we wanted to remove module reference counting
> > >>>> from the fpga core and ease it from the responsibility of preventing
> > >>>> low-level driver modules from being unloaded.
> > >>>
> > >>> FPGA core needs to prevent low-level driver module unloading sometimes,
> > >>> e.g. when region reprograming is in progress. That's why we get fpga
> > >>> region driver modules & bridge modules in fpga_region_program_fpga().
> > >>>
> > >>> But we try best to get them only necessary. Blindly geting them all the
> > >>> time results in no way to unload all modules (core & low level modules).
> > >>>
> > >>>>
> > >>>> If we want to keep reference counting in the fpga core, we could add a
> > >>>> struct module *owner field in the struct fpga_manager_ops (and others
> > >>>> core *_ops) so that the low-level driver can set it to THIS_MODULE.
> > >>>> In this way, we can later use it in fpga_mgr_register() to bump up the
> > >>>
> > >>> Yes, we should pass the module owner in fpga_mgr_register(), but could
> > >>> not bump up its refcount at once.
> > >>>
> > >>>> refcount of the low-level driver module by calling
> > >>>> try_module_get(mgr->mops->owner) directly when it registers the manager.
> > >>>> Finally, fpga_mgr_unregister() would call module_put(mgr->mops->owner)
> > >>>> to allow unloading the low-level driver module.
> > >>>
> > >>> As mentioned above, that makes problem. Most of the low level driver
> > >>> modules call fpga_mgr_unregister() on module_exit(), but bumping up
> > >>> their module refcount prevents module_exit() been executed. That came
> > >>> out to be a dead lock.
> > >>>
> > >>
> > >> Initially, I considered calling try_module_get(mgr->mops->owner)
> > >> in fpga_mgr_get(). But then, the new kernel-doc description of
> > >> try_module_get() (1) made me question the safety of that approach.
> > >> My concern is that the low-level driver could be removed right when
> > >> someone is calling fpga_mgr_get() and hasn't yet reached
> > >> try_module_get(mgr->mops->owner). In that case, the struct mops
> > >> (along with the entire low-level driver module) and the manager dev
> > >> would "disappear" under the feet of fpga_mgr_get().
> > >
> > > I don't get what's the problem. fpga_mgr_get() would first of all
> > > look for mgr_dev via class_find_device(), if low-level module is
> > > unloaded, then you cannot find the mgr_dev and gracefully error out.
> > >
> > > If class_find_device() succeed, mgr_dev got a reference and won't
> > > disappear. Finally we may still found module removed when
> > > try_module_get(), but should be another graceful error out.
> > >
> > > Am I missing anything?
> > >
> >
> > My concern is: suppose that you successfully got the mgr dev from
> > class_find_device(), and now you are in __fpga_mgr_get(), right before
> > try_module_get(mgr->mops->owner). At that point, you get descheduled,
> > and while you are not running, someone unloads the low-level driver
> > module that ends its life by calling fpga_mgr_unregister(). When you
> > wake up, you find yourself with a reference to a device that does not
> > exist anymore, trying to get a module that does not exist anymore
I may get the problem. The mgr device is still exists, but the module
is removed, so the mgr->mops & owner pointers are invalid..
> > through one of its symbols (module *owner in mops).
>
> Then the user gets to keep the multiple pieces that their kernel is now
> in :)
>
> Seriously, as module unload can never happen except by explicit ask,
> this should only possibly be an issue that a developer who is working on
> the code would hit, so don't work too hard to resolve something that
> isn't anything an actual user can hit.
>
> > Greg suggested checking if this can really happen and eventually
> > protecting fpga_mgr_get() and fpga_mgr_unregister() with a lock for
> > mops (if I understood correctly). In that case, considering the same
> > scenario described above:
> >
> > fpga_mgr_get() gets the mops lock and the mgr dev but is suspended
> > before calling try_module_get().
> >
> > Someone unloads the low-level driver, delete_modules progresses
> > (the module's recount hasn't yet been incremented) but blocks while
> > calling fpga_mgr_unregister() since fpga_mgr_get() is holding the lock.
> >
> > fpga_mgr_get() resumes and tries to get the module through one of its
> > symbols (mgr->mops->owner). The module's memory hasn't yet been freed
> > (delete_modules is blocked), and the refcount is zero, so
> > try_module_get() fails safely, and we can put the mgr dev that is
> > still present since fpga_mgr_unregister() is blocked.
> >
> > fpga_mgr_unregister() resumes and unregisters the mgr dev.
>
> That seems a bit reasonable, try it and see!
It also looks good to me.
Thanks,
Yilun
>
> > I'm still thinking about the possible implications. On the one hand,
> > it looks safe in this case, but on the other hand, it feels brittle.
> > In my understanding, the root problem is that there will always be a
> > critical window (when you have taken a reference to the device but
> > not yet to the low-level driver module) when unloading the module
> > could be potentially unsafe depending on the current implementation
> > and the preemption model.
> >
> > I still feel that it would be simpler and safer if we could bump
> > up the refcount during fpga_mgr_register() and maybe have a sysfs
> > attribute to unlock the low-level driver (if no one has taken the
> > mgr dev refcount).
>
> Ick, no, that shouldn't be needed.
>
> > That way, it would be safer by design since the
> > refcount will be bumped up right during the module load procedure,
> > and we could guarantee that the lifetime of the mgr device is
> > entirely contained in the lifetime of the low-level driver module.
>
> Remember, there are two different things here, code and data. Trying to
> tie one ref count to the other is almost always going to cause problems,
> try to keep them independent if at all possible.
>
> Or better yet, just don't use module reference counts at all and
> properly drop the device when the specific module is unloaded, like
> network drivers do. That might take more work to restructure things,
> which might be useless work given that again, this is something that no
> user will ever hit, only developers if at all.
>
> thanks,
>
> greg k-h