Re: [PATCH v3 1/2] HID: bigben: use spinlock to safely schedule workers
From: Benjamin Tissoires
Date: Mon Feb 13 2023 - 03:26:34 EST
On Sat, Feb 11, 2023 at 3:01 AM Hillf Danton <hdanton@xxxxxxxx> wrote:
>
> On Fri, 10 Feb 2023 15:11:26 +0100 Benjamin Tissoires <benjamin.tissoires@xxxxxxxxxx>
> > On Fri, Feb 10, 2023 at 2:24 PM Hillf Danton <hdanton@xxxxxxxx> wrote:
> > > On Thu, 09 Feb 2023 23:58:55 +0000 Pietro Borrello <borrello@xxxxxxxxxxxxxxxx>
> > > > Use spinlocks to deal with workers introducing a wrapper
> > > > bigben_schedule_work(), and several spinlock checks.
> > > > Otherwise, bigben_set_led() may schedule bigben->worker after the
> > > > structure has been freed, causing a use-after-free.
> > > >
> > > > Fixes: 4eb1b01de5b9 ("HID: hid-bigbenff: fix race condition for scheduled work during removal")
> > >
> > > Given the flag added in 4eb1b01de5b9 and the spinlock added in this
> > > patchset, devm_led_classdev_register() looks to not work for you.
> >
> > Actually, looking at the code now, it is clear that we need that lock.
> > The current code is happily changing the struct bigben_device from
> > multiple contexts, and pulls that without any barrier in the work
> > struct which should produce some interesting results :)
> >
> > And we can probably abuse that lock to prevent scheduling a new work
> > as it is done in hid-playstation.c
> >
> > I'll comment in the patch which parts need to be changed, because it
> > is true that this patch is definitely not mergeable as such and will
> > need another revision.
> >
> > >
> > > How about replacing the advanced devm_ method with the traditional plain
> > > pair of led_classdev_un/register(), with the flag mentioned cut off but
> > > without bothering to add another lock?
> > >
> >
> > As mentioned above, the lock is needed anyway, and will probably need
> > to be added in a separate patch.
> > Reverting to a non devm version of the led class would complexify the
> > driver for the error paths, and is probably not the best move IMO.
>
> After this patch,
>
> cpu 0 cpu 2
> --- ---
> bigben_remove()
> spin_lock_irqsave(&bigben->lock, flags);
> bigben->removed = true;
> spin_unlock_irqrestore(&bigben->lock, flags);
>
> spin_lock_irqsave(&bigben->lock, flags);
>
> what makes it safe for cpu2 to acquire lock after the removed flag is true?
The remove flag is just a way to prevent any other workqueue from
starting. It doesn't mean that the struct bigben has been freed, so
acquiring a lock at that point is fine.
We then rely on 2 things:
- devm_class_led to be released before struct bigben, because it was
devm-allocated *after* the struct bigben was devm-allocated
- we prevent any new workqueue to start and we guarantee that any
running workqueue is terminated before leaving the .remove function.
Given that the ledclass is gracefully shutting down all of its
potential queues, we don't have any other possibility to have an
unsafe call AFAIU.
Cheers,
Benjamin