Re: [BUG] ALSA: core: possible deadlock involving waiting and locking operations

From: Takashi Iwai
Date: Sat Jan 29 2022 - 03:20:29 EST


On Sat, 29 Jan 2022 09:07:05 +0100,
Jia-Ju Bai wrote:
>
>
>
> On 2022/1/29 12:27, Takashi Sakamoto wrote:
> > Hi,
> >
> > On Sat, Jan 29, 2022 at 11:33:26AM +0800, Jia-Ju Bai wrote:
> >> Hello,
> >>
> >> My static analysis tool reports a possible deadlock in the sound driver
> >> in Linux 5.10:
> >>
> >> snd_card_disconnect_sync()
> >>   spin_lock_irq(&card->files_lock); --> Line 461 (Lock A)
> >>   wait_event_lock_irq(card->remove_sleep, ...); --> Line 462 (Wait X)
> >>   spin_unlock_irq(&card->files_lock); --> Line 465 (Unlock A)
> >>
> >> snd_hwdep_release()
> >>   mutex_lock(&hw->open_mutex); --> Line 152 (Lock B)
> >>   mutex_unlock(&hw->open_mutex); --> Line 157 (Unlock B)
> >>   snd_card_file_remove()
> >>     wake_up_all(&card->remove_sleep); --> Line 976 (Wake X)
> >>
> >> snd_hwdep_open()
> >>   mutex_lock(&hw->open_mutex); --> Line 95 (Lock B)
> >>   snd_card_file_add()
> >>     spin_lock(&card->files_lock); --> Line 932 (Lock A)
> >>     spin_unlock(&card->files_lock); --> Line 940 (Unlock A)
> >>   mutex_unlock(&hw->open_mutex); --> Line 139 (Unlock B)
> >>
> >> When snd_card_disconnect_sync() is executed, "Wait X" is performed while
> >> holding "Lock A". If snd_hwdep_open() is executed at this time, it holds
> >> "Lock B" and then waits to acquire "Lock A". If snd_hwdep_release()
> >> is executed at this time, it waits to acquire "Lock B", and thus
> >> "Wake X" cannot be performed to wake up "Wait X" in
> >> snd_card_disconnect_sync(), causing a possible deadlock.
> >>
> >> I am not quite sure whether this possible problem is real and how to fix
> >> it if it is real.
> >> Any feedback would be appreciated, thanks :)
> > I'm interested in your report about the deadlock, and looked into the
> > cause of the issue. Then I realized that we should take into account that
> > the file_operations are replaced before the spinlock is acquired in
> > snd_card_disconnect_sync().
> >
> > ```
> > snd_card_disconnect_sync()
> > ->snd_card_disconnect()
> > ->spin_lock()
> > ->list_for_each_entry()
> > mfile->file->f_op = snd_shutdown_f_ops
> > ->spin_unlock()
> > ->spin_lock_irq()
> > ->wait_event_lock_irq()
> > ->spin_unlock_irq()
> > ```
> >
> > The implementation of snd_shutdown_f_ops has no value for .open, therefore
> > snd_hwdep_open() is not called anymore while waiting for the event. The
> > mutex (Lock B) is not acquired in the process context of the ALSA hwdep
> > application.
> >
> > The original .release function can be called by snd_disconnect_release()
> > via replaced snd_shutdown_f_ops. In the case, as you can see, the spinlock
> > (Lock A) is not acquired.
> >
> > I think there are no race conditions involving Lock A and B in the process
> > context of the ALSA hwdep application after card disconnection. But I may
> > have overlooked other cases. I would be glad if you could check the above
> > procedure.
>
> Thanks a lot for the quick reply :)
> Your explanation is reasonable, because snd_shutdown_f_ops indeed has
> no value for .open.
>
> However, my static analysis tool finds another possible deadlock in
> the mentioned code:
>
> snd_card_disconnect_sync()
>   spin_lock_irq(&card->files_lock); --> Line 461 (Lock A)
>   wait_event_lock_irq(card->remove_sleep, ...); --> Line 462 (Wait X)
>   spin_unlock_irq(&card->files_lock); --> Line 465 (Unlock A)
>
> snd_hwdep_release()
>   snd_card_file_remove()
>     spin_lock(&card->files_lock); --> Line 962 (Lock A)
>     wake_up_all(&card->remove_sleep); --> Line 976 (Wake X)
>     spin_unlock(&card->files_lock); --> Line 977 (Unlock A)
>
> When snd_card_disconnect_sync() is executed, "Wait X" is performed while
> holding "Lock A".

No, it's wait_event_lock_irq(), and this helper unlocks the given lock
while waiting and re-locks it after schedule(). See the macro
expansion in include/linux/wait.h.
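
The same pattern can be sketched in userspace as a rough analogy, not as
kernel code: pthread_cond_wait() also drops the given mutex while sleeping
and re-takes it before returning, so the waker can acquire the lock. The
names files_lock, remove_sleep, open_files, file_remove() and run_demo()
below are hypothetical stand-ins chosen for illustration; they are not the
ALSA implementation.

```c
#include <pthread.h>

/* Hypothetical stand-ins for card->files_lock, card->remove_sleep and
 * the number of files still open on the card. */
static pthread_mutex_t files_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t remove_sleep = PTHREAD_COND_INITIALIZER;
static int open_files;

/* Plays the role of snd_card_file_remove(): take the lock, update the
 * state, wake the waiter, drop the lock. */
static void *file_remove(void *arg)
{
	(void)arg;
	/* This succeeds only because the waiter released files_lock
	 * inside pthread_cond_wait(). */
	pthread_mutex_lock(&files_lock);
	open_files--;
	pthread_cond_broadcast(&remove_sleep);
	pthread_mutex_unlock(&files_lock);
	return NULL;
}

/* Plays the role of snd_card_disconnect_sync(). Returns the number of
 * files still open after the wait (0 when it worked), or -1 if the
 * helper thread could not be created. */
int run_demo(void)
{
	pthread_t releaser;
	int remaining;

	pthread_mutex_lock(&files_lock);
	open_files = 1;

	/* The releaser starts while we still hold the lock, so it blocks
	 * in pthread_mutex_lock() until we begin waiting. */
	if (pthread_create(&releaser, NULL, file_remove, NULL) != 0) {
		pthread_mutex_unlock(&files_lock);
		return -1;
	}

	/* The lock is released for the duration of each sleep and is
	 * held again whenever the condition is re-checked. */
	while (open_files > 0)
		pthread_cond_wait(&remove_sleep, &files_lock);

	remaining = open_files;
	pthread_mutex_unlock(&files_lock);

	pthread_join(releaser, NULL);
	return remaining;
}
```

If the wait kept files_lock held while sleeping, file_remove() could never
take the lock and run_demo() would hang; because the lock is dropped
during the sleep, the sketch is expected to finish (build with -pthread
where the toolchain requires it).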


Takashi