Re: [REPORT] syscall reboot + umh + firmware fallback

From: Byungchul Park
Date: Thu May 12 2022 - 07:21:58 EST


Tejun wrote:
> Hello,

Hello,

> I'm not sure I'm reading it correctly but it looks like "process B" column

I think you're interpreting the report correctly.

> is superfluous given that it's waiting on the same lock to do the same thing
> that A is already doing (besides, you can't really halt the machine twice).

Indeed! I've been in a daze. I thought kernel_halt() could be called twice
for two different purposes. Sorry for the noise.

> What it's reporting seems to be ABBA deadlock between A waiting on
> umhelper_sem and C waiting on fw_st->completion. The report seems spurious:
>
> 1. wait_for_completion_killable_timeout() doesn't need someone to wake it up
> to make forward progress because it will unstick itself after timeout
> expires.

I have a question about this one. Yes, it would never get stuck, thanks
to the timeout. However, IIUC, timeouts are not supposed to expire in
normal cases. So I thought a timeout expiring indicates an abnormal case,
which should be reported in terms of dependency so as to prevent further
expirations. That's why I have been trying to track even the timeout APIs.

Do you think DEPT shouldn't track timeout APIs? If I'm wrong about this,
I should stop tracking them.

> 2. complete_all() from __fw_load_abort() isn't the only source of wakeup.
> The fw loader can be, and mainly should be, woken up by firmware loading
> actually completing instead of being aborted.

This is the point I'd like to ask about. In normal cases, fw_load_done()
would be called, of course, once the loading completes. However, I was
wondering whether the kernel ensures that either fw_load_done() or
fw_load_abort() is called by *another* context while kernel_halt() is in
progress.

> Thanks.

Thank you very much!

Byungchul

>
> --
> tejun
>