Re: [PATCH] char: misc: make misc_open() and misc_register() killable
From: Oliver Neukum
Date: Wed Jul 06 2022 - 08:17:47 EST
On 06.07.22 12:26, Tetsuo Handa wrote:
> wait_for_device_probe() in snapshot_open() was added by commit c751085943362143
> ("PM/Hibernate: Wait for SCSI devices scan to complete during resume"), and
> that commit did not take into account possibility of unresponsive hardware.
>
> "In addition, if the resume from hibernation is userland-driven, it's
> better to wait for all device probes in the kernel to complete before
> attempting to open the resume device."
>
>
Tetsuo-san,
I am afraid my first reply was too curt to be useful. Sorry for that.
First let me congratulate you for finding and analyzing an important
issue.
Yet, I am afraid while your analysis is good, your attempt at a fix
suffers from being too close to the analysis, instead of taking a step
back and looking at root causes.
Frankly I was afraid you'd look at UAS next and try to fix it in the
same way. And that is the core of the issue. If the SCSI layer can be
made to hang a host controller by an unresponsive device, the issue
is in the SCSI layer. If you were to insist on your current approach
you'd have to go through every host controller driver. You are seeing
this only with storage because you are fuzzing USB, not SCSI.
But the bug you found is more fundamental than a single bus system.
The SCSI layer is just designed in such a way that timeouts are handled
by the core. That is a fundamental design decision you cannot easily
deviate from. Hence I would like to ask you to take a closer look
at the scanning code in the SCSI layer, not a host controller driver.
Regards
Oliver