Re: problem booting 5.10

From: John Garry
Date: Tue Dec 08 2020 - 16:15:48 EST


On 08/12/2020 19:19, Linus Torvalds wrote:
> On Tue, Dec 8, 2020 at 10:59 AM Martin K. Petersen
> <martin.petersen@xxxxxxxxxx> wrote:
>>
>>> So I'm adding SCSI people to the cc, just in case they go "Hmm..".
>>
>> Only change in this department was:
>>
>> 831e3405c2a3 scsi: core: Don't start concurrent async scan on same host
>
> Yeah, I found that one too, and dismissed it for the same reason you
> did - it wasn't in rc1. Plus it looked very simple.
>
> That said, maybe Julia might have misspoken, and rc1 was ok, so I
> guess it's possible. The scan_mutex does show up in that "locks held"
> list, although I can't see why it would matter. But it does
> potentially change timing (so it could expose some existing race), if
> nothing else.
>
> But let's make sure Jens is aware of this too, in case it's some ATA
> issue. Not that any of those handful of 5.10 changes look remotely
> likely _either_.
>
> Jens, see
>
> https://lore.kernel.org/lkml/alpine.DEB.2.22.394.2012081813310.2680@hadrien/
>
> if you don't already have the lkml thread locally.. There's not enough
> of the dmesg to even really guess what Julia's actual hardware is,
> apart from it being a Seagate SATA disk. Julia? What controllers and
> disks do you have show up when things work?
>
> Linus
> .


JFYI, about "scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug": we already had a boot hang reported here by Qian:

https://lore.kernel.org/linux-scsi/fe3dff7dae4494e5a88caffbb4d877bbf472dceb.camel@xxxxxxxxxx/

And the solution to that specific problem is in:
https://lore.kernel.org/linux-block/20201203012638.543321-1-ming.lei@xxxxxxxxxx/

This issue may be related, so you could test by reverting that megaraid_sas commit or by setting the driver module param "host_tagset_enable=0", just to see.
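
For reference, a rough sketch of the two usual ways to pass that module param. This assumes the standard "module.param=value" kernel command line syntax and the standard modprobe "options" syntax; the modprobe.d file name is only an example, not something the driver requires:

```shell
# Option 1: append to the kernel command line (e.g. from the bootloader).
# Should apply whether megaraid_sas is built in or loaded as a module.
megaraid_sas.host_tagset_enable=0

# Option 2: modprobe configuration, for when the driver is a loadable module.
# Example file name (an assumption): /etc/modprobe.d/megaraid_sas.conf
options megaraid_sas host_tagset_enable=0
```

With option 2, if the driver is loaded from the initramfs, the initramfs probably needs regenerating before the setting takes effect. If the param is exported in sysfs, the value in use should be readable from /sys/module/megaraid_sas/parameters/host_tagset_enable once the system is up.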

Thanks,
John