Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

From: hch@xxxxxx
Date: Thu Feb 09 2017 - 08:34:11 EST


Hi Dexuan,

I've spent some time with the logs and looking over the code and
couldn't find any smoking gun. I'm starting to wonder if it might just
be a timing issue?

Can you try one or two things for me:

1) run with the blk-mq I/O path for scsi by either enabling it at boot /
module load time with the scsi_mod.use_blk_mq=Y option, or at compile
time by enabling the CONFIG_SCSI_MQ_DEFAULT option. If that fails
with the commit applied, a blk-mq run from before the commit would
also be useful.
2) if possible, run a VM config without the virtual CD-ROM drive -
a lot of the scsi log chatter is about handling timeouts on the
CD drive, so removing it might help isolate the issue a bit better.
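
To confirm which path a given test boot actually used, something like the
following might help. It is only a rough sketch: the helper name
cmdline_requests_scsi_mq is made up for illustration, and the sysfs file is
assumed to be present, which is only the case on kernels where scsi_mod
exposes the use_blk_mq parameter.

```shell
#!/bin/sh
# Hypothetical helper (not part of any kernel tooling): print "yes" if the
# given kernel command line string asks for the SCSI blk-mq path, else "no".
# If the option appears more than once, the last occurrence wins.
cmdline_requests_scsi_mq() {
    result=no
    for arg in $1; do
        case "$arg" in
            scsi_mod.use_blk_mq=[Yy1]) result=yes ;;
            scsi_mod.use_blk_mq=*)     result=no ;;
        esac
    done
    echo "$result"
}

# What was requested on the command line of the running kernel, if readable.
if [ -r /proc/cmdline ]; then
    echo "requested on cmdline: $(cmdline_requests_scsi_mq "$(cat /proc/cmdline)")"
fi

# What scsi_mod is actually using (assumes the parameter is exposed in sysfs;
# this also reflects a CONFIG_SCSI_MQ_DEFAULT=y build default).
param=/sys/module/scsi_mod/parameters/use_blk_mq
if [ -r "$param" ]; then
    echo "scsi_mod.use_blk_mq = $(cat "$param")"
fi
```

Noting the sysfs value next to each boot log would make it clear whether a
given run was with or without blk-mq.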

Note that I'll be offline from this afternoon European time until Sunday
night as I'm out in the mountains at a lodge without internet access,
but this issue will be my priority once I'm back.