Re: [5.0-rc5 regression] "scsi: kill off the legacy IO path" causes 5 minute delay during boot on Sun Blade 2500

From: James Bottomley
Date: Sun Feb 10 2019 - 11:25:35 EST


On Sun, 2019-02-10 at 09:05 -0700, Jens Axboe wrote:
> On 2/10/19 8:44 AM, James Bottomley wrote:
> > On Sun, 2019-02-10 at 10:17 +0100, Mikael Pettersson wrote:
> > > On Sat, Feb 9, 2019 at 7:19 PM James Bottomley
> > > <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > [...]
> > > > I think the reason for this is that the block mq path doesn't
> > > > feed the kernel entropy pool correctly, hence the need to
> > > > install an entropy gatherer for systems that don't have other
> > > > good random number sources.
> > >
> > > That does sound plausible, I admit I didn't even consider the
> > > possibility that the old block I/O path also was an entropy
> > > source.
> >
> > In theory, the new one should be as well since the rotational
> > entropy collector is on the SCSI completion path. I'd seen the
> > same problem but had assumed it was something someone had done to
> > our internal entropy pool and thus hadn't bisected it.
>
> The difference is that the old stack included ADD_RANDOM by default,
> so this check:
>
> 	if (blk_queue_add_random(q))
> 		add_disk_randomness(req->rq_disk);
>
> in scsi_end_request() would be true, and we'd add the randomness. For
> sd, it seems to set it just fine for non-rotational drives. Could
> this be because other devices don't? Maybe the below makes a
> difference.

No, in both we set it per the rotational parameters of the disk in
sd.c:sd_read_block_characteristics():

	rot = get_unaligned_be16(&buffer[4]);

	if (rot == 1) {
		blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
		blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
	} else {
		blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
		blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
	}


That check wasn't changed by the code removal.
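
For anyone following along, the interaction between the flag setup
above and the completion-path check Jens quotes can be modelled in
userspace. This is only a sketch under stated assumptions: the model_*
names, the struct and the event counter are hypothetical stand-ins for
illustration, not kernel APIs, and nothing here touches a real request
queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's queue flag bits. */
enum {
	MODEL_FLAG_NONROT     = 1u << 0,
	MODEL_FLAG_ADD_RANDOM = 1u << 1,
};

struct model_queue {
	unsigned int flags;
	unsigned int entropy_events;	/* simulated add_disk_randomness() calls */
};

/*
 * Mirrors the branch quoted from sd_read_block_characteristics():
 * a reported rotation rate of 1 marks the device non-rotational and
 * turns entropy collection off; anything else turns it on.
 */
static void model_read_block_characteristics(struct model_queue *q,
					     uint16_t rot)
{
	assert(q != NULL);

	if (rot == 1) {
		q->flags |= MODEL_FLAG_NONROT;
		q->flags &= ~MODEL_FLAG_ADD_RANDOM;
	} else {
		q->flags &= ~MODEL_FLAG_NONROT;
		q->flags |= MODEL_FLAG_ADD_RANDOM;
	}
}

/*
 * Mirrors the quoted check in scsi_end_request(): a completion only
 * counts as an entropy event when ADD_RANDOM is set on the queue.
 * Returns true if the completion was counted.
 */
static bool model_end_request(struct model_queue *q)
{
	assert(q != NULL);

	if (q->flags & MODEL_FLAG_ADD_RANDOM) {
		q->entropy_events++;
		return true;
	}
	return false;
}
```

Calling model_read_block_characteristics() with a rate such as 7200
and then model_end_request() counts an event; calling it with a rate
of 1 first means completions are ignored, which is the behaviour both
the old and new paths share.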

Although I suspect it should be unconditional: even SSDs have what
would appear as seek latencies, at least during writes, depending on
the time taken to find an erased block or even to trigger garbage
collection. The entropy collector is good at taking something
completely regular and spotting the inconsistencies, so it won't
matter that loads of "seeks" are deterministic.
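
As a rough illustration of why regular timings cost nothing, here is a
userspace sketch of a delta-based estimator. It is loosely modelled on
the first-, second- and third-order timing deltas used by the kernel's
timer randomness code; the model_* names, the 64-bit types and the cap
of 11 bits per event are assumptions made for this example, not a copy
of drivers/char/random.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-source state: last timestamp and last two deltas. */
struct model_timer_state {
	int64_t last_time;
	int64_t last_delta;
	int64_t last_delta2;
};

static int64_t model_abs64(int64_t v)
{
	return v < 0 ? -v : v;
}

/* Number of bits needed to represent v (0 for v == 0). */
static unsigned int model_bit_length(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/*
 * Estimate how many bits of entropy to credit for an event at time
 * "now". The smallest of the three absolute deltas is used, so a
 * perfectly periodic source ends up with a minimum of zero and is
 * credited nothing, while jitter in the timings is credited.
 */
static unsigned int model_credit_bits(struct model_timer_state *st,
				      int64_t now)
{
	int64_t delta, delta2, delta3;
	unsigned int bits;

	assert(st != NULL);

	delta = now - st->last_time;
	st->last_time = now;

	delta2 = delta - st->last_delta;
	st->last_delta = delta;

	delta3 = delta2 - st->last_delta2;
	st->last_delta2 = delta2;

	delta = model_abs64(delta);
	delta2 = model_abs64(delta2);
	delta3 = model_abs64(delta3);
	if (delta > delta2)
		delta = delta2;
	if (delta > delta3)
		delta = delta3;

	/* Assumed cap: never credit more than 11 bits per event. */
	bits = model_bit_length((uint64_t)delta >> 1);
	return bits > 11 ? 11 : bits;
}
```

Feeding this evenly spaced timestamps (100, 200, 300, ...) yields zero
credit once the state has warmed up, whereas an event that arrives off
schedule is credited a few bits. Under this model, deterministic
"seeks" simply contribute nothing rather than polluting the estimate.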

James