Re: [PATCH] block: BFQ default for single queue devices

From: Paolo Valente
Date: Wed Oct 03 2018 - 02:29:20 EST




> On 2 Oct 2018, at 16:31, Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 10/2/18 6:43 AM, Linus Walleij wrote:
>> This sets BFQ as the default scheduler for single queue
>> block devices (nr_hw_queues == 1) if it is available. This
>> notably affects MMC/SD cards, but also UBI and
>> the loopback device.
>>
>> I have been running it for a while without any negative
>> effects on my pet systems and I want some wider testing
>> so let's throw it out there and see what people say.
>> Admittedly my use cases are limited.
>>
>> I talked to Pavel a bit back and it turns out he has a
>> usecase for BFQ as well and I bet he also would like it
>> as default scheduler for that system (Pavel tell us more,
>> I don't remember what it was!)
>>
>> Intuitively I could understand that maybe we want to
>> leave the loop device (possibly others? nbd? rbd?) as
>> "none", as it is probably relying on a scheduler on the
>> device below it, so I'm open to passing in a scheduler hint
>> from the respective subsystem in say struct blk_mq_tag_set.
>> However that makes for a bit of syntactic dissonance
>> with the struct member ".nr_hw_queues" (I wonder how
>> the loop device can have 1 "hardware queue"?) so
>> maybe we should in that case also rename that struct
>> member to ".nr_queues" fair and square before we start
>> making adjustments for treating queues differently whether
>> they are in hardware or actually not.
>
> I think this should just be done with udev rules, and I'd
> prefer if the distros would lead the way on this, as they
> are the ones that will most likely see the most bug reports
> on a change like this.
>

Hi Jens,
I see your point, but I doubt this is the way to go, because of the
following flaws.

As Linus Torvalds also complained [1], people feel lost among
I/O-scheduler options. The actual differences between I/O schedulers
are basically obscure to non-experts. In this respect, Linux-kernel
'users' are far more than a few top-level distros that can afford a
strong performance team and that, based on the input of such a team,
might light-heartedly venture to change a critical component like an
I/O scheduler. Plus, as Linus Walleij pointed out, some users are
simply not distros that use udev.

So, probably 99% of Linux-kernel users will just stick to the default
I/O scheduler, mq-deadline, assuming that the algorithm by which that
scheduler was chosen was not "pick the scheduler with the longest
name", but "pick the best scheduler for most cases". The problem is
that, for single-queue devices with a speed below 400/500 KIOPS, the
default scheduler is apparently incomparably worse than bfq in terms
of responsiveness and latency for time-sensitive applications [2], and
in terms of throughput reached while controlling I/O [3]. And, in all
other tests run so far, by any entity or group I'm aware of, bfq
performs basically on par with or better than mq-deadline.
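
To make the policy under discussion concrete, here is a minimal
user-space sketch in Python. It is only an illustration, not the patch
itself: the function names parse_scheduler_line() and pick_default()
are hypothetical, and the fallback behaviour (mq-deadline for
single-queue devices, "none" otherwise) is an assumption about the
current defaults. The only kernel interface it relies on is the format
of /sys/block/<dev>/queue/scheduler, where the active scheduler is
shown in square brackets.

```python
def parse_scheduler_line(line):
    """Parse the contents of /sys/block/<dev>/queue/scheduler.

    The kernel lists the available schedulers separated by spaces and
    marks the active one with square brackets, for example
    "mq-deadline kyber [bfq] none".
    Returns a tuple (available, active).
    """
    available = []
    active = None
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return available, active


def pick_default(nr_hw_queues, available, fallback="mq-deadline"):
    """Hypothetical default-selection policy (illustration only).

    Single-queue devices get bfq when it is available; otherwise they
    get the assumed fallback. Multi-queue devices are left at "none".
    """
    if nr_hw_queues == 1:
        if "bfq" in available:
            return "bfq"
        if fallback in available:
            return fallback
    return "none"


if __name__ == "__main__":
    # Example input in the sysfs format; not read from a real device.
    available, active = parse_scheduler_line("[mq-deadline] kyber bfq none")
    print("active:", active)
    print("proposed default:", pick_default(1, available))
```

A udev rule or an init script could apply the same decision from user
space, but, as said above, that only helps the users who have one.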

So, I do understand your need for conservatism, but, after so much
evidence on single-queue devices, and so many years! :), what's the
point in keeping Linux worse for virtually everybody, by default?

Thanks,
Paolo

[1] https://lkml.org/lkml/2017/2/21/791
[2] http://algo.ing.unimo.it/people/paolo/disk_sched/results.php
[3] https://lwn.net/Articles/763603/



> --
> Jens Axboe