Re: [PATCH] block: BFQ default for single queue devices
From: Damien Le Moal
Date: Wed Oct 03 2018 - 03:42:29 EST
On 2018/10/03 16:18, Linus Walleij wrote:
> On Wed, Oct 3, 2018 at 9:05 AM Artem Bityutskiy <dedekind1@xxxxxxxxx> wrote:
>> On Wed, 2018-10-03 at 08:29 +0200, Paolo Valente wrote:
>>> So, I do understand your need for conservativeness, but, after so much
>>> evidence on single-queue devices, and so many years! :), what's the
>>> point in keeping Linux worse for virtually everybody, by default?
>>
>> Sounds like we just need a mechanism for the device (ubi block in
>> this case) to select the I/O scheduler. I doubt enhancing the default
>> scheduler selection logic in 'elevator.c' is the right answer. Just
>> give the driver authority to override the defaults.
>
> This might be true in the wider sense (like for what scheduler to
> select for an NVME device with N channels) but $SUBJECT is just
> trying to select BFQ (if available) for devices with one and only one
> hardware queue.
>
> That is AFAICT the only reasonable choice for anything with just
> one hardware queue as things stand right now.
>
> I have a slight reservation for the weird outliers like loopdev, which
> has "one hardware queue" (.nr_hw_queues == 1) though this
> makes no sense at all. So I would like to know what people think
> about that. Maybe we should have .nr_queues and .nr_hw_queues
> where the former is the number of logical queues and the latter
> the actual number of hardware queues.

There is another class of outliers: host-managed SMR disks (SATA and SCSI,
definitely single hw queue). For these, using mq-deadline is mandatory in many
cases in order to guarantee sequential write command delivery to the device
driver. Changing the default to bfq, which as far as I know is not SMR
friendly (can sequential writes within a single zone be reordered?), is asking
for trouble (unaligned write errors showing up).

A while back, we already had this discussion with Jens and Christoph on the
list about allowing device drivers to set a sensible default I/O scheduler for
devices with "special needs" (e.g. host-managed SMR). At the time, the
conclusion was that udev (or something similar in userland) is better suited
to setting the correct scheduler.

Also of note: host-managed-like sequential zone devices are likely to show up
soon with the work being done in the NVMe standard on the new "Zoned
namespace" feature proposal. These devices will also require a scheduler like
mq-deadline that guarantees per-zone in-order delivery of sequential write
requests. Looking only at the number of queues of the device is not enough to
choose the best (most reasonable/appropriate) scheduler.
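
As a rough illustration of the kind of userland policy discussed above, here is
a minimal sketch in Python. It is not an existing tool: the function names
(pick_scheduler, parse_schedulers, apply_policy) and the policy itself are
hypothetical. It assumes the device exposes its zone model in
/sys/block/<dev>/queue/zoned, lists its schedulers in
/sys/block/<dev>/queue/scheduler, and has one subdirectory per hardware queue
under /sys/block/<dev>/mq.

```python
from pathlib import Path


def pick_scheduler(zoned_model, nr_hw_queues, available):
    """Hypothetical policy: return a scheduler name, or None to keep the default."""
    # Host-managed zoned devices need in-order sequential writes per zone,
    # so the zone model is checked before the queue count.
    if zoned_model == "host-managed" and "mq-deadline" in available:
        return "mq-deadline"
    # Only regular (non-zoned) single-queue devices are candidates for bfq here.
    if zoned_model == "none" and nr_hw_queues == 1 and "bfq" in available:
        return "bfq"
    return None


def parse_schedulers(text):
    """Parse e.g. 'mq-deadline kyber [bfq] none'; brackets mark the active one."""
    return [name.strip("[]") for name in text.split()]


def apply_policy(dev, sysfs=Path("/sys/block")):
    """Read the device attributes from sysfs and write the chosen scheduler, if any."""
    queue = sysfs / dev / "queue"
    zoned = (queue / "zoned").read_text().strip()
    available = parse_schedulers((queue / "scheduler").read_text())
    # Assumption: one directory per hardware queue under <dev>/mq.
    nr_hw_queues = len(list((sysfs / dev / "mq").iterdir()))
    choice = pick_scheduler(zoned, nr_hw_queues, available)
    if choice is not None:
        (queue / "scheduler").write_text(choice)
    return choice
```

The point of the sketch is only the ordering of the checks: the zone model
overrides whatever the number of hardware queues would suggest. Calling
apply_policy("sda") would need root privileges and a blk-mq device.
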
--
Damien Le Moal
Western Digital Research