Re: [PATCH RFC 6/6] btrfs: Add roundrobin raid1 read policy

From: Michał Mirosław
Date: Wed Feb 10 2021 - 21:28:57 EST


On Wed, Feb 10, 2021 at 07:23:04PM +0000, Michal Rostecki wrote:
> On Wed, Feb 10, 2021 at 01:58:15PM +0100, Michał Mirosław wrote:
> > On Wed, Feb 10, 2021 at 12:29:25PM +0000, Michal Rostecki wrote:
> > > On Wed, Feb 10, 2021 at 05:24:28AM +0100, Michał Mirosław wrote:
> > > > This looks like it effectively decreases queue depth for non-last
> > > > device. After all devices are filled to queue_depth-penalty, only
> > > > a single mirror will be selected for next reads (until a read on
> > > > some other one completes).
> > > >
> > >
> > > Good point. And if all devices are going to be filled for longer time,
> > > this function will keep selecting the last one. Maybe I should select
> > > last+1 in that case. Would that address your concern or did you have any
> > > other solution in mind?
> >
> > The best would be to postpone the selection until one device becomes free
> > again. But if that's not doable, then yes, you could make sure it stays
> > round-robin after filling the queues (the scheduling will loose the
> > "penalty"-driven adjustment though).
>
> Or another idea - when all the queues are filled, return the mirror
> which has the lowest load (inflight + penalty), even though it's greater
> than queue depth. In that case the scheduling will not lose the penalty
> adjustment and the load is going to be spreaded more fair.
>
> I'm not sure if postponing the selection is that good idea. I think it's
> better if the request is added to the iosched queue anyway, even if the
> disks' queues are filled, and let the I/O scheduler handle that. The
> length of the iosched queue (nr_requests, attribute of the iosched) is
> usually greater than queue depth (attribute of the devide), which means
> that it's fine to schedule more requests for iosched to handle.
>
> IMO btrfs should use the information given by iosched only for heuristic
> mirror selection, rather than implement its own throttling logic.
>
> Does it make sense to you?
>
> An another idea could be an additional iteration in regard to
> nr_requests, if all load values are greater than queue depths, though it
> might be an overkill. I would prefer to stick to my first idea if
> everyone agrees.

What if iosched could provide an estimate of request's latency? Then
btrfs could always select the lowest. For reads from NVME/SSD I would
normally expect something simple: speed_factor * (pending_bytes + req_bytes).
For HDDs this could do more computation like looking into what is there
in the queue already.

This would deviate from simple round-robin scheme, though.

Best Regards
Michał Mirosław