Re: [PATCH RFC 6/6] btrfs: Add roundrobin raid1 read policy

From: Michal Rostecki
Date: Wed Feb 10 2021 - 07:32:20 EST


On Wed, Feb 10, 2021 at 05:24:28AM +0100, Michał Mirosław wrote:
> On Tue, Feb 09, 2021 at 09:30:40PM +0100, Michal Rostecki wrote:
> [...]
> > For the array with 3 HDDs, not adding any penalty resulted in 409MiB/s
> > (429MB/s) performance. Adding the penalty value 1 resulted in a
> > performance drop to 404MiB/s (424MB/s). Increasing the value towards 10
> > was making the performance even worse.
> >
> > For the array with 2 HDDs and 1 SSD, adding penalty value 1 to
> > rotational disks resulted in the best performance - 541MiB/s (567MB/s).
> > Not adding any value and increasing the value was making the performance
> > worse.
> >
> > Adding penalty value to non-rotational disks was always decreasing the
> > performance, which motivated setting it as 0 by default. For the purpose
> > of testing, it's still configurable.
> [...]
> > +	bdev = map->stripes[mirror_index].dev->bdev;
> > +	inflight = mirror_load(fs_info, map, mirror_index, stripe_offset,
> > +			       stripe_nr);
> > +	queue_depth = blk_queue_depth(bdev->bd_disk->queue);
> > +
> > +	return inflight < queue_depth;
> [...]
> > +	last_mirror = this_cpu_read(*fs_info->last_mirror);
> [...]
> > +	for (i = last_mirror; i < first + num_stripes; i++) {
> > +		if (mirror_queue_not_filled(fs_info, map, i, stripe_offset,
> > +					    stripe_nr)) {
> > +			preferred_mirror = i;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	for (i = first; i < last_mirror; i++) {
> > +		if (mirror_queue_not_filled(fs_info, map, i, stripe_offset,
> > +					    stripe_nr)) {
> > +			preferred_mirror = i;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	preferred_mirror = last_mirror;
> > +
> > +out:
> > +	this_cpu_write(*fs_info->last_mirror, preferred_mirror);
>
> This looks like it effectively decreases queue depth for non-last
> device. After all devices are filled to queue_depth-penalty, only
> a single mirror will be selected for next reads (until a read on
> some other one completes).
>

Good point. And if all devices stay filled for a longer time, this
function will keep selecting the last one. Maybe I should select last+1
in that case. Would that address your concern, or did you have another
solution in mind?
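
To make the last+1 idea concrete, here is a rough userspace sketch, not
the patch code. struct mirror_state and pick_mirror() are hypothetical
names, and the inflight/queue_depth fields stand in for what the patch
obtains from mirror_load() and blk_queue_depth(). It assumes last_mirror
is always within [first, first + num_stripes).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-device state used by the patch. */
struct mirror_state {
	unsigned int inflight;		/* requests currently in flight */
	unsigned int queue_depth;	/* queue depth of the device */
};

/* Simplified version of the check quoted above. */
static bool mirror_queue_not_filled(const struct mirror_state *m)
{
	return m->inflight < m->queue_depth;
}

/*
 * Mirrors are indexed first .. first + num_stripes - 1.
 * Scan from last_mirror, wrapping around, and return the first mirror
 * whose queue is not filled. If every queue is filled, advance to
 * last_mirror + 1 (with wrap-around) instead of staying on last_mirror,
 * so that consecutive reads are still spread across the devices.
 */
static int pick_mirror(const struct mirror_state *mirrors, int first,
		       int num_stripes, int last_mirror)
{
	int i, idx;

	for (i = 0; i < num_stripes; i++) {
		idx = first + (last_mirror - first + i) % num_stripes;
		if (mirror_queue_not_filled(&mirrors[idx]))
			return idx;
	}

	/* All queues are filled: rotate rather than reuse last_mirror. */
	return first + (last_mirror - first + 1) % num_stripes;
}
```

The single wrapping loop visits the mirrors in the same order as the two
loops in the patch; only the fallback at the end differs.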

Thanks for pointing that out.

> Have you tried testing with much more jobs / non-sequential accesses?
>

I didn't try with non-sequential accesses. Will do that before
respinning v2.

> Best Regards,
> Michał Mirosław

Regards,
Michal