Re: [patch 0/3 v2] raid5: make stripe handling multi-threading

From: NeilBrown
Date: Tue Aug 27 2013 - 03:34:46 EST


On Mon, 12 Aug 2013 10:18:03 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:

> Neil,
>
> This is another attempt to make raid5 stripe handling multi-threaded.
> The recent workqueue improvements for unbound workqueues look very promising
> for the raid5 use case; details are in the first patch.
>
> The patches are against your tree with 'raid5: make release_stripe
> lockless' and 'raid5: fix stripe release order' applied, but without 'raid5:
> create multiple threads to handle stripes'.
>
> My test setup has 7 PCIe SSDs, chunk size 8k, stripe_cache_size 2048. When
> multi-threading is enabled, group_thread_cnt is set to 8.
>
> randwrite    throughput (ratio,       request size (sectors,
>              unpatched/patched)       unpatched/patched)
> 4k           1/5.9                    8/8
> 8k           1/5.5                    16/13
> 16k          1/4.8                    16/13
> 32k          1/4.6                    18/14
> 64k          1/4.7                    17/13
> 128k         1/4.2                    23/16
> 256k         1/3.5                    41/21
> 512k         1/3                      75/28
> 1M           1/2.6                    134/34
>
> For example, in the 1M randwrite test the patched kernel is 2.6x faster, but
> the average request size sent to each disk drops from 134 sectors to 34.
>
> Currently the biggest problem is that the request size drops, because
> multiple threads are dispatching requests. This suggests multi-threading
> might not be appropriate for every setup, so I disabled it by default in this
> version. But since throughput is largely improved, I don't think this is a
> blocking issue. I'm still working on improving it, perhaps by scheduling the
> stripes from one block plug as a whole.
>
> Thanks,
> Shaohua


Thanks. I like this a lot more than the previous version.

It doesn't seem to apply exactly to my current 'for-next' - probably because
I have moved things around and have a different set of patches applied :-(
If you could rebase it on my current for-next, I'll apply it and probably
submit it for the next merge window.
A couple of little changes I'd like made:

1/ alloc_thread_groups needs to use GFP_NOIO, at least when called from
raid5_store_group_thread_cnt. At that point IO to the RAID5 is stalled, so
if the allocation needs to free memory it might wait for writeout to the
RAID5 and so deadlock. GFP_NOIO prevents that.
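Something like the sketch below, i.e. just using GFP_NOIO at the allocation
site. The struct layout, the kcalloc call and the worker_groups field here
are illustrative assumptions, not the patch's actual code:

	static int alloc_thread_groups(struct r5conf *conf, int cnt)
	{
		struct r5worker_group *groups;

		/* Illustrative: GFP_NOIO lets reclaim run but forbids the
		 * allocator from issuing IO, so it cannot block waiting for
		 * writeout to this same (currently stalled) RAID5 array. */
		groups = kcalloc(cnt, sizeof(*groups), GFP_NOIO);
		if (!groups)
			return -ENOMEM;

		/* ... initialise each group's handle_list and workers ... */
		conf->worker_groups = groups;
		return 0;
	}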

2/ could we move the

+	if (!cpu_online(cpu)) {
+		cpu = cpumask_any(cpu_online_mask);
+		sh->cpu = cpu;
+	}

inside raid5_wakeup_stripe_thread() ?

It isn't a perfect fit, but I think it is a better place for it.
It could check list_empty(&sh->lru) and, if the list is empty, add the
stripe to the appropriate group->handle_list.
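Something along these lines - a sketch only; conf->worker_groups and the
cpu_to_group() helper are assumptions about the patch's structures, not code
taken from it:

	static void raid5_wakeup_stripe_thread(struct stripe_head *sh)
	{
		struct r5conf *conf = sh->raid_conf;
		struct r5worker_group *group;
		int cpu = sh->cpu;

		/* Moved here from do_release_stripe(): if the stripe's CPU
		 * has gone offline, fall back to any online CPU. */
		if (!cpu_online(cpu)) {
			cpu = cpumask_any(cpu_online_mask);
			sh->cpu = cpu;
		}

		/* If the stripe is not on any list yet, queue it on its
		 * group's handle_list before kicking the workers. */
		group = &conf->worker_groups[cpu_to_group(cpu)];
		if (list_empty(&sh->lru))
			list_add_tail(&sh->lru, &group->handle_list);

		/* ... queue the group's work items on the unbound wq ... */
	}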

The code in do_release_stripe would become
 	else {
 		clear_bit(STRIPE_DELAYED, &sh->state);
 		clear_bit(STRIPE_BIT_DELAY, &sh->state);
+		if (conf->worker_cnt_per_group == 0) {
+			list_add_tail(&sh->lru, &conf->handle_list);
+		} else {
+			raid5_wakeup_stripe_thread(sh);
+			return;
+		}
 	}
 	md_wakeup_thread(conf->mddev->thread);

??

NeilBrown
