Re: Bug#250477: kernel-source-2.4.26: Lots of debug in RAID5

From: Neil Brown
Date: Tue May 25 2004 - 01:32:16 EST


On Monday May 24, nathans@xxxxxxx wrote:
> On Sun, May 23, 2004 at 07:16:22AM -0400, hch@xxxxxxxxxxxxx wrote:
> > On Sun, May 23, 2004 at 08:53:51PM +1000, Herbert Xu wrote:
> > > > --- kernel-source-2.4.26/drivers/md/raid5.c 2003-08-30 06:01:38.000000000 +0000
> > > > +++ kernel-source-2.4.26-nodebug/drivers/md/raid5.c 2004-05-23 08:54:36.000000000 +0000
> > > > @@ -282,7 +282,7 @@
> > > > }
> > > >
> > > > if (conf->buffer_size != size) {
> > > > - printk("raid5: switching cache buffer size, %d --> %d\n", oldsize, size);
> > > > + PRINTK("raid5: switching cache buffer size, %d --> %d\n", oldsize, size);
> > > > shrink_stripe_cache(conf);
> > > > if (size==0) BUG();
> > > > conf->buffer_size = size;
> > >
> > > Thanks for the patch. This does indeed look like a typo.
> > >
> > > Hi Neil, does this patch look OK to you?
> >
> > No, this was rejected a few times already. The problem is that XFS
> > uses differen I/O sizes for the log and other I/O which makes raid
> > performance suck really badly. The real fix is to use the v2 XFS log
> > format when using software raid5.
>
> What is really wanted is the -ssize=4096 option to mkfs.xfs.
> This does the 4k aligned log IO Christoph is talking about
> here, and also sizes a few other XFS ondisk structures such
> that there is no I/O to the device that is not 4K aligned.
>
> Neil, I wonder if we could make the message more informative?
> Maybe some words about a suboptimal filesystem configuration,
> or something to that effect.

That would possibly be reasonable.
I would only want the extra message if the 'switching cache buffer
size' messages happened at a high rate, and occasional individual
messages are not an indication of a problem at all.

Maybe something like the following, but the recent/cnt/warned should
really be per-array.

NeilBrown
===========================================================
convert high-frequency raid5 warning to more specific warning.

Atleast 'warned', and possible cnt,recent should be per-array.

Signed-off-by: Neil Brown <neilb@xxxxxxxxxxxxxxx>

----------- Diffstat output ------------
./drivers/md/raid5.c | 16 +++++++++++++++-
1 files changed, 15 insertions(+), 1 deletion(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~ 2004-05-25 16:22:56.000000000 +1000
+++ ./drivers/md/raid5.c 2004-05-25 16:27:54.000000000 +1000
@@ -282,7 +282,21 @@ static struct stripe_head *get_active_st
}

if (conf->buffer_size != size) {
- printk("raid5: switching cache buffer size, %d --> %d\n", oldsize, size);
+ static long recent;
+ static int cnt;
+ static int warned;
+ if (time_after(recent+HZ, jiffies))
+ cnt++;
+ else {
+ recent = jiffies;
+ cnt = 0;
+ }
+ if (cnt > 50 && ! warned) {
+ printk("raid5: WARNING:array used in unsupported configuration, expect poor performance\n");
+ warned = 1;
+ }
+ if (!warned)
+ printk("raid5: switching cache buffer size, %d --> %d\n", oldsize, size);
shrink_stripe_cache(conf);
if (size==0) BUG();
conf->buffer_size = size;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/