Re: regression: 100% io-wait with 2.6.24-rcX

From: Martin Knoblauch
Date: Fri Jan 18 2008 - 03:19:22 EST


----- Original Message ----
> From: Mel Gorman <mel@xxxxxxxxx>
> To: Martin Knoblauch <spamtrap@xxxxxxxxxxxx>
> Cc: Fengguang Wu <wfg@xxxxxxxxxxxxxxxx>; Mike Snitzer <snitzer@xxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; jplatte@xxxxxxxxx; Ingo Molnar <mingo@xxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; "linux-ext4@xxxxxxxxxxxxxxx" <linux-ext4@xxxxxxxxxxxxxxx>; Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>; James.Bottomley@xxxxxxxxxxxx
> Sent: Thursday, January 17, 2008 11:12:21 PM
> Subject: Re: regression: 100% io-wait with 2.6.24-rcX
>
> On (17/01/08 13:50), Martin Knoblauch didst pronounce:
> > >
> >
> > The effect is defintely depending on the IO hardware.
> >
performed the same tests
> > on a different box with an AACRAID controller and there things
> > look different.
>
> I take it different also means it does not show this odd performance
> behaviour and is similar whether the patch is applied or not?
>

Here are the numbers (MB/s) from the AACRAID box, after a fresh boot:

Test 2.6.19.2 2.6.24-rc6 2.6.24-rc6-81eabcbe0b991ddef5216f30ae91c4b226d54b6d
dd1 325 350 290
dd1-dir 180 160 160
dd2 2x90 2x113 2x110
dd2-dir 2x120 2x92 2x93
dd3 3x54 3x70 3x70
dd3-dir 3x83 3x64 3x64
mix3 55,2x30 400,2x25 310,2x25

What we are seing here is that:

a) DIRECT IO takes a much bigger hit (2.6.19 vs. 2.6.24) on this IO system compared to the CCISS box
b) Reverting your patch hurts single stream
c) dual/triple stream are not affected by your patch and are improved over 2.6.19
d) the mix3 performance is improved compared to 2.6.19.
d1) reverting your patch hurts the local-disk part of mix3
e) the AACRAID setup is definitely faster than the CCISS.

So, on this box your patch is definitely needed to get the pre-2.6.24 performance
when writing a single big file.

Actually things on the CCISS box might be even more complicated. I forgot the fact
that on that box we have ext2/LVM/DM/Hardware, while on the AACRAID box we have
ext2/Hardware. Do you think that the LVM/MD are sensitive to the page order/coloring?

Anyway: does your patch only address this performance issue, or are there also
data integrity concerns without it? I may consider reverting the patch for my
production environment. It really helps two thirds of my boxes big time, while it does
not hurt the other third that much :-)

> >
> > I can certainly stress the box before doing the tests. Please
> > define "many" for the kernel compiles :-)
> >
>
> With 8GiB of RAM, try making 24 copies of the kernel and compiling them
> all simultaneously. Running that for for 20-30 minutes should be enough
>
to randomise the freelists affecting what color of page is used for the
> dd test.
>

ouch :-) OK, I will try that.

Martin



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/