(cc lkml restored, with permission)
On Wed, 14 Nov 2007 10:48:10 -0500 "Alan D. Brunelle" <Alan.Brunelle@xxxxxx> wrote:
Andrew Morton wrote:
On Mon, 15 Oct 2007 16:13:15 -0400 Rik van Riel <riel@xxxxxxxxxx> wrote:
Since you have been involved a lot with ext3 development,
which kinds of workloads do you think will show a performance
degradation with Arjan's patch? What should I test?
Gee. Expect the unexpected ;)
One problem might be when kjournald is doing its ordered-mode data
writeback at the start of commit. That writeout will now be
higher-priority and might impact other tasks which are doing synchronous
file overwrites (ie: no commits) or O_DIRECT reads or writes or just plain
old reads.
If the aggregate number of seeks over the long term is the same as before
then of course the overall throughput should be the same, in which case the
impact might only be upon latency. However if for some reason the total
number of seeks is increased then there will be throughput impacts as well.
So as a starting point I guess one could set up a
copy-a-kernel-tree-in-a-loop running in the background and then see what
impact that has upon a large-linear-read, upon a
read-a-different-kernel-tree and upon some database-style multithreaded
O_DIRECT reader/overwriter.
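For concreteness, a database-style multithreaded O_DIRECT reader/overwriter might look something like the sketch below. This is purely illustrative, not anyone's actual test script: the path, file size, I/O count and thread split are made-up placeholders, and the 8GiB file is assumed to already exist.

/*
 * Hypothetical sketch only.  A few threads do random, 4KiB-aligned O_DIRECT
 * reads and in-place overwrites within one big, already-existing file.
 * Path, file size and thread count are assumptions.
 */
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLK   4096                  /* matches a 4KiB filesystem block */
#define FSIZE (1ULL << 33)          /* 8GiB test file */
#define OPS   10000                 /* I/Os per thread */

static const char *path = "/mnt/test/odirect.dat";     /* assumed path */

static void *worker(void *arg)
{
	int writer = *(int *)arg;
	int fd = open(path, O_RDWR | O_DIRECT);
	void *buf;

	if (fd < 0) {
		perror("open");
		return NULL;
	}
	if (posix_memalign(&buf, BLK, BLK)) {
		close(fd);
		return NULL;
	}
	memset(buf, 0xab, BLK);

	for (long i = 0; i < OPS; i++) {
		/* pick a block-aligned offset somewhere inside the file */
		off_t off = (off_t)(rand() % (FSIZE / BLK)) * BLK;

		if (writer)
			pwrite(fd, buf, BLK, off);      /* overwrite in place */
		else
			pread(fd, buf, BLK, off);
	}
	free(buf);
	close(fd);
	return NULL;
}

int main(void)
{
	pthread_t t[4];
	int role[4] = { 0, 0, 1, 1 };   /* two readers, two overwriters */

	for (int i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, worker, &role[i]);
	for (int i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return 0;
}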
Hi folks -
I noticed this thread recently (sorry, a month late and a dollar short), but it was very interesting to me, and as I had some cycles yesterday I did a quick run of what Andrew suggested here (using 2.6.24-rc2 w/out and then w/ Arjan's patch). After doing the runs last night I'm not overly happy with the test setup, but before redoing it I thought I'd ask to make sure that this patch is still being considered.
I'd consider its status to be "might be a good idea, more performance
testing needed".
And, btw, the results from the first pass were rather mixed -
ooh, more performance testing. Thanks
* The overwriter task (on an 8GiB file), average over 10 runs:
    o 2.6.24                 - 300.88226 seconds
    o 2.6.24 + Arjan's patch - 403.85505 seconds
* The read-a-different-kernel-tree task, average over 10 runs:
    o 2.6.24                 - 46.8145945549 seconds
    o 2.6.24 + Arjan's patch - 39.6430601119 seconds
* The large-linear-read task (on an 8GiB file), average over 10 runs:
    o 2.6.24                 - 290.32522 seconds
    o 2.6.24 + Arjan's patch - 386.34860 seconds
These are *large* differences, making this a very significant patch. Much
care is needed now.
Could you expand a bit on what you're testing here? I think that in one
process you're doing a continuous copy-a-kernel-tree and in another
process you're doing the above three things, yes?
I guess the other things we should look at are the impact on the
continuously-copy-a-kernel-tree process and also the overall IO throughput.
These things will of course be related. If the overall system-wide IO
throughput increases with the patch then we probably have a no-brainer. If
(as I suspect) the overall IO throughput is decreased then this will be a
difficult call.
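One low-tech way to capture overall IO throughput for a run would be to snapshot the per-disk sector counters from /proc/diskstats before and after the run and diff them. A rough, hypothetical helper is below; the device name "sda" is an assumption and would need to match the disk under test.

/*
 * Hypothetical helper for the overall-throughput question: snapshot the
 * sectors-read/sectors-written counters for one disk from /proc/diskstats,
 * once before and once after a run, then diff the two snapshots by hand.
 */
#include <stdio.h>
#include <string.h>

static int snapshot(const char *dev, unsigned long long *rd, unsigned long long *wr)
{
	FILE *f = fopen("/proc/diskstats", "r");
	char line[256], name[64];
	unsigned int major, minor;
	unsigned long long v[7];

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "%u %u %63s %llu %llu %llu %llu %llu %llu %llu",
			   &major, &minor, name, &v[0], &v[1], &v[2], &v[3],
			   &v[4], &v[5], &v[6]) == 10 && strcmp(name, dev) == 0) {
			*rd = v[2];             /* sectors read */
			*wr = v[6];             /* sectors written */
			fclose(f);
			return 0;
		}
	}
	fclose(f);
	return -1;
}

int main(void)
{
	unsigned long long rd, wr;

	if (snapshot("sda", &rd, &wr) == 0)     /* "sda" is an assumption */
		printf("sectors read %llu, written %llu\n", rd, wr);
	return 0;
}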
This was performed on a 4-way IA64 w/ 4GiB RAM w/ an FC disk containing an Ext3 (4KiB block) FS. The reason I woke up this morning and wasn't too happy with the benchmarking harness is that a kernel source tree is less than 1GiB in size, and thus the background task of reading it might all fit in the page cache. To do this right, I'd use another tree that has >4GiB of data stored in it. (And likewise, I'd change the read-a-different-kernel-tree task to use a larger tree as well.) I'll get results out when I have the changes made to the script (outlined above), and the runs done.
hm, yes. Back in the days when I used to do useful things I'd do most
testing of this sort on 256MB, 128MB or even 64MB machines. So that data
would get tossed out of cache quickly so that I could use smaller working
sets so that the tests could be executed faster.
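Another way to get much the same effect on a big-memory box is to toss a tree's cached pages explicitly between runs. A minimal, purely illustrative sketch using posix_fadvise(POSIX_FADV_DONTNEED) on each file:

/*
 * Illustrative only: evict one file's cached pages so a later read has to go
 * back to disk.  Run it over every file in the background tree between
 * iterations (or just use a tree larger than RAM, as suggested above).
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	for (int i = 1; i < argc; i++) {
		int fd = open(argv[i], O_RDONLY);

		if (fd < 0) {
			perror(argv[i]);
			continue;
		}
		fdatasync(fd);          /* flush any dirty pages first */
		posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
		close(fd);
	}
	return 0;
}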
But before I do that, I want to make sure that the patch is still being considered.
Sure, it hasn't been ruled out. Especially as those time deltas you're
measuring are so large. We haven't seen changes in IO throughput like that
in years. We just need to work out if they're net-positive or net-negative ;)
This will end up being a pretty large hunk of work I expect.