Re: Understanding I/O behaviour

From: Jesper Juhl
Date: Thu Jul 05 2007 - 16:22:34 EST


On 05/07/07, Martin Knoblauch <spamtrap@xxxxxxxxxxxx> wrote:
Hi,

for a customer we are operating a rackful of HP/DL380/G4 boxes that
have given us some problems with system responsiveness under [I/O
triggered] system load.

The systems in question have the following HW:

2x Intel/EM64T CPUs
8GB memory
CCISS Raid controller with 4x72GB SCSI disks as RAID5
2x BCM5704 NIC (using tg3)

The distribution is RHEL4. We have tested several kernels including
the original 2.6.9, 2.6.19.2, 2.6.22-rc7 and 2.6.22-rc7+cfs-v18.

One part of the workload is when several processes try to write 5 GB
each to the local filesystem (ext2->LVM->CCISS). When this happens, the
load goes up to 12 and responsiveness goes down. This means from one
moment to the next things like opening a ssh connection to the host in
question, or doing "df" take forever (minutes). Especially bad with the
vendor kernel, better (but not perfect) with 2.6.19 and 2.6.22-rc7.

The load basically comes from the writing processes and up to 12
"pdflush" threads all being in "D" state.

So, what I would like to understand is how we can maximize the
responsiveness of the system, while keeping disk throughput at maximum.


I'd suspect you can't get both at 100%.

I'd guess you are probably using a 100Hz no-preempt kernel. Have you
tried a 1000Hz + preempt kernel? Sure, you'll get a bit lower
overall throughput, but interactive responsiveness should be better -
if it is, then you could experiment with various combinations of
CONFIG_PREEMPT, CONFIG_PREEMPT_VOLUNTARY, CONFIG_PREEMPT_NONE and
CONFIG_HZ_1000, CONFIG_HZ_300, CONFIG_HZ_250, CONFIG_HZ_100 to see
what gives you the best balance between throughput and interactive
responsiveness (you could also throw CONFIG_PREEMPT_BKL and/or
CONFIG_NO_HZ, but I don't think the impact will be as significant as
with the other options, so to keep things simple I'd leave those out
at first) .

I'd guess that something like CONFIG_PREEMPT_VOLUNTARY + CONFIG_HZ_300
would probably be a good compromise for you, but just to see if
there's any effect at all, start out with CONFIG_PREEMPT +
CONFIG_HZ_1000.

Hope that helps.

(PS. please don't do crap like using that spamtrap@ address and have
people manually replace it with the one from your .signature when
posting on LKML - it's annoying as hell)

--
Jesper Juhl <jesper.juhl@xxxxxxxxx>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/