RE: Strange load spikes on 2.4.19 kernel

From: Joseph D. Wagner (wagnerjd@prodigy.net)
Date: Sun Oct 13 2002 - 02:01:44 EST


I'll let you in on a dirty little secret. The Linux file system does
not utilize SMP. That's right. All file processes go through one and
only one processor. It has to do with the fact that the Linux kernel is
a non-preemptive kernel.

Linus, in his infinite wisdom, made a "strategiery" decision that it
would be better for one process to be able to grind your machine to a
halt than to redo and rework sections of the kernel that don't allow for
preemption.

Try switching kernels to the Linux Kernel Preemption Project:
http://sourceforge.net/projects/kpreempt

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Rob Mueller
Sent: Sunday, October 13, 2002 1:34 AM
To: Mark Hahn
Cc: linux-kernel@vger.kernel.org; Jeremy Howard
Subject: Re: Strange load spikes on 2.4.19 kernel

We've just discovered that this is actually happening on another one of
our
machines as well. This machine uses 2.4.18 kernel, ext3 and has 2 SCSI
drives and 2 IDE drives which hold the main user mailbox data. Also it's
a
P3 with a completely different motherboard rather than an Athlon, so it
doesn't seem to be hardware related in that respect.

> well, it's conceivable that if something is blocking a bunch
> of procs, it would also block your shell and ps, so not show up.
> "vmstat 1" might work better, though it's munging plenty of
> /proc files so is hardly immune to that.

But the ps/shell would only use CPU time, and there's plenty of that
available. Unless it was something everything would block on... what
could
that be? A spin lock or something? Some interrupt routine? Would that
even
result in other processes being counted as blocked process?

Also Let me do a calculation, though I have no idea if this is right or
not...
a) the first item in the uptime output is 'system load average for the
last
1 minute'
b) it seems to only update/recalculate every 5 seconds
c) it jumps from < 1 to 20 in 1 interval (eg 5 seconds)

This means that for it to jump from < 1 to 20 in 5 seconds, there must
be on
average about 60/5 * 20 = 240 processes blocked over those 5 seconds
waiting
for run time of some sort for the load to jump 20 points. Is that right?

> but it's worth asking: do you notice a hiccup other than by looking
> at the loadav? that is, suppose the loadav is simply miscalculated...

Well yes, there definitely does seem to be a performance hit on the
whole
system when the load jumps, everything feels significantly more
'sluggish'
during the spikes is the best I can describe it right now...

Rob

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Oct 15 2002 - 22:00:45 EST