Re: running of out memory => kernel crash

From: Chris Friesen
Date: Fri Aug 19 2011 - 17:19:49 EST


On 08/19/2011 01:29 PM, Bryan Donlan wrote:
> On Thu, Aug 18, 2011 at 10:26, Pavel Ivanov <paivanof@xxxxxxxxx> wrote:
>
> > Could you elaborate on this? We have a completely unusable server
> > which can be revived only by hard power cycling (administrators won't
> > be able to log in because sshd and shell will fall victims of the same
> > unending disk reading). And as an alternative we can kill some process
> > and at least allow administrator to log in and check if something else
> > can be done to make server feel better. Why is it worse?
> >
> > I understand that it could be very hard to detect such situation but
> > at least it's worth trying I think.
>
> Deciding when to call the server unusable is a policy decision that
> the kernel can't make very easily on its own; the point when the
> system is considered unusable may be different depending on workload.
> You could create a userspace daemon, however, that mlockall()s, then
> monitors memory usage, load average, etc and kills processes when
> things start to go south. You could also use the memory resource
> cgroup controller to set hard limits on memory usage.

Indeed. From the point of view of the OS, it's running everything on the system without a problem. It's deep into swap, but it's running.

If there are application requirements on grade-of-service, it's up to the application to check whether those are being met and if not to do something about it.
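
For what it's worth, the kind of daemon Bryan describes might look roughly like the sketch below. It is only an illustration under stated assumptions: the function names (meminfo_kb, in_trouble, watchdog), the swap and load thresholds, and the idea of being handed a single victim PID are all made up here and are not from this thread. A real policy would depend on the workload.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Find a "Key:   12345 kB" line in /proc/meminfo-style text.
 * Returns the value in kB, or -1 if the key is missing. */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *p = text;

    while (p && *p) {
        if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
            long v;
            return sscanf(p + klen + 1, "%ld", &v) == 1 ? v : -1;
        }
        p = strchr(p, '\n');    /* advance to the next line, if any */
        if (p)
            p++;
    }
    return -1;
}

/* Hypothetical policy: the box is "in trouble" when less than
 * min_free_pct percent of swap is free AND the 1-minute load average
 * exceeds max_load. Both thresholds are assumptions. */
int in_trouble(long swap_total_kb, long swap_free_kb, double load1,
               int min_free_pct, double max_load)
{
    if (swap_total_kb <= 0)
        return 0;               /* no swap configured: this policy says nothing */
    return (long long)swap_free_kb * 100 <
           (long long)swap_total_kb * min_free_pct && load1 > max_load;
}

/* Poll until the policy trips, then SIGKILL the (hypothetical) victim.
 * Returns the result of kill(), or -1 on setup/read failure. */
int watchdog(pid_t victim, int min_free_pct, double max_load,
             unsigned interval_s)
{
    /* Pin our own pages so the watchdog keeps running while the rest
     * of the system is swapping. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }

    for (;;) {
        char buf[8192];
        double load1;
        size_t n;
        FILE *f = fopen("/proc/meminfo", "r");

        if (!f) {
            perror("/proc/meminfo");
            return -1;
        }
        n = fread(buf, 1, sizeof buf - 1, f);
        fclose(f);
        buf[n] = '\0';

        if (getloadavg(&load1, 1) == 1 &&
            in_trouble(meminfo_kb(buf, "SwapTotal"),
                       meminfo_kb(buf, "SwapFree"),
                       load1, min_free_pct, max_load))
            return kill(victim, SIGKILL);

        sleep(interval_s);
    }
}
```

A main() would call something like watchdog(pid, 10, 4.0, 5) with whatever PID and thresholds the administrator considers acceptable. mlockall() typically needs root (or CAP_IPC_LOCK, or a large enough RLIMIT_MEMLOCK) to succeed.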

Chris

--
Chris Friesen
Software Developer
GENBAND
chris.friesen@xxxxxxxxxxx
www.genband.com