X server and OOM kill

David Luyer (luyer@ucs.uwa.edu.au)
Thu, 13 Aug 1998 00:23:54 +0800


Another problem with X which cannot be fixed is restoring the video
state after the server has been killed for running out of memory.

There's a bug which bites me every now and then. It has something
to do with doing a 'find text' in netscape under X, with afterstep
and both of the "magic options" set, using XF86_SVGA, on a very
large document (usually a Cisco CD document). It causes the X
server (!!) process to bloat out and die. If I telnet in from a
remote machine and kill netscape while the machine is going mad,
the X server does not reclaim the memory either. This hasn't
happened to me for about a month now, but that's quite possibly
because I now avoid searching the documents in that way (I use grep,
then go to the location instead). Under afterstep I could fix the
problem by removing what they called the "magic voodoo options", but
windowmaker offers no fix.

Anyway, I agree this is a buggy application problem (where the
application is XF86_SVGA, being triggered by something in
netscape), but my point is: if X is killed due to an out-of-memory
situation, it gets zero, count them, zero chances to restore the
video state. While it's not a normal situation, it can happen.

The solution?
1) reliable X servers
2) video cards which you don't need to know the state for!

If you are complaining about X dying for no apparent reason on
a card which it is impossible for you to restore without knowing
state, what you are saying is...
"Linux should guard me against application bugs which my hardware
bugs mean I can't get a userspace app to guard me against".

Even then, that isn't quite true. You could store a "current
state" in a SYSVSHM segment and, whenever making a change which
could put the system in a non-recoverable state, record the new
state there.

I.e., a daemon sits watching the X server, or watching the attach
count (nattch) on a shm segment. The X server always keeps the shm
segment up to date with restoration instructions. When executing a
critical state change, the X server writes its intent to the
segment, makes the change, then writes that it is done. If the
server dies when the segment does not indicate it's in the middle
of a state change, the daemon just restores. If it dies in the
middle of a state change, the daemon does something smart, possibly
requiring some kind of user input(?).
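
A minimal sketch of that intent/done handshake, in Python rather than
the C a real X server would use. Everything here is an assumption made
for illustration: the bytearray stands in for the SYSVSHM segment, the
`attached` argument stands in for the attach count the daemon would
read, and the record layout, function names, and mode strings are
hypothetical.

```python
import struct

# Hypothetical record layout: a phase byte, the last committed state,
# and the state the server is currently trying to switch to.
IDLE, CHANGING = 0, 1
STATE_LEN = 16
RECORD = struct.Struct(f"<B{STATE_LEN}s{STATE_LEN}s")


def new_segment(initial_state: bytes) -> bytearray:
    """Create the stand-in segment with a known-good committed state."""
    seg = bytearray(RECORD.size)
    RECORD.pack_into(seg, 0, IDLE, initial_state, b"")
    return seg


def begin_change(seg: bytearray, pending: bytes) -> None:
    """Server side: record intent before touching the hardware."""
    _, committed, _ = RECORD.unpack_from(seg, 0)
    RECORD.pack_into(seg, 0, CHANGING, committed, pending)


def finish_change(seg: bytearray) -> None:
    """Server side: the change succeeded, so pending becomes committed."""
    _, _, pending = RECORD.unpack_from(seg, 0)
    RECORD.pack_into(seg, 0, IDLE, pending, b"")


def watchdog_action(seg: bytearray, attached: int):
    """Daemon side: decide what to do given the segment contents.

    `attached` models the number of processes still attached to the
    segment; zero means the server is gone.
    """
    if attached > 0:
        return ("wait", b"")
    phase, committed, pending = RECORD.unpack_from(seg, 0)
    if phase == IDLE:
        # Server died outside a state change: committed state is trustworthy.
        return ("restore", committed.rstrip(b"\0"))
    # Server died mid-change: the hardware may be in either state.
    return ("ask_user", pending.rstrip(b"\0"))


seg = new_segment(b"text-80x25")
begin_change(seg, b"svga-1024x768")
print(watchdog_action(seg, attached=0))
```

The window between `begin_change` and `finish_change` is the only case
the daemon cannot resolve by itself. A real shared segment would also
need the phase byte and the state written in a safe order, which this
sketch does not model.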

I'm not sure exactly. But this is userspace, and at worst it
reduces the problem from one that is easy to cause to a race
condition.

David.
