Re: process reaper

Jonathan A. Davis (jonathan@evergreen.cc.usm.edu)
Sat, 17 Feb 1996 03:24:50 -0600 (CST)


On Sat, 17 Feb 1996, Jon Lewis wrote:

> I applied the exit.c patch to my kernels...but it seemed to have little
> or no effect, and people still leave runaway processes several times a
> day. Finally, I felt inspired to write a script to do something about
> it. This has only been through minimal testing. I've watched it kill a
> runaway "less" a few times.
>

We have the same problem. In fact, we are being driven a little nuts by
the process hangs. With a five-machine (about to be six) central server
cluster, 6000+ accounts and 250+ simultaneous users, it's becoming more
than we can handle, even with automated scripts.

I'd love to work with a member in the core development group to track down
the problem, up to and including guest accounts on the systems and pulling
a node out of the cluster for experimentation.

The only catch is that the problem appears to be either *way back* on the
back burner or outright ignored. The only time I've even
seen it alluded to was the original exit.c patch. I've mentioned the
problem in several posts to BIG-LINUX and linux-kernel, but these have
never been acknowledged.

Does this mean that no one else is experiencing the problem? I'm certain
there are many installations that are larger than ours and should be
seeing this problem. If not, what (with our Slackware-based ELF & a.out
installations) have we done wrong? If a problem does actually exist,
isn't it worth addressing before calling a kernel release 2.0?

A little background:

We've been running Linux full-time on machines since .99PL4, including
servers carrying netnews, DNS, WWW, CSO/PH, mailhubs, routers, etc.,
and large labs dedicated to Linux workstations running everything
from IRC to Maple. Our latest project is connection of some outlying
locations using HDSL, v.35 cards and Linux. Thus, we are not exactly
newcomers to Linux. So forgive me if I tend to get a little antsy when
talk turns to alternative OSes to fix the "cluster problem". ;-)

Just some (no doubt sleep-induced) musings at 3:00am,

-Jonathan _ _
------------------------------------------------------------->>>>>>>>-(o)(o)---
Jonathan A. Davis | Academic Systems Analyst | Hattiesburg/Gulf Park/Stennis
Computing Center | Box 5171 | 39401-5171 | (601) 266-4103 | davis@cc.usm.edu
http://evergreen.cc.usm.edu/~davis | Linux: The choice of a GNU generation