Re: server about to crash

Peter T. Breuer (ptb@it.uc3m.es)
Wed, 8 Jul 1998 22:06:17 +0200 (MET DST)


"A month of sundays ago Rik van Riel wrote:"
>
> On Wed, 8 Jul 1998, Peter T. Breuer wrote:
> > "A month of sundays ago Alan Cox wrote:"
> > > Ok thats a network leak. Whats the Config.in for the box
> >
> > # Automatically generated by make menuconfig: don't edit
> > CONFIG_IP_ALWAYS_DEFRAG=y
>
> I think I remember some discussions/bugreports about this
> particular function leaking in 2.0.33.

Alan Cox flagged "2.0.33 + CONFIG_BRIDGE".

I daren't say he's not right! But I have been running the identical
kernel (well, the one compiled with 2.8.0) in the machine next to the
server with no problems. I copied the server onto it lock, stock and barrel
on Monday. The only differences are that it has an AHA-2940 Ultra narrow
card where the server has a BusLogic, and a 3c900 where the server has a
3c509b. But they are running the same kernels.

Memory usage is minimal on the sister.

(sister) guitarra:/usr/oboe/ptb% uptime
9:56pm up 1 day, 6:07h, 3 users, load average: 0.07, 0.02, 0.00

(sister) guitarra:/usr/oboe/ptb% free
             total       used       free     shared    buffers     cached
Mem:         63060      61844       1216      15736      25808      21800
-/+ buffers:            14236      48824
Swap:       144576        152     144424
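
As a cross-check on the figures above: the "-/+ buffers" row follows from the Mem row by treating buffer and cache memory as reclaimable. A minimal Python sketch of that arithmetic, using only the numbers printed by free (the function name is made up for illustration):

```python
def adjust_for_buffers(used, free, buffers, cached):
    """Return (used, free) with buffer and cache memory counted as free."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Figures in KB, copied from the sister machine's `free` output.
app_used, app_free = adjust_for_buffers(
    used=61844, free=1216, buffers=25808, cached=21800
)
print(app_used, app_free)  # 14236 48824
```

So only about 14M of the sister's 61M "used" is memory held by processes and the kernel; the rest is buffers and cache.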

The server, by contrast, was using 30-36M of memory after 12 hours of
uptime. The server's network buffer use was about 20K and increasing by
2K or so an hour. The sister is using 32 (!!) buffers:

(sister)
Networking buffers in use : 32
Network buffers locked by drivers : 0
Total network buffer allocations : 3845134
Total failed network buffer allocs : 0
Total free while locked events : 1
IP fragment buffer size : 0
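
To track whether the buffer count keeps climbing, a report like the one above can be parsed and two samples compared. This is only a rough Python sketch: the function names are hypothetical, and it assumes the report has already been captured as text in the "label : number" layout shown above.

```python
import re

# The sister's report, copied from above.
SISTER_REPORT = """\
Networking buffers in use : 32
Network buffers locked by drivers : 0
Total network buffer allocations : 3845134
Total failed network buffer allocs : 0
Total free while locked events : 1
IP fragment buffer size : 0
"""


def parse_net_buffer_stats(text):
    """Parse 'label : number' lines into a dict keyed by label."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*:\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def growth_per_hour(count_then, count_now, hours_elapsed):
    """Average change in buffer count per hour between two samples."""
    return (count_now - count_then) / hours_elapsed


stats = parse_net_buffer_stats(SISTER_REPORT)
print(stats["Networking buffers in use"])  # 32
```

A count that stays flat between samples, as on the sister, suggests no leak; a steady positive growth_per_hour, as on the server, is the suspicious pattern.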

> Time to upgrade to .34 or to disable the always-defrag
> option.
>
> Of course, I'm no expert at either the networking code
> or the 2.0 kernels, but you might want to try this...

Definitely! I didn't like the look of that either. My recollection was
that I had not enabled that. It's a masq/router option, is it not?

In other words, I'm not convinced that CONFIG_BRIDGE + 2.0.33 is the bug.
I just tried to run the kernel patched for memory leak detection, but
it booted and wouldn't run X, and then threw up FS corruption errors
that took me some time to fix when I rebooted: duplicate blocks
on the main 1GB user file system. It looks like files were open at the
time of the reboot (I shut down properly, by timing Ctrl-Alt-Del right
with respect to xdm flashing X up and down ...).

> Rik.
> +-------------------------------------------------------------------+
> | Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
> | Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
> +-------------------------------------------------------------------+

Peter
