Swap problems (was 3 errors/suggestions (pre-2.0.x))

Bernhard Heidegger (hdg@edvz.tu-graz.ac.at)
Mon, 3 Jun 1996 19:19:10 +0200 (MET DST)


Hi!

Since I didn't get my original mails back from the mailing list, I
attach them at the end.

The machine runs more or less 5 types of (server) programs:
3 of them are started once and don't fork
2 of them are started and fork themselves (very often)
(The 2 forking processes communicate with the "outside world" and with
the 3 non-forking processes over TCP connections.)
From time to time the kernel writes the following messages:
kernel: Hmm.. Trying to use unallocated swap (........)
kernel: swap_duplicate: trying to duplicate unused page
kernel: swap_free: swap-space map bad (entry ........)
Then one of the 2 forking processes (completely different programs)
stops working; the fork() call works, but the child gets a segv
in its first new() call. The parent process (and the "old" children)
have no problem (the parent process only does accept() and fork()).
If I kill the parent process (a new program of that type is then
restarted), all seems ok; the swapping errors go away after a short
time (say 1 minute); maybe one of the "old" children causes the delayed
swap errors (each child exits after a timeout).
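
To make the accept()/fork() pattern described above concrete, here is a minimal sketch in C. It is not the actual server code (the real programs are C++): the helper names make_listener() and serve_one() are hypothetical, and malloc() only stands in for the first new() call in the child.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1; port 0 lets the kernel
 * pick a free port. Returns the listening fd, or -1 on error. */
static int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: accept one connection and fork a child for it.
 * The child makes its first allocation (the step that segfaults in the
 * failing case), answers "ok" and exits. Returns the child's pid to
 * the parent, or -1 on error. */
static pid_t serve_one(int listen_fd)
{
    int conn = accept(listen_fd, NULL, NULL);
    pid_t pid;

    if (conn < 0)
        return -1;
    pid = fork();
    if (pid == 0) {
        char *buf = malloc(4096);   /* stands in for the first new() */
        if (buf == NULL)
            _exit(1);
        strcpy(buf, "ok\n");
        if (write(conn, buf, 3) != 3)
            _exit(1);
        free(buf);
        close(conn);
        _exit(0);
    }
    close(conn);    /* parent keeps only the listening socket */
    return pid;     /* -1 here if fork() itself failed */
}
```

A parent would call serve_one() in a loop and reap the children with waitpid(); a child killed by SIGSEGV then shows up via WIFSIGNALED() in the wait status.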

old mails:

Asus P55TP4N with 512kB sync cache
128 MB RAM
Miro 12SD (only console mode)
Adaptec AHA2940UW
2x Quantum XP34300W (4GB FastWide)
NEC CD-ROM DRIVE:222
DAT HP HP35480A
3Com 3C509 (active)
3Com 3C590 (not active)

slackware 3.0 based ELF installation :-(but the server programs are a.out)
kernel version 1.99.9
libc 5.2.18 (a.out 4.7.2)
server programs are C++ code compiled with gcc/g++ 2.6.3 and linked
against libc-4.6.27

1.) I got many (over 5000 in 3 days) of the following messages with
1.99.9 (also with 1.3.97); the messages below are sorted and uniq'ed
kernel: swap_duplicate: trying to duplicate unused page
kernel: swap_free: swap-space map bad (entry 00033200)
[18 lines with different addresses deleted]

The machine is a type of WWW server (with database); sometimes there
are up to about 600 processes and a cat /proc/sys/kernel/*-nr gives
2432 (file-nr)
5840 4962 (inode-nr)
(Yes, I changed NR_INODE to 8192, NR_FILE to 4096 and NR_TASKS to 2048)
The problem is that I cannot reproduce the swap errors; I wrote a little
C program which allocates a lot of memory (200 MB), writes to that memory,
reads it back and frees it -> no swap errors :-(
It seems that one of the server processes stops working when the swapping
errors begin (this process does a fork() for each request; the fork
seems to work, but then the child gets a segv).
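
For reference, a test of the kind described above could look roughly like the following. This is only a sketch, not the original program: the function name touch_memory() and the byte pattern are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `bytes`, write a pattern to every byte,
 * read it back and free the block. Returns the number of mismatching
 * bytes, or -1 if the allocation failed. */
static long touch_memory(size_t bytes)
{
    unsigned char *p = malloc(bytes);
    size_t i;
    long bad = 0;

    if (p == NULL)
        return -1;
    for (i = 0; i < bytes; i++)         /* write pass: dirty every page */
        p[i] = (unsigned char)(i * 31u + 7u);
    for (i = 0; i < bytes; i++)         /* read pass: verify the pattern */
        if (p[i] != (unsigned char)(i * 31u + 7u))
            bad++;
    free(p);
    return bad;
}
```

Calling touch_memory((size_t)200 * 1024 * 1024) on a 128 MB machine should force swapping. Note that this sketch never calls fork(), unlike the failing servers, so it exercises only part of what they do.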

2.) Is it possible to raise the fd limit per process before 2.0? IMHO 256
fds per process is too few today.

3.) sys_socket calls get_fd; IMHO, if that fails (because of ENFILE or EMFILE)
the return value should be E[NM]FILE, not EINVAL.
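
The behaviour in 3.) can be checked from user space with something like the sketch below. It lowers the soft fd limit via setrlimit(), calls socket() until it fails and reports the errno. The helper name socket_errno_at_limit() and the MAX_TRY cap are hypothetical, and AF_UNIX is used only to avoid depending on a network setup. On a kernel that passes the error through, one would expect EMFILE here rather than EINVAL.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_TRY 64  /* hypothetical cap; `limit` is assumed to be below it */

/* Hypothetical helper: lower the soft fd limit to `limit`, call socket()
 * until it fails, and return the errno of that failure (0 if socket()
 * never failed within MAX_TRY attempts). The old limit is restored. */
static int socket_errno_at_limit(rlim_t limit)
{
    struct rlimit old, low;
    int fds[MAX_TRY];
    int n = 0, err = 0;

    if (getrlimit(RLIMIT_NOFILE, &old) < 0)
        return errno;
    low = old;
    low.rlim_cur = limit;
    if (setrlimit(RLIMIT_NOFILE, &low) < 0)
        return errno;

    while (n < MAX_TRY) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) {
            err = errno;    /* the value in question: E[NM]FILE or EINVAL */
            break;
        }
        fds[n++] = fd;
    }

    while (n > 0)           /* clean up before restoring the limit */
        close(fds[--n]);
    setrlimit(RLIMIT_NOFILE, &old);
    return err;
}
```

For example, socket_errno_at_limit(16) returns the errno seen by the first socket() call that could not get a descriptor.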

Let me know if I can help track down the problem.
(I will try pre2.0.12 today)

Thanks in advance

Bernhard.

---
+----------------------------+-------------------------------+
|   hdg@edvz.tu-graz.ac.at   |   bheide@iicm.tu-graz.ac.at   |
+----------------------------+-------------------------------+
| Bernhard Heidegger, Graz University of Technology, Austria |
+------------------------------------------------------------+
Worst day playing is better than best day working!