Pre-2.0.31

Doug Ledford (dledford@dialnet.net)
Sat, 17 May 1997 23:25:21 -0500 (CDT)


Ummm... this is kind of scary, guys. It needs to be tracked down. I have
a shell server running pre-2.0.31. So far, I think I can report 3
problems left unfixed. I'll do the two not-so-scary ones first. One,
there is still a socket close bug, but not the same as what other people
reported. It only happens for me between two pre-2.0.31 machines that are
linked through a somewhat delayed Frame Relay link. On ethernet between
two pre-2.0.31 machines, no problem. Over the same Frame Relay link
(different DLCI) to a 1.2.13-LMP machine, no problem. Note that the socket
stuck in a closing state is on the machine that was listening and did an
accept on the socket, not on the machine originating the socket.
Furthermore, it always loops in a CLOSE state with a timer of 10 seconds,
repeating.

Second problem: there appears to be a slight problem
with signal handling. At first, I thought it was kerneld that was the
problem since it would leave zombie request-route scripts lying around
and ifconfigs on devices that kerneld would need to install would hang.
However, I also found that while running a batch ifconfig on 2 dummy
interfaces with 62 aliases each, occasionally, one of the ifconfig
statements would hang. Doing an ifconfig on the same interface that the
batch run had hung on would result in seeing the properly configured
device and also result in the hung process unhanging.

Third problem: I
just got paged by the BB program to tell me that init was down on my shell
server. So, I ssh'ed in, looked at a ps ax, and sure enough, init was
missing. I had just gotten back from going to some local races, so I
chatted with another admin. He told me this happened once earlier tonight
(init was missing from a ps, manually verified) and then it came back on
its own. I thought this was strange, so I went in and took a look in the
/proc filesystem. An ls showed no directory "1". However, I could cd to
"1" and do an ls and get all of the info. After doing this, I went back
out of /proc and did another ps ax. Init still wasn't there. So, I
figured it must be running, but the directory entry for init in /proc had
been lost. I decided to try waking init to see what would happen, so I
killed the mingetty processes running on the vts. This, of course, sent
signals to init about its children. Once I did this, the directory /proc/1
returned and init once again showed up in a ps ax. This one has got me
wondering what in the world could be wrong. This machine had been
upgraded directly from pre-2.0.30-2 to pre-2.0.31. I can say that I never
had the problems with init getting lost, with ifconfig hanging, or kerneld
being broken under pre-2.0.30-2. So, I wonder what it could be and if
these things are related (at least the ifconfig/init problems, probably
not the CLOSE problem)?
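
For anyone who wants to watch for the init symptom, here is a rough sketch of the check I did by hand: compare what a directory listing of /proc returns against what can be reached by name. It is only an illustration in Python, not something I ran on the affected machine. The function name and its parameters are made up for this example, and the listdir argument is there only so the listing step can be swapped out.

```python
import os


def pids_missing_from_listing(proc_root="/proc", pids=(1,), listdir=os.listdir):
    """Return the PIDs whose directory is reachable by name under proc_root
    but does not show up in a listing of proc_root.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # What "ls /proc" would show.
    listed = set(listdir(proc_root))
    missing = []
    for pid in pids:
        name = str(pid)
        # What "cd /proc/<pid>" would find, independent of the listing.
        reachable = os.path.isdir(os.path.join(proc_root, name))
        if reachable and name not in listed:
            missing.append(pid)
    return missing
```

On a healthy system, calling pids_missing_from_listing() returns an empty list. It would return [1] only in the state described above, where /proc/1 can be entered but is absent from the listing.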

*****************************************************************************
* Doug Ledford * Unix, Novell, Dos, Windows 3.x, *
* dledford@dialnet.net 873-DIAL * WfW, Windows 95 & NT Technician *
* PPP access $14.95/month *****************************************
* Springfield, MO and surrounding * Usenet news, e-mail and shell account.*
* communities. Sign-up online at * Web page creation and hosting, other *
* 873-9000 V.34 * services available, call for info. *
*****************************************************************************