bug in kernel...sockets ?

Larice Robert (Robert.Larice@munich.netsurf.de)
Fri, 13 Oct 1995 14:44:42 -0100


Hello,

i have done:
rsh knownhost date
and once more:
rsh knownhost date

the second rsh hangs for about 30 seconds, but then
returns successfully.

if you do:
rsh knownhost date
then wait let's say about ...
and then do again:
rsh knownhost date

there is no hang.

------

then i tried
strace -tTo logfile rsh knownhost date
and saw that it is the
connect()
system call that hangs
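
To put a number on the hang without reading strace output, the connect() call can be timed directly. The following is only a minimal sketch: the helper name timed_connect is made up for illustration, and unlike rsh it does not bind to a reserved local port (that needs root), so it may well not reproduce the hang described here.

```python
import socket
import time


def timed_connect(host, port, timeout=60.0):
    """Return (seconds_elapsed, error_or_None) for one TCP connect() attempt."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)          # give up after `timeout` seconds
    start = time.monotonic()
    try:
        sock.connect((host, port))    # the call that strace showed hanging
        error = None
    except OSError as exc:            # refused, timed out, unknown host, ...
        error = exc
    finally:
        elapsed = time.monotonic() - start
        sock.close()
    return elapsed, error
```

Calling timed_connect("knownhost", 514) twice in a row (514 is the remote port shown in the netstat output in this post) and comparing the two elapsed times would mirror the two rsh runs.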

------

then i did a
netstat -ntc | grep -v 'letsmakeitshorter...'
on one console, and on another i did:
rsh knownhost sleep 14

this is what i get while the sleep, and thus the rsh, is running:

Active Internet connections
Proto Recv-Q Send-Q Local Address        Foreign Address      (State)     User
tcp        0      0 132.149.117.28:1023  132.149.117.22:514   ESTABLISHED root
tcp        0      0 132.149.117.28:1022  132.149.117.22:1023  ESTABLISHED root

then the rsh finishes, but the connection does not go away:

Proto Recv-Q Send-Q Local Address        Foreign Address      (State)     User
tcp        0      0 132.149.117.28:1022  132.149.117.22:1023  FIN_WAIT2   root

on the remote system, a second netstat -tcn shows:

Active Internet connections
Proto Recv-Q Send-Q Local Address        Foreign Address      (State)     User
tcp        2      0 132.149.117.22:1023  132.149.117.28:1021  CLOSE_WAIT  root
tcp        2      0 132.149.117.22:514   132.149.117.28:1022  TIME_WAIT   root

some time passes, and then i get:

Active Internet connections
Proto Recv-Q Send-Q Local Address        Foreign Address      (State)     User
tcp        1      0 132.149.117.28:1022  132.149.117.22:1023  TIME_WAIT   root

on the remote system, the second netstat -tcn now shows nothing.

again some time passes, then my own remaining connection disappears too.

the sum of these times is just about the time i have to wait until a second
rsh runs without hanging.
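
The state sequence above can also be pulled out of the netstat output mechanically. This is a rough sketch that assumes the seven-column line layout quoted in this post (other netstat versions print different columns); the names Conn, parse_netstat_line and state_changes are made up for illustration.

```python
from collections import namedtuple

Conn = namedtuple("Conn", "proto recv_q send_q local remote state user")


def parse_netstat_line(line):
    """Parse one tcp data line in the layout quoted above; return None otherwise."""
    fields = line.split()
    # Header and title lines do not have exactly seven fields starting with "tcp".
    if len(fields) != 7 or fields[0] != "tcp":
        return None
    proto, recv_q, send_q, local, remote, state, user = fields
    return Conn(proto, int(recv_q), int(send_q), local, remote, state, user)


def state_changes(lines):
    """Yield (local, remote, state) whenever a connection's state changes."""
    last = {}
    for line in lines:
        conn = parse_netstat_line(line)
        if conn is None:
            continue
        key = (conn.local, conn.remote)
        if last.get(key) != conn.state:
            last[key] = conn.state
            yield conn.local, conn.remote, conn.state
```

Fed with the lines that netstat -ntc prints, and with a timestamp taken at each yielded change, this would give the duration of the FIN_WAIT2 and TIME_WAIT phases to compare against the length of the hang.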

--------------------------------------------------------------------------------

now details:
the host runs a 1.2.12 kernel
(i think i have seen 1.2.13 with the same error)
the knownhost runs either a 1.1.54 kernel or an sco kernel

1.1.54 or sco work well with each other, and even with 1.2.12

i have tried the old rsh and the old, working libc from my 1.1.54 system;
they produce the same error if they are run on the 1.2.12 system

i have done netstat -ntc on the 1.1.54 system; there the 2 connections
disappear immediately when the rsh returns

------------------------------------

i don't observe any trouble with:
telnet,ping,nfs,samba,X

-------------------------------------

thus my questions:

did i do something wrong ?
is there an error in the networking part of the 1.2.12 kernel ?
( for example, not closing network connections on a program exit())
is there a patch ?
to whom should i have sent this report ?
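
As background to the question about connections not being closed: in TCP, a side that has closed stays in FIN_WAIT2 for as long as the peer has received the FIN but not yet closed its own end, and that peer is in CLOSE_WAIT meanwhile. The sketch below shows this on a loopback connection. It is not a diagnosis of the 1.2.12 behaviour, and the function name close_one_side is made up for illustration.

```python
import socket


def close_one_side():
    """Close the client end of a loopback connection; return what the server end reads."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server_end, _ = listener.accept()

    client.close()                      # client sends FIN and heads for FIN_WAIT2
    server_end.settimeout(5.0)
    data = server_end.recv(1)           # b"" is end-of-file: the FIN has arrived
    # Until the next line runs, the server side is in CLOSE_WAIT
    # and the client side remains in FIN_WAIT2.
    server_end.close()
    listener.close()
    return data
```

The function returns b"", the end-of-file indication. Delaying the server_end.close() call would leave both states visible in netstat -nt for that long.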

--------------------------------------------------------------------------------
E-mail: Robert.Larice@munich.netsurf.de mail: Robert Larice
Phone: +49-8104-9499 85646 Anzing
Fax: +49-8104-1879 Tannenweg 2
Germany