Re: 2.2.0-pre1/Token-Ring/IPX = bad

Gilbert Ramirez Jr. (gram@verdict.uthscsa.edu)
Fri, 1 Jan 1999 01:37:34 -0600 (CST)


As Alan Cox said:
>
> > Once I attempt an 'slist' with this kernel, the Linux box keeps sending NCP
> > packets to my nearest file server. I sent > 80K packets in 5 seconds.
>
> Interesting. The NCP packets come from the user space daemon, so it could
> be an NCP daemon originated problem, or maybe a driver problem. Im not sure
> IPX itself can trip this.
>
> Do you have a box with an IBM tr card you can try to eliminate IPX. Also can
> you strace the "slist" and see if strace shows it bersekely sending 80K packets

I will have to wait to go back into work to try ibmtr; I have the card but
I have to physically plug it into the network.

However, I did try the "strace slist 2> out" test, since I could do that
remotely.

With kernel 2.0.36, my strace output file was 490 lines long (mostly select()
and recv() calls). With 2.2.0-pre2, the slist completed instead of hanging
(perhaps because there's less network traffic at work at midnight), but the
strace output is 1946 lines long, again mostly select() and recv() calls.
These two tests were done back-to-back, so there should be no traffic
change on the network.

grep -c select output-2.0.36 = 152 select() calls
grep -c select output-2.2.0-pre2 = 880 select() calls

Something is amiss. 2.2.0-pre2 requires nearly 6 times as many select()
calls (880 vs. 152) to complete the same slist.
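
For anyone who wants to repeat the comparison on their own traces, here is a
minimal Python sketch of the same tally. The function names (count_syscalls,
call_ratio) are made up for illustration. Note that it counts lines that
*start* with a syscall name, whereas `grep -c select` counts any line that
contains the word, so the two can differ slightly on other traces.

```python
import re
from collections import Counter

# Matches the syscall name at the start of an strace line, e.g. "select(4, ...".
SYSCALL_RE = re.compile(r"^(\w+)\(")


def count_syscalls(lines):
    """Return a Counter mapping syscall name -> number of calls seen."""
    counts = Counter()
    for line in lines:
        m = SYSCALL_RE.match(line)
        if m:  # signal lines such as "--- SIGCHLD ... ---" are skipped
            counts[m.group(1)] += 1
    return counts


def call_ratio(new_count, old_count):
    """How many times more calls the newer kernel needed."""
    return new_count / old_count


if __name__ == "__main__":
    # The two counts reported above: 880 (2.2.0-pre2) vs. 152 (2.0.36).
    print(round(call_ratio(880, 152), 2))  # 5.79
```

To run it against a real trace, pass an open file, for example
count_syscalls(open("output-2.2.0-pre2")), and look at the "select" and
"recv" entries.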

To give you a flavor of the output files, here's the last 20 lines of each
file. Notice the much larger number in the left { x, y } statement in the
2.2.0 strace. Of what relevance is that number? The entire 2.2.0 strace
output has that number (990000).

`tail -20 output-2.0.36`
------------------------
select(4, [3], [], [], {1, 0}) = 1 (in [3], left {1, 0})
recv(3, "33I\31\1\0\0\0\233\260\250\305\0"..., 4070, 0) = 138
write(1, "UH-DATA1 "..., 75) = 75
send(3, "\"\"J\31\1\0\27\0\t7g\22\0\32\0\4"..., 18, 0) = 18
select(4, [3], [], [], {3, 0}) = 1 (in [3], left {3, 0})
recv(3, "33J\31\1\0\374\0", 8, 0x2) = 8
select(4, [3], [], [], {1, 0}) = 1 (in [3], left {1, 0})
recv(3, "33J\31\1\0\374\0", 4070, 0) = 8
send(3, "UUK\31\1\0\0", 7, 0) = 7
select(4, [3], [], [], {3, 0}) = 1 (in [3], left {3, 0})
recv(3, "33K\31\1\0\0\0", 8, 0x2) = 8
select(4, [3], [], [], {1, 0}) = 1 (in [3], left {1, 0})
recv(3, "33K\31\1\0\0\0", 4070, 0) = 8
close(3) = 0
close(4) = 0
kill(270, SIGTERM) = 0
--- SIGCHLD (Child exited) ---
wait4(-1, NULL, 0, NULL) = 270
munmap(0x400ca000, 4096) = 0
_exit(0) = ?

`tail -20 output-2.2.0-pre2`
----------------------------
send(3, "UUK\4\1\0\0", 7, 0) = 7
select(4, [3], [], [], {3, 0}) = 1 (in [3], left {2, 990000})
recv(3, "33J\4\1\0\374\0", 8, 0x2) = 8
select(4, [3], [], [], {1, 0}) = 1 (in [3], left {1, 0})
recv(3, "33J\4\1\0\374\0", 8, 0) = 8
select(4, [3], [], [], {3, 0}) = 1 (in [3], left {2, 990000})
recv(3, "33J\4\1\0\374\0", 8, 0x2) = 8
select(4, [3], [], [], {1, 0}) = 1 (in [3], left {1, 0})
recv(3, "33J\4\1\0\374\0", 8, 0) = 8
select(4, [3], [], [], {3, 0}) = 1 (in [3], left {3, 0})
recv(3, "33K\4\1\0\0\0", 8, 0x2) = 8
select(4, [3], [], [], {1, 0}) = 1 (in [3], left {1, 0})
recv(3, "33K\4\1\0\0\0", 4070, 0) = 8
close(3) = 0
close(4) = 0
kill(569, SIGTERM) = 0
--- SIGCHLD (Child exited) ---
wait4(-1, NULL, 0, NULL) = 569
munmap(0x400ca000, 4096) = 0
_exit(0) = ?
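
On the "left {x, y}" field in the traces above: as far as I understand it,
Linux select() writes the unslept remainder back into the timeout argument,
and strace prints that remainder as "left {seconds, microseconds}". Under
that assumption, a 3-second timeout that returns with left {2, 990000} means
select() waited about 10 ms before data arrived. A small Python sketch to
pull that figure out of each select() line (the names SELECT_RE and
elapsed_usec are hypothetical, and the regex only targets the line shape
shown in these traces):

```python
import re

# Captures the requested timeout {sec, usec} and the "left {sec, usec}" value
# from a line like:
#   select(4, [3], [], [], {3, 0}) = 1 (in [3], left {2, 990000})
SELECT_RE = re.compile(
    r"^select\(.*\{(\d+), (\d+)\}\)\s*=\s*\d+ \(.*left \{(\d+), (\d+)\}\)"
)


def elapsed_usec(line):
    """Microseconds spent inside select(), or None if the line doesn't match."""
    m = SELECT_RE.match(line)
    if m is None:
        return None
    t_sec, t_usec, l_sec, l_usec = (int(g) for g in m.groups())
    requested = t_sec * 1_000_000 + t_usec
    remaining = l_sec * 1_000_000 + l_usec
    return requested - remaining


if __name__ == "__main__":
    line = "select(4, [3], [], [], {3, 0}) = 1 (in [3], left {2, 990000})"
    print(elapsed_usec(line))  # 10000, i.e. 10 ms
```

This only shows how long each select() waited; it does not by itself say
whether the IPX layer, the driver, or the NCP code is at fault.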

I will attempt the ibmtr vs. oltr test as soon as possible.

Just to remind other readers of what's happening, my 2.2.0-pre2 machine
spews NCP (IPX) packets at an incredible rate (at least 3K packets/sec)
when using 'slist' with the Olicom TR driver. I don't know who is the
culprit yet. 2.0.36 behaves. Unfortunately, I had not tried any 2.1.x
kernels.

--gilbert

-- 
Gilbert Ramirez                Voice:  +1 210 358 4032
Technical Services             Fax:    +1 210 358 1122
University Health System       San Antonio, Texas, USA
