i386 - alpha LAN problems..

From: jd (fallen@ciaccess.com)
Date: Wed Jan 12 2000 - 22:32:39 EST


OK, first of all, I want to apologize for the length of this message,
and if it's somewhat off-topic on the axp-hardware list, I'm sorry
about that too.

I have two Linux puters: my primary i386 box running Slackware 7.0
(cokaygne), and a little Alpha running Red Hat 6.0 (bent). (The Alpha
is a VX40B-F2, by the way.)

cokaygne has an el-cheapo ISAPNP ne2000 card, which after much frustration
I have _finally_ got working properly. ne2k-diag.c recognizes it, and the
kernel ne module recognizes it. All is well after many headaches on
that front.

bent has a built-in ethernet interface that does both 10base2 and 10baseT;
obviously I'm using the 10base2 to match the cheesy ne2k card on cokaygne.
This interface too is fully recognized as the eth0 device on bent, or so
it seems.

Anyhow, for the time being I am not trying to do anything fancy at all.
No DNS, no gateways, etc. I just want a bare-bones, ultra-minimalistic
connection for now; I'm not going to worry about what else I might want
to do with this tiny LAN until I can manage to send one single
ping from one machine to the other. And so far, at that I have failed.

I am trying to use 192.168.1.0 as my little local net. cokaygne is to
be 192.168.1.1, bent is to be 192.168.1.2.

OK, so here's the lowdown (edited for clarity):

 cokaygne:~# route
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Iface
 192.168.1.0     *               255.255.255.0   U     0      0   eth0
 loopback        *               255.0.0.0       U     0      0   lo

I won't reproduce bent's because it is exactly the same. The command I am
running at boot on cokaygne is:
 
 ifconfig eth0 192.168.1.1 netmask 255.255.255.0

The alpha does pretty much the same, as 192.168.1.2.

I think this should be sufficient, but this is what I never fail to get:

 PING 192.168.1.2 (192.168.1.2): 56 data bytes
 112 bytes from cokaygne.milk.org (192.168.1.1): Destination Host Unreachable
 Vr HL TOS  Len   ID Flg  off TTL Pro  cks         Src         Dst Data
  4  5  00 5400 0602   0 0000  40  01 4ff5 192.168.1.1 192.168.1.2
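
The hex line under "Vr HL TOS ..." is the IP header of the packet the error
refers to. A rough sketch of pulling a few fields out of it, assuming plain
POSIX sh and assuming the TTL and Pro columns are hex, as they appear above.
decode_ip_header is a made-up helper name for illustration, not part of ping:

```shell
#!/bin/sh
# Hypothetical helper: decode a few fields of the header line that ping
# prints under "Vr HL TOS Len ID Flg off TTL Pro cks Src Dst".
decode_ip_header() {
    # Split the single argument into positional parameters on whitespace.
    set -- $1
    ttl_hex=$8
    proto_hex=$9
    src=${11}
    dst=${12}
    # Map the protocol number (hex) to a name for the common cases.
    case "$proto_hex" in
        01) proto=ICMP ;;
        06) proto=TCP ;;
        11) proto=UDP ;;
        *)  proto=unknown ;;
    esac
    printf 'ttl=%d proto=%s src=%s dst=%s\n' \
        "$((0x$ttl_hex))" "$proto" "$src" "$dst"
}

decode_ip_header "4 5 00 5400 0602 0 0000 40 01 4ff5 192.168.1.1 192.168.1.2"
```

For the line above this prints ttl=64 proto=ICMP src=192.168.1.1
dst=192.168.1.2, i.e. the quoted packet is the echo request from cokaygne
to bent.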

I have tried playing around with many different combinations of ifconfig
and route commands, but no luck. I often do "ifconfig eth0 down" and start
over again, trying something a little different every time, but no luck.
Here's some of what I have tried, much of it probably quite wrong (I
continue to use cokaygne as the example, but I always do the equivalent
on bent):

 route add -host 192.168.1.1 eth0

which results in, along with the other route entries:

 Destination     Gateway         Genmask         Flags Metric Ref Iface
 192.168.1.1     *               255.255.255.255 UH    0      0   eth0

But this fails as well. Just for the record, is there any point in making
an individual host entry like this? I was under the impression that the
single ifconfig line up above would be sufficient: cokaygne knows that it
is 192.168.1.1, and eth0 knows that it is on the network 192.168.1.0. That
should be enough for it to get packets out to bent (192.168.1.2), assuming
it too is configured correctly. Am I wrong? Do I really need any -host
entries for such a simple setup? Either way, making one didn't help.
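
To make the "same network" reasoning concrete, here is a minimal sketch,
assuming plain POSIX sh. It ANDs each octet of an address with the netmask
to get the network address, then compares the two hosts. netaddr and
same_net are made-up helper names for illustration:

```shell
#!/bin/sh
# netaddr IP MASK: print the network address (bitwise AND, octet by octet).
netaddr() {
    old_ifs=$IFS
    IFS=.
    # With IFS set to ".", the two dotted quads split into eight fields.
    set -- $1 $2
    IFS=$old_ifs
    printf '%d.%d.%d.%d\n' \
        "$(($1 & $5))" "$(($2 & $6))" "$(($3 & $7))" "$(($4 & $8))"
}

# same_net IP1 IP2 MASK: succeed if both addresses are on the same network.
same_net() {
    [ "$(netaddr "$1" "$3")" = "$(netaddr "$2" "$3")" ]
}

netaddr 192.168.1.1 255.255.255.0
if same_net 192.168.1.1 192.168.1.2 255.255.255.0; then
    echo "cokaygne and bent are on the same network"
fi
```

With the addresses from this message, both hosts come out on 192.168.1.0,
which is the network the existing eth0 route entry already covers.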

I've tried all sorts of stuff. I've tried adding a -host for 192.168.1.2
(bent) on cokaygne (stupid, I know). I've tried doing a route -net for
192.168.1.0 as a few documents suggest, but this seems to be redundant:
ifconfig seems to have already taken care of this when given the netmask
for 192.168.1.1; using route after it just creates a duplicate entry in the
table for 192.168.1.0.

I've tried specifying the broadcast address (192.168.1.255). I've tried
all this and more: everything I can think of, everything that the NET-3
HOWTO and the Linux Network Administrator's Guide seem to suggest for a
very basic two-box setup like my own. But I continue to get 'host
unreachable' (ping), 'no route to host' (telnet), et cetera.

This has become incredibly frustrating. AFAIK, this SHOULD be working,
even with just the simple ifconfig eth0 192.168.1.x netmask 255.255.255.0
in place on both ends. But it's not.

A few people have already told me that they do not understand why what
I am doing would NOT be working, that all is kosher in theory...
Unfortunately it is not in practice.

Let me explain a few more things to help narrow this down.

Though I do indeed have a spotty super-el-cheapo ethernet card on
the i386, I really don't think this is the source of the problem. On
either computer, the interrupt count in /proc/interrupts goes up while
that host is pinging; however, there is no corresponding activity on the
host being pinged.

I have reason to believe that the problem is with the alpha.

The alpha has both 10base2 and 10baseT on the same ethernet hardware.
Is it possible that it is trying to send out via the 10baseT? I have
tried using ifconfig like this:

 [bent]# ifconfig eth0 192.168.1.2 netmask 255.255.255.0 media 10base2

but I get this response:

 SIOCSIFMAP: Operation not supported

I got the alpha on the cheap without docs, without a CD-ROM, and with
a very minimal RH installation without sources or anything. So
unfortunately I can't rebuild the system now to be more sure of its
integrity and make sure the network card is properly configured.
I'm not even certain what the ethernet card on the alpha is (tulip?),
and I don't know how else I might ensure that everything is working with it.
         
There is one difference, maybe significant, maybe meaningless, that I have
noted between the two machines while I try to ping. The output of
"ifconfig eth0" differs as follows:

 cokaygne (i386) sez:

 eth0 Link encap:Ethernet HWaddr 00:00:B4:3C:2C:1E
          inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:189 (<<< !!)
          TX packets:309 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0x240

 bent (alpha) sez:

 eth0 Link encap:Ethernet HWaddr 08:00:2B:E4:3C:81
          inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:51 dropped:0 overruns:0 carrier:102
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0x8800

So, while the output of ping -v is the same ("unreachable") on both,
according to cokaygne it has indeed sent out the packets without errors.
bent says it has sent no packets, and had nothing but errors.
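
A small sketch for pulling individual counters out of output like the
above, so the two machines can be compared field by field. It assumes plain
POSIX sh plus awk, and assumes the "RX ..." / "TX ..." row layout exactly as
shown in this message. if_counter is a made-up helper name:

```shell
#!/bin/sh
# if_counter FIELD DIRECTION: read ifconfig-style text on stdin and print
# the value of FIELD (packets, errors, frame, carrier, ...) from the row
# that starts with DIRECTION (RX or TX).
if_counter() {
    awk -v field="$1" -v dir="$2" '
        $1 == dir {
            for (i = 2; i <= NF; i++) {
                n = split($i, kv, ":")
                if (n >= 2 && kv[1] == field) { print kv[2]; exit }
            }
        }'
}

# The two counter rows from bent, copied from the output above.
bent_stats='          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:51 dropped:0 overruns:0 carrier:102'

printf '%s\n' "$bent_stats" | if_counter errors TX
printf '%s\n' "$bent_stats" | if_counter carrier TX
```

On a live box the same function could be fed from "ifconfig eth0", provided
the output still has the rows in this format.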

And actually, I just noticed the "frame:189" bit in cokaygne's
RX row myself, and learned something which is pretty interesting: when I
ping 192.168.1.1 from bent, this frame value (and it alone) DOES GO UP on
cokaygne. So for the first time I have some proof that something IS going
on on the wire between the two computers. I guess that maybe shoots
down the alpha-is-using-10baseT theory right there.

Finally, another strange thing which might have something to do with
all this is that when I do a "ping localhost" on the alpha, I get this
strange output:

 PING localhost (127.0.0.1): 56 data bytes
 64 bytes from localhost (127.0.0.1): Echo Request

 64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=17.0 ms
 wrong data byte #8 should be 0x8 but was 0x7b
         c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b
         2c 2d 2e 2f 0 0 0 0 0 0 0 0 0 0 0 0

This continues. As the icmp_seq goes up, the only thing changing is the
wrong data byte message (data byte #8 999 times out of 1000, but
occasionally a data byte #9 in there):

 wrong data byte #8 should be 0x8 but was 0xf
 wrong data byte #8 should be 0x8 but was 0xdc
 wrong data byte #9 should be 0x9 but was 0x84
 wrong data byte #8 should be 0x8 but was 0xd6

 ...etc etc etc
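
Going by the messages above ("byte #8 should be 0x8", "byte #9 should be
0x9"), ping seems to expect data byte #n to hold the value n. Here is a
minimal sketch of that check, assuming plain POSIX sh and assuming the
pattern holds for every byte from #8 onward. check_payload is a made-up
name, not ping's own code:

```shell
#!/bin/sh
# check_payload HEXBYTE...: the arguments are the echoed data bytes, in hex,
# starting at offset 8. Byte #n is expected to equal n (low 8 bits).
# Reports the first mismatch in the same wording ping uses.
check_payload() {
    n=8
    for b in "$@"; do
        if [ "$((0x$b))" -ne "$((n & 255))" ]; then
            printf 'wrong data byte #%d should be 0x%x but was 0x%x\n' \
                "$n" "$((n & 255))" "$((0x$b))"
            return 1
        fi
        n=$((n + 1))
    done
    echo "payload ok"
}

check_payload 8 9 a b c d e f 10 11
check_payload 7b 9 a b || true
```

The first call prints "payload ok"; the second reproduces the first error
line from the alpha, "wrong data byte #8 should be 0x8 but was 0x7b".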

Again, like some of the other weirdness, it may just be some alpha
quirk that is not too important. Or perhaps it is sending the same
sort of wrong data bytes over eth0, and cokaygne is scoffing at them. I
really don't know; I'm far from a guru. A lot of this is Greek to me.

Now, having gathered just about all the information I possibly can,
I am completely stuck. I have no idea what the problem is, or what I
can try doing to fix it. If anyone can draw any inferences from the
above information, I would be more than happy to hear about them.
If/when I get these machines pinging each other I'll be literally
jumping for joy.

If anyone is willing to help, I'd gladly post any other information (e.g.
output from specific programs, files, etc.) that might help to
narrow it down. I am willing to try just about anything to make these
computers be friendly with each other.

A big thank you to anyone who actually read through this long-winded
beast of a message. ;)

jd

mesmer@tao.ca
fallen@ciaccess.com