EB164/S3/OPPO folklore

Ian Pratt (Ian.Pratt@cl.cam.ac.uk)
Thu, 12 Sep 1996 17:12:20 +0100


I've been trying to get a Digital ATMWorks350 "OPPO" ATM card
to work in an EB164. I discovered a number of problems which
other people might wish to be aware of, even those without
OPPOs...

The OPPO is unusual in that it contains a 21050 PCI-PCI bridge
chip, which the `real' OPPO PCI device lurks behind. This is
done for PCI electrical compliance, something that less
scrupulous card manufacturers don't worry about...

Anyway, when the system is booted, a number of bits of software
have a go at discovering the PCI bus topology and configuring
the various devices. Unfortunately, some of this software
appears not to have been tested against PCI-PCI bridges.

Putting an OPPO in an EB164 with ebsdk2.0 firmware hangs
the system immediately on boot. Fortunately I don't need
the OPPO to boot, so I just built some firmware that
doesn't attempt to recurse down through multiple PCI buses.
This enabled the system to boot. This bug may be fixed in the
ebsdk 2.1 firmware, but I have yet to receive my CDROM.
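
As a rough illustration of that workaround (not the actual ebsdk
code), here is a toy bus scan with a switch that stops it from
descending behind bridges. The header-type value 0x01 for a
PCI-PCI bridge is standard; the topology table, the slot numbers
in it and the name scan_bus are hypothetical.

```python
# Toy model of PCI enumeration; the data layout is made up for illustration.
HEADER_TYPE_BRIDGE = 0x01  # header type of a PCI-PCI bridge


def scan_bus(topology, bus=0, descend=True, found=None):
    """List devices on `bus`; follow bridges only if `descend` is set."""
    found = [] if found is None else found
    for (b, dev), hdr in sorted(topology.items()):
        if b != bus:
            continue
        found.append((b, dev, hdr["name"]))
        # Bit 7 of the header type is the multi-function flag, so mask it.
        if descend and (hdr["header_type"] & 0x7F) == HEADER_TYPE_BRIDGE:
            scan_bus(topology, hdr["secondary_bus"], descend, found)
    return found


# Hypothetical layout: slot numbers here are assumptions, not measurements.
topology = {
    (0, 5): {"name": "S3 VGA", "header_type": 0x00},
    (0, 6): {"name": "21050 bridge", "header_type": 0x01, "secondary_bus": 1},
    (1, 0): {"name": "OPPO", "header_type": 0x00},
}

print(scan_bus(topology, descend=False))  # bus 0 only, bridge not followed
print(scan_bus(topology))                 # full scan, OPPO found on bus 1
```

With descend=False the bridge itself is still listed, but nothing
behind it is touched, which is all the boot firmware needs here.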

MILO-2.0.12 handles the bridge a bit better. It doesn't crash,
and correctly finds the OPPO (device unknown) on bus 1.
When scanning devices on bus 0, however, it appears to mishandle
the PCI Master Aborts on unoccupied slots, so it thinks that
every unoccupied slot is occupied by an unknown device (80:8000).
Fortunately, the configuration reads it gets back from these
bogus cards are so screwed that it doesn't attempt to allocate
the 10GB of memory space that each `card' requires.

Linux-2.0.18 behaves likewise. Earlier versions of Linux (e.g.
1.3.97), which handle Master Aborts by reading the ALCOR
status register instead of mchecking, appear to work correctly.

Now for the real bast*rd hard bug_of_the_week:

Our EB164s have Diamond Stealth Video VRAM S3-968 (rev0)
VGA cards in. They appear to be broken in that they
incorrectly decode type-1 CFG cycles to bus 1 as type-0 CFG
cycles to device numbers 0-5. These cycles are differentiated
by AD0, which the card is clearly ignoring. The result is that
both the PCI-PCI bridge and the S3 try driving the bus, resulting
in the PCI state machine locking and a system hang.
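
To make the aliasing concrete, here is a minimal sketch of what
appears on the AD lines during the address phase of each cycle
type. The type-0/type-1 field layout follows the PCI spec; the
wiring of IDSEL for device n to AD[11+n] is an assumption inferred
from the device numbers in this post, and the function names are
made up for illustration.

```python
# Sketch of PCI configuration address phases (names are illustrative).
IDSEL_BASE = 11  # ASSUMPTION: IDSEL for device n is wired to AD[11 + n]


def type0_address(device, function=0, register=0):
    """AD lines for a type-0 cycle: one-hot IDSEL line, AD[1:0] = 00."""
    return (1 << (IDSEL_BASE + device)) | (function << 8) | (register & 0xFC)


def type1_address(bus, device, function=0, register=0):
    """AD lines for a type-1 cycle: bus number in AD[23:16], AD[1:0] = 01."""
    return (bus << 16) | (device << 11) | (function << 8) | (register & 0xFC) | 0x1


def claims_cycle(ad, my_device, checks_ad0=True):
    """Would a card sitting at `my_device` respond to this address phase?"""
    idsel = bool(ad & (1 << (IDSEL_BASE + my_device)))
    is_type0 = (ad & 0x3) == 0
    return idsel and (is_type0 or not checks_ad0)


probe = type1_address(bus=1, device=0)            # probing behind the bridge
print(hex(probe))                                  # 0x10001: AD16 and AD0 set
print(claims_cycle(probe, 5))                      # compliant card: False
print(claims_cycle(probe, 5, checks_ad0=False))    # card ignoring AD0: True
print(claims_cycle(probe, 9, checks_ad0=False))    # same card as device 9: False
```

Under that wiring, bit 0 of the bus number lands on the IDSEL line
of device 5, so a card that ignores AD0 answers alongside the
bridge.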

Diamond and S3 probably haven't seen this before because:
a. PCI bridges are rare, so they probably haven't seen too
many type-1 CFG cycles.

b. Intel (Triton) machines tend to have their PCI slots mapped
to device numbers 11-15 as opposed to the EB164's 5-9. This
means that you'd need several PCI bridges before you saw
the problem.
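
A quick way to see how the slot numbering matters is to work out
the lowest bus number whose type-1 cycle raises a given device's
IDSEL line. This is only a sketch: it assumes IDSEL for device n
sits on AD[11+n] on both kinds of machine, which is a guess, and
lowest_aliasing_bus is a made-up name.

```python
IDSEL_BASE = 11            # ASSUMPTION: IDSEL for device n is on AD[11 + n]
BUS_FIELD = range(16, 24)  # a type-1 cycle carries the bus number on AD[23:16]


def lowest_aliasing_bus(device):
    """Lowest bus number that drives this device's IDSEL line high,
    or None if the line lies outside the bus-number field."""
    line = IDSEL_BASE + device
    if line not in BUS_FIELD:
        return None
    return 1 << (line - 16)


for dev in (5, 9, 11, 12, 13):
    print(dev, lowest_aliasing_bus(dev))
```

Under that assumption device 5 is hit as soon as bus 1 exists,
device 9 not until bus 16, and devices 11 and up only at bus 64
or beyond, if at all.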

Folklore summary:

On an EB164 with a PCI-PCI bridge, the S3 MUST NOT BE DEVICE 5!!
I recommend device number 9, which is the uppermost PCI slot,
labelled as 'PCI 3'.

Cheers,
Ian

PS. Guess which slot number our S3 card was shipped in...