Re: Ultra Sparc NCR SCSI driver -> SIR Reset

Gerard Roudier (groudier@club-internet.fr)
Tue, 6 Jul 1999 22:22:45 +0200 (MET DST)


On Tue, 6 Jul 1999 pau@readysoft.es wrote:

> I'm trying to setup an Ultra 10 Sparc machine with RedHat 6.0, but when I
> stress the disk load (f.i. with squid) it hangs with an steady error:
> SIR Reset

I cannot trust any PCI device driver to work reliably on some SUN
hardware, given what I read about the way SUN guys implement some PCI-HOST
bridge of their.

I read the following about some stuff they call a PCI-HOST bridge:

1 - PCI IO write are posted.
The PCI specs says PCI IO writes must NOT be posted.

2 - Posted writes are NOT flushed on PCI read transactions.
The PCI specs says that posted writes must complete before the
PCI read completes. This ordering rule is MANDATORY. It is not
an option, AT ALL!!

3 - Write buffers are flushed before the IRQ is delivered.
This is useless and stupid since when the IRQ is shared, it is just
ored by the INT lines of the devices. So, a completion for device B
that occurs when the IRQ has been raised by device A will not flush
anything.

With such FLAWED bridge, I cannot beleive that PCI device drivers can
operate reliably. They may seem to work as long as the latency due to
posting is low enough, but will break some legal assumptions of PCI device
drivers if it get too high.

Speaking of the ncr/sym drivers. The only way to make the driver working
with broken SUN bridges could be the following (at least).

1 - Do NOT share the IRQ with another device.

2 - Donnot use INTFLY, but stop the SCRIPTS until the C code has
caught the completion and restart the SCRIPTS from the C code.

(1) ensures that all posted writes will be flushed when the C code will
examine the status and signal the completion.

(2) will avoid the controller to unintentionnaly confuse the C code by
wrong partial writes to memory, or some writes to IO registers being
delayed when the driver assumes it should have completed, due to posted
writes wrongly not being flushed before a PCI read completes.

I do consider such an exotic implementation of PCI to be absolutely NOT
PCI.

For this reason, I am not going to support SUN hardware that are so
stupidly broken for PCI with the ncr/sym drivers.
Note that if I ever receive patches for that, I will obviously accept
them.

> I've tried also RedHat 5.2 and the latest Redhat-6.0 RPMS (kernel
> 2.2.5-22), but it keeps on hanging :(
>
> Can anybody give me a clue of how to fix this error?
> In the driver documenbtation it talks about sym53c8xx instead of
> ncr53c8xx, but it doesn't exist for Sparc.

Perhaps your problem is not due to the problem I talk about, but if you
are using an hardware that corresponds to my description, beleive me, you
may experience very weird problems with quite correct PCI device drivers
that addresses real PCI and not some stupid broken variant.

Gérard.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/