Re: [PATCH] IBM Lanstreamer bugfixes

From: Kent E Yoder (yoder1@us.ibm.com)
Date: Fri Jan 18 2002 - 15:52:47 EST


2) Fixed.

3) Yeah, this was bad... Fixed.

4, 6, 7, 9 & 10) These fall into the category of hardware workaround. The
lanstreamer hardware is essentially broken (with respect to its hardware
specification) and these were a few of the things which made it work.

  For #6, the udelay(1) had more to do with the following write() than
with spin_lock(). When that delay was not there, the write failed
randomly. The same with the udelay(10) at the end of the interrupt
function...

  For #7, 9 & 10, zeroing out interrupts in the interrupt function and in
the xmit function and the delay have no software function at all, but it
made the hardware happy. Without these, the card would lock up the box
under heavy load.

5 & 11) Nope, I had not read Doc/networking/netdevices.txt, so I have a
question. What does being inside rtnl_lock() imply other than the sleep
issues?

  The calls to cli() and save_flags() were wrong from the beginning. They
were imported by the last maintainer since this driver is a modified
version of the olympic token ring driver. The current spin_lock() and
spin_unlock() calls protect the srb_queued variable. If we were to set it
to one and then get interrupted before we actually write() to the srb, the
interrupt function will try to service whatever junk happens to be in the
srb. If going into the interrupt function is covered by rtnl_lock(), this
would be covered, but I guess its not (?)....

8) Yeah, we didn't see any problems with this in test (probably due to
lanstreamer's fairly low bandwidth), but you are right. Fixed.

12) Fixed.

13) Hmm.. Guess I'll need to learn about that... eventually :^). Fixed,
for now...

Kent <yoder1@us.ibm.com>

Kent E Yoder wrote:
> This patch fixes several bugs and works around known hardware problems
> which conspired to lock up the system randomly. Its somewhat large,
> therefore available at:

http://oss.software.ibm.com/developer/opensource/linux/patches/tr/lanstreamer-0.5.1.patch.gz
>
> * Interrupt function rearranged
> * PCI Configuration changed
> * Tx descriptors had to be reduced to 1 (!)
> * Send/Receive operations are nearly serialized

Marcelo, please do not apply this patch...

Sorry for the delay, review:

1) (in code, not in your patch) prefer kernel-standard types like u32 or
u16 to __u32 and __u16

2) poor formatting:

> +#if STREAMER_IOCTL
> + dev->do_ioctl = &streamer_ioctl;
> +#else
> dev->do_ioctl = NULL;
> +#endif

3) I don't like this playing around with magic numbers. pci_set_master
and pci_enable_device and pci_disable_device twiddle PCI_COMMAND bits,
too. Use constants from pci.h make it clear what bits you are clearing,
and what bits you are setting.

> + pci_write_config_byte (pdev, PCI_CACHE_LINE_SIZE, cls);
> + pci_read_config_word (pdev, PCI_COMMAND, &pcr);
> +
> + /* Turn off Fast B2B enable */
> + pcr &= 0xfdff;
> + /* Turn on SERR# enable and others */
> + pcr |= 0x0157;
> +
> + pci_write_config_word (pdev, PCI_COMMAND, pcr);
> + pci_read_config_word (pdev, PCI_COMMAND, &pcr);

4) Your code appears to -always- set cache line size to zero. Is that a
hardware bug? Look at acenic.c to see a better example of setting PCI
cache line size.

5) what is the purpose of the spin_lock in streamer_open, if open is
serialized? it worries me that the previous streamer_open code disabled
interrupts and the new one does not, but replaces with a non-irq-saving
lock that appears superfluous.

6) udelay(1) after brand new spin_lock in streamer_interrupt is
suspicious

7) disabling interrupts by zeroing NIC intr mask, in interrupt handler,
is general not needed. why was this added? interrupt handlers are not
re-entered so this is not a worry.

8) the while loop in the interrupt looks like it could go on for quite a
while under heavy load, starving out a lot of other kernel code. it
needs a work limit at the very least...

9) disabling interrupts at the beginning of each TX is wrong. you
probably want spin_lock_irqsave at critical parts of the xmit.

10) udelay(100) is likely wrong and a sign of a race (perhaps #9, above,
fixes this)

11) replacing save_flags/cli with normal spin_lock in streamer_close is
suspicious and likely wrong. See issue #5 about streamer_open
serialization. Have you read Documentation/networking/netdevices.txt ?

12) formatting of streamer_ioctl is grossly different from the rest of
the code

13) SIOCDEVPRIVATE ioctls are going away in 2.5. (you can implement
include/linux/ethtool.h SIOCETHTOOL interface for lot of the
functionality, though)

-- 
Jeff Garzik      | Alternate titles for LOTR:
Building 1024    | Fast Times at Uruk-Hai
MandrakeSoft     | The Took, the Elf, His Daughter and Her Lover
                 | Samwise Gamgee: International Hobbit of Mystery

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Jan 23 2002 - 21:00:29 EST