Re: Performance patch for NE Ethernet

Richard B. Johnson (root@analogic.com)
Thu, 13 Feb 1997 14:20:32 -0500 (EST)


On Fri, 14 Feb 1997, Paul Gortmaker wrote:

>
> Not to be too harsh on anyone that is trying to help out, but the
> above claim is a bit dubious...
>
> > /* This check _should_not_ be necessary, omit eventually. */
> > - while ((inb_p(NE_BASE+EN0_ISR) & ENISR_RESET) == 0)
> > + while ((inb_p((NE_BASE+EN0_ISR)) & ENISR_RESET) == 0)
>
> Umm, adding extra sets of "( ... )" doesn't gain anything other than
> making it more work for me to read through the patch to see what you
> are trying to change. There are lots of these in this patch...

NE_BASE is a MACRO (#define), ENO_ISR is a MACRO (#define). inb_p() is
a MACRO that defines a compiler-specific "inline function".
We are _required_ to use parenthesis around concatenated macros and
parameters passed to macros to guarantee that the code you specify
is what you get. Whether or not one gets away with sloppy coding using
gcc is not pertainent.

If you bothered to apply the patch to a copy of ne.c, you would see
that I got rid of most of the trash.

>
> > - outb_p(ENISR_RESET, NE_BASE + EN0_ISR); /* Ack intr. */
> > + outb_p(ENISR_RESET, (NE_BASE + EN0_ISR)); /* Ack intr. */
>

> See above...

Ditto;

>
> > - the start of a page, so we optimize accordingly. */
> > + the start of a page, so we optimize accordingly.i
>
> Ummm, vi typo constitutes a patch? (Okay, we all do that now and again...)

It is inside a comment so wasn't detected.

>
> > - /* This *shouldn't* happen. If it does, it's the last thing you'll see */
> > - if (ei_status.dmaing) {
> > + /* This *shouldn't* happen. If it does, we'll fix it. */
> > + if (ei_status.dmaing)
> > + {
>
> No, that is written as such because it most likely *will* be the last
> thing you see, due to the way the hardware is. If the printk gets to the
> screen, then you have done well. Also, putting a { on a newline doesn't
> do anything more than make the patch bigger...

Not true. Absolutely not true.

>
> > + ne_reset_8390(dev);
>
> Well, a post "Oops, shouldn't have done that - reset the card." looks
> like a good idea, I doubt it will buy you a recovery.
>

> > - outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, nic_base+ NE_CMD);
> > + outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, (NE_BASE + NE_CMD));
>
> brackets... NE_BASE macro vs. nic_base will be obscured by GCC.

nic_base was an extra operation. Read the code.
On line 622 there is even "int nic_base = NE_BASE"

>
> > - outb_p(sizeof(struct e8390_pkt_hdr), nic_base + EN0_RCNTLO);
> > + outb_p(sizeof(struct e8390_pkt_hdr), (NE_BASE + EN0_RCNTLO));
>
> brackets...
>
> > - outb_p(0, nic_base + EN0_RCNTHI);
> > - outb_p(0, nic_base + EN0_RSARLO); /* On page boundary */
> > + outw(0, (NE_BASE + EN0_RSARLO)); /* On page boundary */
>
> Bzzt. Can't do that. You have ne1000 cards to deal with, as well as
> the fact that access to the 8390 requires the outb_p() -- see the
> datasheets if you are in doubt.

Oh yes you can. This is not a '286. The 8390 has a chip-to-chip select-time
specification (300ns) that does not apply to the data port or the page
registers. Configuration registers have that limitation because of internal
state-machine considerations. Film at 11.

>
> > - outb_p(ring_page, nic_base + EN0_RSARHI);
> > + outb_p(ring_page, (NE_BASE + EN0_RSARHI));
>
> More brackets...
>
> > - outb_p(E8390_RREAD+E8390_START, nic_base + NE_CMD);
> > + outb_p(E8390_RREAD+E8390_START, (NE_BASE + NE_CMD));
>
> More brackets...
>
> > - insw(NE_BASE + NE_DATAPORT, hdr, sizeof(struct e8390_pkt_hdr)>>1);
> > + insw((NE_BASE + NE_DATAPORT), hdr, sizeof(struct e8390_pkt_hdr)>>1);
>
> More brackets...
>
> > - insb(NE_BASE + NE_DATAPORT, hdr, sizeof(struct e8390_pkt_hdr));
> > + insb((NE_BASE + NE_DATAPORT), hdr, sizeof(struct e8390_pkt_hdr));
>
> More brackets...
>
> > - outb_p(ENISR_RDC, nic_base + EN0_ISR); /* Ack intr. */
> > + outb_p(ENISR_RDC, (NE_BASE + EN0_ISR)); /* Ack intr. */
>
> More brackets...
>
> >From here on, you have deleted a bunch of code that is ifdef'd out
> anyways. That doesn't buy you anything in speed or size.

Yes I did.

>
> > - outb_p(count & 0xff, nic_base + EN0_RCNTLO);
> > - outb_p(count >> 8, nic_base + EN0_RCNTHI);
> > + outw(count, (NE_BASE + EN0_RCNTLO));
>
> Nope, sorry, same as above. ne1000 cards, and you need the i/o pause.
> Hitting an 8390 with back-to-back I/O is a good way to fill my mailbox.
>

The film at 11.
Not true on registers that set addresses or the data port registers. It
is not the port reads/writes themselves that cause the problems, but
commands that force internal configuration changes. For instance, if
you even READ the status register while a "remote DMA" is in progress,
the chip's bus interface state-machine will hang your bus forever.
But note that someone else already fixed this problem by using a "reentry"
flag "ei_status.dmaing". After this addition, much of the earlier
attempts to fix this problem were no longer necessary. I have extensively
used the LONGSHINE LCS-8634L (12 machines, 9 only 386's) using the code
changes. This real cheap clone board is one of the worst offenders for
bus-hangs.


> > - outb_p(ring_offset & 0xff, nic_base + EN0_RSARLO);
> > - outb_p(ring_offset >> 8, nic_base + EN0_RSARHI);
> > + outw(ring_offset, (NE_BASE + EN0_RSARLO));
>
> Same as above.


> get you anything in terms of size or performance. When the next
> poor sod has to come along and understand the driver, it may help
> him/her out a lot to be able to read that code, even if it is inactive.
>
> Paul.
> (a.k.a. current unfortunate sod who deals with ne.c)

It gets rid of 4 years worth of unused junk. The _entire_ input routine
becomes:

static void
ne_block_input(struct device *dev, int count, struct sk_buff *skb, int ring_offset)
{
if(ei_status.dmaing)
return;
ei_status.dmaing |= 0x01;
++count;
count &= ~1;
outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, (NE_BASE + NE_CMD));
outw(count, (NE_BASE + EN0_RCNTLO));
outw(ring_offset, (NE_BASE + EN0_RSARLO));

outb_p(E8390_RREAD+E8390_START, (NE_BASE + NE_CMD));
if(ei_status.word16)
insw((NE_BASE + NE_DATAPORT), BUF, count>>1);
else
insb((NE_BASE + NE_DATAPORT), BUF, count);
outb_p(ENISR_RDC, (NE_BASE + EN0_ISR)); /* Ack intr. */
ei_status.dmaing &= ~0x01;
return;
}

The output routine is also shortened AND made clearer. You can't just
look at a patch to determine its validity.

Cheers,
Dick Johnson
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard B. Johnson
Project Engineer
Analogic Corporation
Voice : (508) 977-3000 ext. 3754
Fax : (508) 532-6097
Modem : (508) 977-6870
Ftp : ftp@boneserver.analogic.com
Email : rjohnson@analogic.com, johnson@analogic.com
Penguin : Linux version 2.1.26 on an i586 machine (66.15 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-