Re: atl1 64-bit => 32-bit DMA borkage (reproducible, bisected)

From: Jay Cliburn
Date: Sun May 04 2008 - 20:31:50 EST


On Mon, 5 May 2008 01:15:07 +0400
Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:

> Looking at how other netdevice drivers handle this:
>
> 8139too and others check netif_running() in the interrupt handler.
>
> r8169 has scary "50k$" question comment re irqs disabled after
> interacting with hardware.
>
> But the r8169 case should be fixed by atlx_irq_disable()?

Agreed.

>
> Writes to REG_IMR, REG_ISR are commented out in atl1_reset_hw(), why?

Came from the vendor that way.

> (I'll test that soon)
>
> Do we have a theory why changing from 64-bit DMA mask to 32-bit mask
> resurrects the bug? NIC here never showed any sort of corruption
> described in commit which banned 64-bit DMA.

We had multiple reports of users who encountered repeated memory
corruption when transferring large files while running with a 64-bit DMA
mask. Chris Snook noticed the upper 32 bits of the descriptor address
register are shared among five other registers, each containing the low
bits for one of five descriptors. All the descriptors, therefore, have
to live within the same 4GB-aligned region of the address space.
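
To make the constraint concrete, here is a rough userspace sketch (not
driver code) of the check it implies: if one register supplies the upper
32 bits for every descriptor, then the first and last byte of each
descriptor region must have identical upper 32 bits. The struct
ring_region type and both function names are made up for illustration
and do not exist in atl1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one descriptor region: bus address and size. */
struct ring_region {
	uint64_t dma_addr;
	uint64_t len;
};

/* Upper 32 bits of a bus address, i.e. what a single shared "high" register would hold. */
static uint32_t addr_hi(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

/*
 * Return true if every byte of every region has the same upper 32
 * address bits as the start of the first region.
 */
static bool regions_share_high_dword(const struct ring_region *r, size_t n)
{
	uint32_t hi;
	size_t i;

	if (n == 0)
		return true;

	hi = addr_hi(r[0].dma_addr);
	for (i = 0; i < n; i++) {
		uint64_t last;

		assert(r[i].len > 0);	/* an empty region has no last byte */
		last = r[i].dma_addr + r[i].len - 1;
		if (last < r[i].dma_addr)
			return false;	/* region wraps past the end of the address space */
		if (addr_hi(r[i].dma_addr) != hi || addr_hi(last) != hi)
			return false;
	}
	return true;
}
```

With a 32-bit DMA mask every address has zero in the upper 32 bits, so a
check like this passes trivially. With a 64-bit mask nothing guarantees
it unless all the regions are carved out of one allocation that does not
cross a 4GB boundary.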

I'll keep poking at it as time permits through the week, but I probably
won't be able to devote a whole lot of time to it until next weekend.