Re: [PATCHSET] block: fix PIO cache coherency bug

From: Russell King
Date: Fri Jan 13 2006 - 13:19:38 EST


On Fri, Jan 13, 2006 at 09:50:19AM -0600, James Bottomley wrote:
> Actually, this doesn't look to be the correct thing to do. The
> dma_map/unmap don't make the data coherent with respect to the user
> space, only with respect to the kernel space. I've never liked this
> (and indeed I wrote an OLS paper in 2004 trying to explain how we could
> fix it) but that's our current model.
>
> Our classic path for data on machines is that the driver makes the
> kernel coherent and then whatever's transferring from the page cache to
> the user makes user space coherent. It sounds, therefore, like
> whatever's broken (what is the problem, by the way?) is broken in the
> second half (page cache to user) not in the first half (driver to
> kernel).

I think you're misunderstanding the issue. I'll give you essentially
my understanding of the explanation that Dave Miller gave me a number
of years ago. This is from memory, so Dave may wish to correct it.

1. When a driver DMAs data into a page cache page, it is written directly
to RAM and is made visible to the kernel mapping via the DMA API. As
a result, there will be no cache lines associated with the kernel
mapping at the point when the driver hands the page back to the page
cache.

However, in the PIO case, there is the possibility that the data read
from the device into the kernel mapping results in cache lines
associated with the page. Moreover, if the cache is write-allocate,
you _will_ have cache lines.

Therefore, you have two completely differing system states depending
on how the driver decided to transfer data from the device to the page
cache.

As such, drivers must ensure that PIO data transfers have the same
system state guarantees as DMA data transfers.

ISTR davem recommended flush_dcache_page() be used for this.

2. (this is my own) The cachetlb document specifies quite clearly what
is required whenever a page cache page is written to - that is
flush_dcache_page() is called. The situation when a driver uses PIO
quite clearly violates the requirements set out in that document.

From (2), it is quite clear that flush_dcache_page() is the correct
function to use, otherwise we would end up with a random set of cache
states for pages in the page cache. (1) merely reinforces that the
driver is the correct place for the decision to be made. In fact, it's
the only part of the kernel which _knows_ what needs to be done.
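
To make the PIO case in (1) concrete, here is a rough userspace model,
not kernel code. The struct page below is a toy stand-in, the
flush_dcache_page() here is a stub that only records the call (the real
one is architecture-specific), and pio_read_into_page() plus the
dcache_dirty and flush_count fields are hypothetical names invented for
this sketch. It only shows the ordering: the CPU copy leaves cache lines
on the kernel mapping, so the driver flushes before handing the page
back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Toy stand-in for the kernel's struct page; NOT the real definition. */
struct page {
	unsigned char data[PAGE_SIZE];	/* models the kernel mapping */
	int dcache_dirty;		/* models cache lines left by CPU stores */
	int flush_count;		/* how often the stub flush was called */
};

/* Stub with the kernel's signature; it only records the call. */
static void flush_dcache_page(struct page *page)
{
	page->dcache_dirty = 0;
	page->flush_count++;
}

/* Hypothetical PIO read path: the CPU itself copies the device data. */
static void pio_read_into_page(struct page *page,
			       const unsigned char *dev, size_t len)
{
	assert(len <= PAGE_SIZE);

	/* CPU stores through the kernel mapping leave cache lines behind. */
	memcpy(page->data, dev, len);
	page->dcache_dirty = 1;

	/* Bring the page to the same state a DMA transfer would leave. */
	flush_dcache_page(page);
}
```

A DMA transfer would never set dcache_dirty in this model, which is
the asymmetry between the two paths that the flush removes.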

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core