Re: [PATCH] prevent AoE causing cache aliases

From: Peter Horton
Date: Wed Nov 04 2009 - 10:52:00 EST


Andrew Morton wrote:
On Wed, 04 Nov 2009 10:54:34 +0000 Peter Horton <phorton@xxxxxxxxxxxx> wrote:

Andrew Morton wrote:
On Thu, 22 Oct 2009 15:22:28 +0100
phorton@xxxxxxxxxxxx (Peter Horton) wrote:

To: ecashin@xxxxxxxxxx
Have you heard back from Ed on this?

No.

Cc: linux-kernel@xxxxxxxxxxxxxxx
Subject: [PATCH] prevent AoE causing cache aliases
Date: Thu, 22 Oct 2009 15:22:28 +0100
Sender: linux-kernel-owner@xxxxxxxxxxxxxxx
User-Agent: Mutt/1.5.9i

This patch prevents the AoE block driver from creating cache aliases of
page cache pages on machines with virtually indexed caches.

Building kernels on an AT91SAM9G20 board without this patch fails with
segmentation faults after a couple of passes.


Index: linux-2.6.31/drivers/block/aoe/aoecmd.c
===================================================================
--- linux-2.6.31.orig/drivers/block/aoe/aoecmd.c 2009-09-09 23:13:59.000000000 +0100
+++ linux-2.6.31/drivers/block/aoe/aoecmd.c 2009-10-22 10:24:50.000000000 +0100
@@ -735,6 +735,21 @@
part_stat_unlock();
}
+/*
+ * Ensure we don't create aliases in VI caches
+ */
+static inline void
+killalias(struct bio *bio)
+{
+ struct bio_vec *bv;
+ int i;
+
+ if (bio_data_dir(bio) == READ)
+ __bio_for_each_segment(bv, bio, i, 0) {
+ flush_dcache_page(bv->bv_page);
+ }
+}
+
void
aoecmd_ata_rsp(struct sk_buff *skb)
{
@@ -853,8 +868,12 @@
if (buf && --buf->nframesout == 0 && buf->resid == 0) {
diskstats(d->gd, buf->bio, jiffies - buf->stime, buf->sector);
- n = (buf->flags & BUFFL_FAIL) ? -EIO : 0;
- bio_endio(buf->bio, n);
+ if (buf->flags & BUFFL_FAIL)
+ bio_endio(buf->bio, -EIO);
+ else {
+ killalias(buf->bio);
+ bio_endio(buf->bio, 0);
+ }
mempool_free(buf, d->bufpool);
}
Looks OK.

This bugfix will cause a pointless __bio_for_each_segment() busywait
loop to be executed on architectures for which flush_dcache_page() is a
no-op.

We don't have infrastructure to fix that.
Couldn't we add a flag to the bio that users could set to indicate that they are not house trained with respect to the D-cache (i.e non-DMA drivers). Architectures that needed to could then flush the relevant pages in the bio_endio() path somewhere. At the moment all the non-DMA block drivers need to be aware of the cache aliasing issue which means this problem keeps arising ...


Could. We'll need to change each arch _somehow_. Even if it's a
matter of adding `#define i_am_not_house_trained' to the troublesome
ones or something, then ifdeffing existing code.

I was thinking that a general bio_flush_dcache_pages() in block core
(or in each arch) would be a suitable way to handle this but I was
unable to find other drivers which needed it after a brief search.



IDE does it at a lower level (arch/mips/include/asm/mach-generic/ide.h for example).

Looks like the generic PIO ops in drivers/ata/libata-sff.c could cause problems too.

I think the problem is often masked by the small cache sizes on the platforms with VI caches. The original AoE problem I see is on an ARM926 with 32K D-cache, I don't see the problem at all on another ARM926 with 16K D-cache.

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/