Re: Oops in a driver while using SLUB as a SLAB allocator

From: Hugh Dickins
Date: Mon Jun 25 2007 - 13:24:34 EST


On Mon, 25 Jun 2007, Christoph Lameter wrote:
> On Mon, 25 Jun 2007, Hugh Dickins wrote:
>
> > > In many situations the page struct passed to flush_dcache_page is
> > > simply used to calculate the virtual address. So its mostly harmless.
> > > Trouble starts when page attributes like the mapping is used.
> >
> > Mostly harmless indeed. I don't understand why you insist on trying
> > to complicate the situation. flush_dcache_page is only expected to
> > do something on pages mapped into userspace (correct me if I'm wrong
> > there), it's expected to do nothing on kmalloc'ed pages. It's
> > been working that way for years, and will continue to work that way
> > with slub, providing either page_mapping or flush_dcache_page checks
> > PageSlab to avoid oopsing on page->mapping.
>
> It is definitely intended to work. Otherwise we would not have code
> like this:
>
> christoph@fly:~/linux-2.6$ find . -name "*.c" | xargs grep "flush_dcache_page"|grep virt
> ./drivers/scsi/scsi_tgt_if.c: flush_dcache_page(virt_to_page(ev));
> ./drivers/scsi/scsi_tgt_if.c: flush_dcache_page(virt_to_page(ev));

I didn't claim that flush_dcache_page(virt_to_page(virt)) is not expected
to work. I claim that flush_dcache_page is expected to be a noop rather
than an oops on a kmalloced page.

> > 2.6.22-rc6 has page_mapping making that check: we could argue about
> > which is the better site for it, there are good arguments both ways
> > (page_mapping is the correct place, flush_dcache_page is the more
> > efficient place), I suggest we leave it as is.
>
> Ok. I think your patch is fine as a quick fix for 2.6.22. I am a bit
> uneasy with that given that its in such a broadly used function while its
> only use is to enable flush_dcache_page to work. But we need the general
> issue taken care of after 2.6.22.

What general issue?

>
> > > A kmalloc slab object (even 64 byte) may be crossing a page boundary
> > > with a ARCH_KMALLOC_MINALIGN of 4 or 8. So I think that
> > > flush_dcache_range *must* be used rather than flush_dcache_page.
> >
> > Why???? All we require of flush_dcache_page is that it not oops on
> > the first page in the range: we don't need to change over to
> > flush_dcache_range for that.
>
> As explained about: There are corner cases in which it does not work. You
> seem to assume that flush_dcache_page can become a no op. That may not be
> true on platforms that need explicit cache flushing for a DMA engine to
> access a data structure. The above listed use suggests that the caller
> expects flushing to occur correctly.

The scsi_tgt_if.c use you show above? That's not dealing with
kmalloced pages, is it? True, the pages it is dealing with don't
have page->mapping set, so those architectures which use page->mapping
to find what to do in their flush_dcache_page, won't do anything there
in their flush_dcache_page. Whether that's a bug or not, I wouldn't
pretend to know; but it's nothing to do with the present discussion.

Please see Documentation/cachetlb.txt: flush_dcache_page is about
pagecache pages mapped into userspace. We don't use kmalloc for those,
but we do sometimes need to flush_dcache_page in places which commonly
deal with pagecache pages, but sometimes handle kmalloc'ed buffers too.
Luckily we don't have to deal with buffers in which the first page is
kmalloced and the next comes from pagecache.

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/