Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

From: Jens Axboe
Date: Tue Jul 24 2007 - 04:22:01 EST


On Tue, Jul 24 2007, Jens Axboe wrote:
> On Mon, Jul 23 2007, Andrew Morton wrote:
> > I worked out that the crash I saw was in
> >
> > BUG_ON(!pte_none(*(kmap_pte-idx)));
> >
> > in the read of kmap_pte[idx]. Which would be weird as the caller is using
> > a literal KM_USER0.
> >
> > So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am
> > unable to reproduce it now).
> >
> > If that BUG_ON _is_ triggering then it might indicate that someone is doing
> > a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0.
>
> Or doing double kunmaps, or doing a kunmap_atomic() on the page, not the
> address. I've seen both of those end up triggering that BUG_ON() in a
> later kmap.
>
> Looking over the 2.6.22..2.6.23-rc1 diff, I found one such error in
> ocfs2 at least. But you are probably not using that, so I'll keep
> looking...

What about the new async crypto stuff? I've been looking, but is it
guaranteed that async_memcpy() always runs in process context with
interrupts enabled? If not, there's a km type bug there.
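
To illustrate why the wrong km type ends up in that BUG_ON, here is a rough
userspace model, not kernel code. It assumes one fixed slot per km type on a
single CPU, with NULL standing in for an empty pte. The names km_slot,
model_kmap_atomic() and model_kunmap_atomic() are made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced set of km types, enough for the illustration. */
enum km_type { KM_USER0, KM_USER1, KM_IRQ0, KM_IRQ1, KM_TYPE_NR };

/* One slot per km type on the single modelled CPU; NULL plays "pte_none". */
static const void *km_slot[KM_TYPE_NR];

/*
 * Hypothetical stand-in for an atomic kmap: claim the slot for this type.
 * Returns false where the real code would hit
 * BUG_ON(!pte_none(*(kmap_pte-idx))), i.e. when the slot is already in use.
 */
static inline bool model_kmap_atomic(const void *page, enum km_type type)
{
	if (km_slot[type] != NULL)
		return false;
	km_slot[type] = page;
	return true;
}

/* Hypothetical stand-in for the matching unmap: release the slot. */
static inline void model_kunmap_atomic(enum km_type type)
{
	km_slot[type] = NULL;
}
```

If process context holds KM_USER0 and an interrupt handler on the same CPU
also asks for KM_USER0, the second model_kmap_atomic() call returns false.
Had the handler used KM_IRQ0, it would have gotten its own slot.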

In general, I think the highmem stuff could do with more safety checks:

- People ALWAYS get the atomic unmaps wrong, passing in the page instead
  of the address. I've seen tons of these. And since kunmap_atomic()
  takes a void pointer, nobody notices until it goes boom.
- People easily get the km type wrong - they use KM_USERx in interrupt
  context, or one of the irq variants without disabling interrupts.

If we could just catch these two types of bugs, we'd have a lot of these
problems covered.
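
Something along these lines is what I have in mind. This is a userspace
sketch under stated assumptions, not a patch: struct page is a dummy, and
is_page_pointer(), kunmap_atomic_checked(), record_kunmap(), struct
km_context, km_type_allowed() and check_km_type() are all hypothetical names.
The type test relies on the GCC/Clang builtin __builtin_types_compatible_p.
In the kernel the context flags would presumably come from in_interrupt()
and irqs_disabled() rather than being passed in.

```c
#include <assert.h>
#include <stdbool.h>

struct page { unsigned long flags; };	/* dummy stand-in for the real struct */

enum km_type { KM_USER0, KM_USER1, KM_IRQ0, KM_IRQ1 };

/* Evaluates to 1 if x has type "struct page *", 0 otherwise (GCC/Clang). */
#define is_page_pointer(x) \
	__builtin_types_compatible_p(__typeof__(x), struct page *)

static int unmap_calls;	/* lets the sketch observe that an unmap happened */

/* Hypothetical backend; the real one would tear down the mapping. */
static inline void record_kunmap(void *addr, enum km_type type)
{
	(void)addr;
	(void)type;
	unmap_calls++;
}

/*
 * Check 1: refuse at compile time to unmap a struct page pointer.
 * The caller must pass the address the kmap returned, not the page.
 */
#define kunmap_atomic_checked(addr, type) do {				\
	_Static_assert(!is_page_pointer(addr),				\
		       "pass the mapped address, not the page");	\
	record_kunmap((void *)(addr), (type));				\
} while (0)

/* Assumed description of the calling context. */
struct km_context {
	bool in_interrupt;	/* kernel: in_interrupt() */
	bool irqs_disabled;	/* kernel: irqs_disabled() */
};

/*
 * Check 2: KM_USERx only outside interrupt context, and the irq
 * variants only with interrupts disabled.
 */
static inline bool km_type_allowed(enum km_type type, struct km_context c)
{
	switch (type) {
	case KM_USER0:
	case KM_USER1:
		return !c.in_interrupt;
	case KM_IRQ0:
	case KM_IRQ1:
		return c.irqs_disabled;
	}
	return false;
}

/* assert() plays the role of BUG_ON() in this model. */
static inline void check_km_type(enum km_type type, struct km_context c)
{
	assert(km_type_allowed(type, c));
}
```

With this, kunmap_atomic_checked(page, KM_USER0) on a struct page pointer
fails to build instead of going boom later, and check_km_type() trips right
at the offending map call rather than in some unrelated later kmap.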


--
Jens Axboe
