Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct

From: Linus Torvalds
Date: Wed Mar 02 2011 - 14:27:14 EST


On Wed, Mar 2, 2011 at 9:59 AM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> In order to be able to properly support architectures that want/need to
> support TLB range invalidation, we need to change the
> flush_tlb_range() argument from a vm_area_struct to an mm_struct
> because the range might very well extend past one VMA, or not have a
> VMA at all.

I really don't think this is right. The whole "drop the icache
information" thing is a total anti-optimization, since for some
architectures, the icache flush is the _big_ deal. Possibly much
bigger than the TLB flush itself. Doing an icache flush was much more
expensive than the TLB flush on alpha, for example (the TLB had ASIs
etc, the icache did not).
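
To illustrate the point, here is a toy userspace sketch, not kernel code.
It assumes the "icache information" is the executable flag carried by the
VMA. Every name in it (TOY_VM_EXEC, toy_vma, toy_flush_range_vma,
toy_flush_range_mm) is hypothetical and only counts flushes:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag standing in for an "executable mapping" bit. */
#define TOY_VM_EXEC 0x4u

struct toy_flush_stats {
    int tlb_flushes;
    int icache_flushes;
};

struct toy_vma {
    unsigned int flags;
};

/* With the VMA in hand, the icache flush can be skipped for data mappings. */
static void toy_flush_range_vma(const struct toy_vma *vma,
                                struct toy_flush_stats *st)
{
    st->tlb_flushes++;
    if (vma->flags & TOY_VM_EXEC)
        st->icache_flushes++;
}

/* With only the mm, nothing says whether the range was executable,
 * so this model has to assume the worst and flush the icache too. */
static void toy_flush_range_mm(struct toy_flush_stats *st)
{
    st->tlb_flushes++;
    st->icache_flushes++;
}
```

In this model a data-only mapping costs one TLB flush through the VMA
variant, but a TLB flush plus an icache flush through the mm-only one.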

> There are various reasons that we need to flush TLBs _after_ freeing
> the page-tables themselves. For some architectures (x86 among others)
> this serializes against (both hardware and software) page table
> walkers like gup_fast().

This part of the changelog also makes no sense whatsoever.

On x86, we absolutely *must* do the TLB flush _before_ we release the
page tables. So your commentary is actively wrong and misleading.

The order has to be:
- clear the page table entry, queue the page to be freed
- flush the TLB
- free the page (and page tables)

and nothing else is correct, afaik. So the changelog is pure and utter
garbage. I didn't look at what the patch actually changed.
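
The three-step order above can be sketched as a toy userspace model. This
is not the kernel's implementation: the single-entry "page table" and
"TLB", and all toy_* names, are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BATCH 8

struct toy_page {
    bool freed;
};

/* Hypothetical model: one page table entry, one cached TLB translation. */
struct toy_mm {
    struct toy_page *pte;                  /* mapped page, or NULL */
    struct toy_page *tlb;                  /* cached translation, may be stale */
    struct toy_page *batch[TOY_MAX_BATCH]; /* pages queued to be freed */
    size_t nr;
};

/* Step 1: clear the entry and queue the page; it is NOT freed yet. */
static void toy_clear_and_queue(struct toy_mm *mm)
{
    struct toy_page *page = mm->pte;

    mm->pte = NULL;
    if (page) {
        assert(mm->nr < TOY_MAX_BATCH);
        mm->batch[mm->nr++] = page;
    }
}

/* Step 2: drop any stale translation. */
static void toy_flush_tlb(struct toy_mm *mm)
{
    mm->tlb = NULL;
}

/* Step 3: release everything that was queued. */
static void toy_free_queued(struct toy_mm *mm)
{
    for (size_t i = 0; i < mm->nr; i++)
        mm->batch[i]->freed = true;
    mm->nr = 0;
}

/* True if a stale TLB entry still points at a page that has been freed. */
static bool toy_stale_access_possible(const struct toy_mm *mm)
{
    return mm->tlb != NULL && mm->tlb->freed;
}
```

Calling toy_free_queued() before toy_flush_tlb() leaves
toy_stale_access_possible() returning true in this model, which is the
window the ordering is meant to close.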

NAK.

Linus