Re: [PATCH v2 14/20] mm: Provide speculative fault infrastructure

From: Peter Zijlstra
Date: Wed Aug 30 2017 - 02:14:03 EST


On Wed, Aug 30, 2017 at 07:19:30AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2017-08-29 at 13:27 +0200, Peter Zijlstra wrote:
> > mpe helped me out and explained that is the PWC hint to TBLIE.
> >
> > So, you set need_flush_all when you unhook pud/pmd/pte which you then
> > use to set PWC. So free_pgtables() will do the PWC when it unhooks
> > higher level pages.
> >
> > But you're right that there's some issues, free_pgtables() itself
> > doesn't seem to use mm->page_table_lock,pmd->lock _AT_ALL_ to unhook the
> > pages.
> >
> > If it were to do that, things should work fine since those locks would
> > then serialize against the speculative faults, we would never install a
> > page if the VMA would be under tear-down and it would thus not be
> > visible to your caches either.
>
> That's one case. I don't remember of *all* the cases to be honest, but
> I do remember several times over the past few years thinking "ah we are
> fine because the mm sem taken for writing protects us from any
> concurrent tree structure change" :-)

Well, installing always seems to use the locks (it needs to, because its
always done with down_read()), that only leaves removal, and the only
place I know that removes stuff is free_pgtables().

But I think I found another fun place, copy_page_range(). While it
(pointlessly) takes all the PTLs on the dst mm it walks the src page
tables without any PTLs.

This means that if we have a multi-threaded process doing fork() a
thread of the src mm could instantiate page-tables that will not be
copied over.

Of course, this is highly dubious behaviour to begin with, and I don't
think there's anything fundamentally wrong with missing those pages but
we should document this stuff.