Re: [RFC PATCH] mm, oom_reaper: gather each vma to prevent leaking TLB entry

From: Will Deacon
Date: Mon Nov 06 2017 - 19:54:31 EST


On Mon, Nov 06, 2017 at 01:27:26PM +0100, Michal Hocko wrote:
> On Mon 06-11-17 09:52:51, Michal Hocko wrote:
> > On Mon 06-11-17 15:04:40, Bob Liu wrote:
> > > On Mon, Nov 6, 2017 at 11:36 AM, Wang Nan <wangnan0@xxxxxxxxxx> wrote:
> > > > tlb_gather_mmu(&tlb, mm, 0, -1) means gathering all virtual memory space.
> > > > In this case, tlb->fullmm is true. Some archs, like arm64, don't flush
> > > > the TLB when tlb->fullmm is true:
> > > >
> > > > commit 5a7862e83000 ("arm64: tlbflush: avoid flushing when fullmm == 1").
> > > >
> > >
> > > CC'ed Will Deacon.
> > >
> > > > This leaks TLB entries. For example, when oom_reaper
> > > > selects a task and reaps its virtual memory space, another thread
> > > > in this task group may still be running on another core and access
> > > > the already-freed memory through stale TLB entries.
> >
> > No threads should be running in userspace by the time the reaper gets to
> > unmap their address space. So the only potential case is they are
> > accessing the user memory from the kernel when we should fault and we
> > have MMF_UNSTABLE to cause a SIGBUS.
>
> I hope we have clarified that the tasks are not running in userspace at
> the time of reaping. I am still wondering whether this is real from the
> kernel space via copy_{from,to}_user. Is it possible we won't fault?
> I am not sure I understand what "Given that the ASID allocator will
> never re-allocate a dirty ASID" means exactly. Will, could you clarify
> please?

Sure. Basically, we tag each address space with an ASID (PCID on x86), and
TLB entries carry that tag. This means we can elide TLB invalidation when
pulling down a full mm because we won't ever assign that ASID to another mm
without doing TLB invalidation elsewhere (which actually just nukes the
whole TLB).
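
To make the "never re-allocate a dirty ASID" part concrete, here is a toy
userspace model of the idea. It is not the arm64 allocator: the names
(toy_asid_state, toy_asid_alloc), the four-entry ASID space and the flush
counter are all made up for illustration. The point it shows is that there
is no "free" path, so an ASID that has been handed out is only handed out
again after a rollover, and the rollover is where the whole TLB is
invalidated.

```c
#include <stdbool.h>

/* Hypothetical, tiny ASID space so that rollover is easy to trigger. */
#define TOY_NUM_ASIDS 4

/* Illustrative state only; none of these fields exist in the kernel. */
struct toy_asid_state {
	unsigned int next;		/* next never-used ASID in this generation */
	unsigned int generation;	/* bumped on every rollover */
	unsigned int tlb_flushes;	/* counts modelled full-TLB invalidations */
};

/*
 * Hand out an ASID. There is deliberately no way to give one back:
 * tearing down an mm leaves its ASID "dirty" until the next rollover.
 */
unsigned int toy_asid_alloc(struct toy_asid_state *s)
{
	if (s->next == TOY_NUM_ASIDS) {
		/* Out of clean ASIDs: invalidate everything, start a new generation. */
		s->tlb_flushes++;
		s->generation++;
		s->next = 1;	/* this toy starts counting from 1 */
	}
	return s->next++;
}
```

Starting from { .next = 1, .generation = 1 }, the first three calls return
1, 2 and 3 with no flush; the fourth call rolls over, counts one flush and
returns 1 again.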

I think that means we could potentially not fault on a kernel uaccess:
the reaped mm's stale entries are still in the TLB, so the access could hit
there instead of walking the page tables. Perhaps a fix would be to set the
force variable in tlb_finish_mmu if MMF_UNSTABLE is set on the mm?
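
As a rough, untested sketch of that suggestion, modelled in plain userspace
C rather than written against the real mm code: the toy_* types and
functions below are hypothetical stand-ins, and the model simply assumes
that "force" would be made to override the fullmm shortcut in the arch
code, which I haven't checked.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel structures; field names are illustrative. */
struct toy_mm {
	bool unstable;		/* models MMF_UNSTABLE being set on the mm */
	bool flush_nested;	/* models another gather racing on this mm */
};

struct toy_gather {
	struct toy_mm *mm;
	bool fullmm;		/* gather covers the whole address space */
};

/*
 * Models the arch-side decision. ASSUMPTION: force wins over the
 * fullmm shortcut. Returns true if the TLB would be invalidated.
 */
bool toy_arch_flushes(const struct toy_gather *tlb, bool force)
{
	if (force)
		return true;
	return !tlb->fullmm;	/* fullmm: skip, rely on the ASID allocator */
}

/* The suggested change: also force the flush when the mm is unstable. */
bool toy_finish_mmu(const struct toy_gather *tlb)
{
	bool force = tlb->mm->flush_nested || tlb->mm->unstable;

	return toy_arch_flushes(tlb, force);
}
```

In this model a fullmm gather on an ordinary mm still skips the
invalidation, while the same gather on an mm marked unstable by the reaper
does not.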

Will