Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages
From: Andy Lutomirski
Date: Tue Dec 04 2018 - 15:10:07 EST
On Tue, Dec 4, 2018 at 12:02 PM Edgecombe, Rick P
<rick.p.edgecombe@xxxxxxxxx> wrote:
>
> On Tue, 2018-12-04 at 16:03 +0000, Will Deacon wrote:
> > On Mon, Dec 03, 2018 at 05:43:11PM -0800, Nadav Amit wrote:
> > > > On Nov 27, 2018, at 4:07 PM, Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> > > > wrote:
> > > >
> > > > Since vfree will lazily flush the TLB, but not lazily free the underlying
> > > > pages,
> > > > it often leaves stale TLB entries to freed pages that could get re-used.
> > > > This is
> > > > undesirable for cases where the memory being freed has special permissions
> > > > such
> > > > as executable.
> > >
> > > So I am trying to finish my patch-set for preventing transient W+X mappings
> > > from taking space, by handling kprobes & ftrace that I missed (thanks again
> > > for
> > > pointing it out).
> > >
> > > But all of the sudden, I donât understand why we have the problem that this
> > > (your) patch-set deals with at all. We already change the mappings to make
> > > the memory writable before freeing the memory, so why canât we make it
> > > non-executable at the same time? Actually, why do we make the module memory,
> > > including its data executable before freeing it???
> >
> > Yeah, this is really confusing, but I have a suspicion it's a combination
> > of the various different configurations and hysterical raisins. We can't
> > rely on module_alloc() allocating from the vmalloc area (see nios2) nor
> > can we rely on disable_ro_nx() being available at build time.
> >
> > If we *could* rely on module allocations always using vmalloc(), then
> > we could pass in Rick's new flag and drop disable_ro_nx() altogether
> > afaict -- who cares about the memory attributes of a mapping that's about
> > to disappear anyway?
> >
> > Is it just nios2 that does something different?
> >
> > Will
>
> Yea it is really intertwined. I think for x86, set_memory_nx everywhere would
> solve it as well, in fact that was what I first thought the solution should be
> until this was suggested. It's interesting that from the other thread Masami
> Hiramatsu referenced, set_memory_nx was suggested last year and would have
> inadvertently blocked this on x86. But, on the other architectures I have since
> learned it is a bit different.
>
> It looks like actually most arch's don't re-define set_memory_*, and so all of
> the frob_* functions are actually just noops. In which case allocating RWX is
> needed to make it work at all, because that is what the allocation is going to
> stay at. So in these archs, set_memory_nx won't solve it because it will do
> nothing.
>
> On x86 I think you cannot get rid of disable_ro_nx fully because there is the
> changing of the permissions on the directmap as well. You don't want some other
> caller getting a page that was left RO when freed and then trying to write to
> it, if I understand this.
>
Exactly.
After slightly more thought, I suggest renaming VM_IMMEDIATE_UNMAP to
VM_MAY_ADJUST_PERMS or similar. It would have the semantics you want,
but it would also call some arch hooks to put back the direct map
permissions before the flush. Does that seem reasonable? It would
need to be hooked up that implement set_memory_ro(), but that should
be quite easy. If nothing else, it could fall back to set_memory_ro()
in the absence of a better implementation.