Re: [PATCH v2 2/2] vmalloc: Remove work as from vfree path

From: Andy Lutomirski
Date: Tue May 21 2019 - 13:03:24 EST


On Tue, May 21, 2019 at 9:51 AM Edgecombe, Rick P
<rick.p.edgecombe@xxxxxxxxx> wrote:
>
> On Tue, 2019-05-21 at 09:17 -0700, Andy Lutomirski wrote:
> > On Mon, May 20, 2019 at 4:39 PM Rick Edgecombe
> > <rick.p.edgecombe@xxxxxxxxx> wrote:
> > > From: Rick Edgecombe <redgecombe.lkml@xxxxxxxxx>
> > >
> > > Calling vm_unmap_aliases() in vm_remove_mappings() could potentially be
> > > a lot of work to do on a free operation. Simply flushing the TLB instead
> > > of the whole vm_unmap_aliases() operation makes the frees faster and
> > > pushes the heavy work to happen on allocation where it would be more
> > > expected. In addition to the extra work, vm_unmap_aliases() takes some
> > > locks including a long hold of vmap_purge_lock, which will make all
> > > other VM_FLUSH_RESET_PERMS vfrees wait while the purge operation happens.
> > >
> > > Lastly, page_address() can involve locking and lookups on some
> > > configurations, so skip calling this by exiting out early when
> > > !CONFIG_ARCH_HAS_SET_DIRECT_MAP.
> >
> > Hmm. I would have expected that the major cost of vm_unmap_aliases()
> > would be the flush, and at least informing the code that the flush
> > happened seems valuable. So I would guess that this patch is actually
> > a loss in throughput.
> >
> You are probably right about the flush taking the longest. The original
> idea of using it was exactly to improve throughput by saving a flush.
> However with vm_unmap_aliases() the flush will be over a larger range
> than before for most architectures since it will likely span from the
> module space to vmalloc. From poking around the sparc TLB flush history, I
> guess the lazy purges used to be (still are?) a problem for them
> because it would try to flush each page individually for some CPUs. Not
> sure about all of the other architectures, but for any implementation
> like that, using vm_unmap_aliases() would turn an occasional long
> operation into a more frequent one.
>
> On x86, it shouldn't be a problem to use it. We already used to call
> this function several times around an exec permission vfree.
>
> I guess it's a tradeoff that depends on how fast large range TLB flushes
> usually are compared to small ones. I am OK dropping it if it doesn't
> seem worth it.

On x86, a full flush is probably not much slower than just flushing a
page or two -- the main cost is in the TLB refill. I don't know about
other architectures. I would drop this patch unless you have numbers
suggesting that it's a win.
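
To make the tradeoff above concrete, here is a toy cost model, not kernel code and not taken from the patch. The struct, the function names and the cost units are all hypothetical; real costs would have to come from measurements on the architecture in question. It only illustrates why an implementation that invalidates a large range one page at a time can lose to a single full flush, while for a page or two the ranged flush may still come out ahead.

```c
#include <stdbool.h>

/*
 * Hypothetical per-architecture cost parameters, in arbitrary units.
 * These are assumptions for illustration, not measured values.
 */
struct flush_costs {
	unsigned long per_page;	/* cost of invalidating a single page */
	unsigned long full;	/* cost of a full TLB flush, refill included */
};

/* Modelled cost of flushing nr_pages one page at a time. */
static unsigned long ranged_flush_cost(const struct flush_costs *c,
				       unsigned long nr_pages)
{
	return c->per_page * nr_pages;
}

/* True if the full flush is modelled as no more expensive than the ranged one. */
static bool full_flush_is_cheaper(const struct flush_costs *c,
				  unsigned long nr_pages)
{
	return c->full <= ranged_flush_cost(c, nr_pages);
}
```

Under this model, a range spanning from module space to vmalloc is large enough that any per-page implementation would favor the full flush, which is the case described for sparc above. Whether that holds on a given CPU is exactly what benchmark numbers would need to show.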