Re: [PATCH 0/3] TLB flush multiple pages per IPI v5

From: Ingo Molnar
Date: Tue Jun 09 2015 - 06:32:47 EST



* Mel Gorman <mgorman@xxxxxxx> wrote:

> > So have you explored the possibility of significantly simplifying your
> > patch-set by only deferring the flushing, and doing a simple TLB flush on the
> > remote CPU?
>
> Yes. At one point I looked at range flushing but it is not a good idea.

My suggestion wasn't range-flushing, but a simple all-or-nothing batched flush of
user-space TLBs.

> The ranges that reach the end of the LRU are too large to be useful except in
> the ideal case of a workload that sequentially accesses memory. Flushing the
> full TLB has an unpredictable cost. [...]

Why would it have unpredictable cost? We flush the TLB on every process context
switch. Yes, it's somewhat workload dependent, but the performance profile with
batching is so different that it has to be re-measured anyway.

> With a full flush we clear entries we know were recently accessed and may have
> to be looked up again and we do this every 32 mapped pages that are reclaimed.
> In the ideal case of a sequential mapped reader it would not matter as the
> entries are not needed so we would not see the cost at all. Other workloads will
> have to do a refill that was not necessary before this series. The cost of the
> refill will depend on the CPU and whether the lookup information is still in the
> CPU cache or not. That means measuring the full impact of your proposal is
> impossible as it depends heavily on the workload, the timing of its interaction
> with kswapd in particular, the state of the CPU cache and the cost of refills
> for the CPU.
>
> I agree with you in that it would be a simpler series and the actual flush
> would probably be faster but the downsides are too unpredictable for a series
> that is primarily about reducing the number of IPIs.

Sorry, I don't buy this, at all.

Please measure this: the code would become a lot simpler, and I'm not convinced
that we need pfn (or struct page) based or even range based flushing.

I.e. please first implement the simplest remote batching variant, then complicate
it if the numbers warrant it. Not the other way around. It's not like the VM code
needs the extra complexity!
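
To make the "simplest remote batching variant" concrete, here is a rough
user-space model of the bookkeeping, not kernel code. All names (sim_flush_batch,
sim_batch_add, sim_batch_flush, sim_unbatched_ipis) are hypothetical, and the
batch size of 32 is simply taken from the "every 32 mapped pages" figure quoted
above. It only counts IPIs; it says nothing about the refill cost under dispute,
which still has to be measured on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed batch size: flush after this many unmapped pages. */
#define SIM_BATCH 32

/* Hypothetical per-reclaimer batching state (up to 64 simulated CPUs). */
struct sim_flush_batch {
	uint64_t cpumask;      /* CPUs that may still cache stale translations */
	unsigned int pending;  /* pages unmapped since the last flush */
	unsigned long ipis;    /* total IPIs sent so far */
};

/* Count the set bits in a 64-bit mask. */
static unsigned int sim_popcount64(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Model one full user-space TLB flush on every CPU in the mask:
 * one IPI per CPU, no pfn or range information carried along.
 */
static void sim_batch_flush(struct sim_flush_batch *b)
{
	b->ipis += sim_popcount64(b->cpumask);
	b->cpumask = 0;
	b->pending = 0;
}

/* Defer the flush for one unmapped page; flush once the batch is full. */
static void sim_batch_add(struct sim_flush_batch *b, uint64_t mm_cpumask)
{
	assert(b->pending < SIM_BATCH);

	b->cpumask |= mm_cpumask;
	if (++b->pending == SIM_BATCH)
		sim_batch_flush(b);
}

/* Baseline for comparison: one IPI per CPU for every unmapped page. */
static unsigned long sim_unbatched_ipis(unsigned int pages, uint64_t mm_cpumask)
{
	return (unsigned long)pages * sim_popcount64(mm_cpumask);
}
```

In this model, unmapping 64 pages of an mm that ran on two CPUs costs 4 IPIs
with batching versus 128 without. The only state carried between pages is a
cpumask and a counter, which is the simplification being asked for.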

Thanks,

Ingo