Re: crazy idea: big percpu lock (Re: task isolation)
From: Andy Lutomirski
Date: Fri Oct 09 2015 - 14:56:38 EST

On Fri, Oct 9, 2015 at 2:27 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Thu, 8 Oct 2015, Andy Lutomirski wrote:
>> I want to propose a new primitive that might go a long way toward
>> solving this issue. The new primitive would be called the "big percpu
>> lock".
>
> It took us 15+ years to get rid of the "Big Kernel Lock", so we really
> don't want to add a new "Big XXX Lock". We have enough pain already
> with preempt_disable() and local_irq_disable() which are basically
> "Big CPU Locks".
>
> Don't ever put BIG and LOCK into one context, really.

I knew I shouldn't have called it that. The basic useful idea (if
it's actually useful) is to have an efficient way to poke another
CPU's percpu data structures in cases where we're reasonably confident
that locality doesn't matter. And maybe even doing anything lock-like
is a bad idea for the problems I'm trying to help solve.

lru_add_drain_all() is an example. That function already effectively
takes a massive lock. It reads (racily, but it doesn't matter) remote
percpu data (which already forces the cachelines to become shared) and
then, if needed, it schedules work on that CPU and waits for it. It
seems like a very one-sided lock (no overhead on the victim CPU),
except that it requires that the victim schedule in order to make
forward progress. There may even be a forced IPI in there, although I
haven't dug far enough to find it.
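
To make the shape of that pattern concrete, here is a rough userspace
sketch using C11 atomics. It is not the kernel's implementation: the
names (percpu_batch, drain_all_request, victim_run_work,
drain_all_done) and the fixed NR_CPUS are made up for illustration, and
"scheduling work" is reduced to setting a flag that the victim has to
notice on its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for one CPU's batch of pages not yet on the LRU. */
struct percpu_batch {
	atomic_int  pending;     /* pages sitting in this CPU's batch */
	atomic_bool work_queued; /* drain work queued here, not yet run */
};

static struct percpu_batch batch[NR_CPUS];

/*
 * Requester: racy peek at every CPU's batch, queue work only where needed.
 * Returns how many CPUs were asked to drain.
 */
static int drain_all_request(void)
{
	int poked = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		/* Relaxed load: a stale answer is tolerated, as in the racy check. */
		if (atomic_load_explicit(&batch[cpu].pending,
					 memory_order_relaxed) > 0) {
			atomic_store(&batch[cpu].work_queued, true);
			poked++;
		}
	}
	return poked;
}

/* Victim: makes progress only when that CPU gets around to running the work. */
static void victim_run_work(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (atomic_exchange(&batch[cpu].work_queued, false))
		atomic_store(&batch[cpu].pending, 0); /* "drain" the batch */
}

/* Requester: true once every queued drain has actually run. */
static bool drain_all_done(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (atomic_load(&batch[cpu].work_queued))
			return false;
	return true;
}
```

The requester pays for the remote reads, CPUs with nothing pending are
left alone, and drain_all_done() cannot become true until each poked
CPU has run victim_run_work() itself. That last part is the dependency
on the victim scheduling.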

Ideally, under memory pressure, we'd have a way to just grab the
remote LRU list directly. We could easily have coarse-grained or
fine-grained locking to enable that, but it might be better to have
just one instance of the resulting barriers on user and idle entry and
exit rather than doing it on every LRU access.

flush_tlb_kernel_range() and such are also examples, and they will
currently kill isolation, but maybe we should just have a way to mark
the kernel TLB as idle when we enter user mode and have a way to
recognize that we need a flush when we go back to kernel (or maybe
even NMI) mode.
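
A minimal sketch of that deferred-flush idea, again in userspace C11
and under loud assumptions: the state bits, the function names
(enter_user, enter_kernel, flush_kernel_tlb_all) and the counters that
stand in for real TLB flushes and IPIs are all hypothetical, and NMI
entry is ignored entirely.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 4            /* hypothetical CPU count for this sketch */
#define CPU_IN_KERNEL     1u /* CPU may be using kernel TLB entries */
#define CPU_FLUSH_PENDING 2u /* CPU must flush before using kernel mappings */

static atomic_uint cpu_state[NR_CPUS]; /* 0 = in user mode, nothing pending */
static int local_flushes[NR_CPUS];     /* simulated TLB flushes per CPU */
static int ipis_sent[NR_CPUS];         /* simulated flush IPIs per CPU */

/* Kernel exit: mark this CPU's kernel TLB as idle. */
static void enter_user(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	/* FLUSH_PENDING is only set while the CPU is in user mode, so 0 is safe. */
	atomic_store(&cpu_state[cpu], 0u);
}

/* Kernel entry: pick up any flush that was deferred while in user mode. */
static void enter_kernel(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	unsigned int old = atomic_exchange(&cpu_state[cpu], CPU_IN_KERNEL);

	if (old & CPU_FLUSH_PENDING)
		local_flushes[cpu]++; /* stands in for a local TLB flush */
}

/* Flusher: interrupt CPUs that are in the kernel, mark the rest for later. */
static void flush_kernel_tlb_all(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		unsigned int old = atomic_load(&cpu_state[cpu]);

		for (;;) {
			if (old & CPU_IN_KERNEL) {
				ipis_sent[cpu]++;     /* stands in for a flush IPI */
				local_flushes[cpu]++; /* victim flushes right away */
				break;
			}
			/* On failure, old is reloaded and the state is rechecked. */
			if (atomic_compare_exchange_weak(&cpu_state[cpu], &old,
							 old | CPU_FLUSH_PENDING))
				break;
		}
	}
}
```

In this sketch a CPU sitting in user mode never sees an IPI. It pays
for one atomic exchange on each kernel entry and flushes only if
someone asked for it in the meantime.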

--Andy