Re: [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry

From: Rik van Riel
Date: Fri May 01 2015 - 15:12:29 EST


On 05/01/2015 02:40 PM, Ingo Molnar wrote:

> Or we could do that in the syscall path with a single store of a
> constant flag to a location in the task struct. We have a number of
> natural flags that get written on syscall entry, such as:
>
> pushq_cfi $__USER_DS /* pt_regs->ss */
>
> That goes to a constant location on the kernel stack. On return from
> system calls we could write 0 to that location.
>
> So the remote CPU would have to do a read of this location. There are
> two cases:
>
> - If it's 0, then it has observed quiescent state on that CPU. (It
> does not have to be atomics anymore, as we'd only observe the value
> and MESI coherency takes care of it.)

That should do the trick.

> - If it's not 0 then the remote CPU is not executing user-space code
> and we can install (remotely) a TIF_NOHZ flag in it and expect it
> to process it either on return to user-space or on a context
> switch.

I may have to think about this a little more, but
it seems like it should work.
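
Roughly, as a userspace C11 model of the two cases above. All
names here (cpu_state, in_kernel, remote_flags, FAKE_TIF_NOHZ)
are made up for illustration and are not existing kernel
symbols. The sketch uses default seq_cst atomics for
simplicity, which is stronger and costlier than the plain
stores discussed, and it does not try to close the window
where the target returns to user space between the remote
read and the remote flag store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FAKE_TIF_NOHZ 0x1UL	/* hypothetical stand-in for TIF_NOHZ */

/* Hypothetical per-CPU state; not an existing kernel structure. */
struct cpu_state {
	atomic_ulong in_kernel;		/* nonzero from syscall entry to return */
	atomic_ulong remote_flags;	/* bits other CPUs are allowed to set */
};

/* Syscall entry: one store of a constant. */
static inline void syscall_entry(struct cpu_state *s)
{
	atomic_store(&s->in_kernel, 1);
}

/*
 * Return to user space: pick up remotely set flags, then write 0.
 * Returns true if the slow path (TIF_NOHZ work) is needed.
 */
static inline bool return_to_user(struct cpu_state *s)
{
	unsigned long pending;

	assert(atomic_load(&s->in_kernel) != 0);
	pending = atomic_exchange(&s->remote_flags, 0);
	atomic_store(&s->in_kernel, 0);
	return (pending & FAKE_TIF_NOHZ) != 0;
}

/*
 * Remote CPU: true means the target was seen in user space, i.e. a
 * quiescent state. Otherwise ask it to report on its way out.
 */
static inline bool remote_check_quiescent(struct cpu_state *s)
{
	if (atomic_load(&s->in_kernel) == 0)
		return true;
	atomic_fetch_or(&s->remote_flags, FAKE_TIF_NOHZ);
	return false;
}
```

A remote caller that gets false back would then wait for the
target to pass through return_to_user() or a context switch.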

Can we use a separate byte in the flags word for
flags that can get set remotely, so we can do
stores and clearing of local-only flags without
atomic instructions?
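
Something along these lines, as a userspace C11 sketch. The
struct and helper names are hypothetical, and this is not how
thread_info flags are laid out today. The point is only that
the owner's byte and the remotely written byte are distinct
memory locations, so the owner can use plain stores on its
byte while remote writers use atomic ops on theirs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: one byte per class of writer. */
struct split_flags {
	uint8_t local;		/* written only by the owning CPU */
	_Atomic uint8_t remote;	/* may be set by any CPU */
};

static_assert(sizeof(struct split_flags) <= sizeof(uint32_t),
	      "both bytes still fit in one 32-bit flags word");

/* Owner only: plain read-modify-write, no atomic instruction. */
static inline void local_set(struct split_flags *f, uint8_t bits)
{
	f->local = (uint8_t)(f->local | bits);
}

/* Owner only: plain clear of local-only bits. */
static inline void local_clear(struct split_flags *f, uint8_t bits)
{
	f->local = (uint8_t)(f->local & ~bits);
}

/* Any CPU: atomic OR, since several writers may race on this byte. */
static inline void remote_set(struct split_flags *f, uint8_t bits)
{
	atomic_fetch_or(&f->remote, bits);
}

/* Owner: fetch and clear whatever other CPUs have requested. */
static inline uint8_t remote_take(struct split_flags *f)
{
	return atomic_exchange(&f->remote, 0);
}
```

The cost moves to the remote setters and to the owner's
occasional remote_take(); the common set/clear of local-only
flags stays a plain byte store.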

> This way, unless I'm missing something, reduces the overhead to a
> single store to a hot cacheline on return-to-userspace - which
> instruction if we place it well might as well be close to zero cost.
> No syscall entry cost. Slow-return cost only in the (rare) case of
> someone using synchronize_rcu().

I think that should take care of the RCU aspect of
nohz_full.

--
All rights reversed