Re: [RFC patch] introduce sys_membarrier(): process-wide memory barrier (v9)

From: Mathieu Desnoyers
Date: Thu Feb 25 2010 - 12:51:30 EST


* Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
> On Thu, 2010-02-25 at 11:53 -0500, Mathieu Desnoyers wrote:
>
> > > It would be very trivial compared to the process-private case. Just IPI
> > > all CPUs. It would allow older kernels to work with newer process based
> > > apps as they get implemented. But... not a really big deal I suppose.
> >
> > This is actually what I did in v1 of the patch, but this implementation met
> > resistance from the RT people, who were concerned about the impact on RT tasks
> > of a lower priority process doing lots of sys_membarrier() calls. So if we want
> > to do other-process-aware sys_membarrier(), we would have to iterate on all
> > cpus, for every running process shared memory maps and see if there is something
> > shared with all shm of the current process. This is clearly not as trivial as
> > just broadcasting the IPI to all cpus.
>
> Right, it may require another syscall or parameter to let the tasks
> register a shared page. Then have some mechanism to find a way to
> quickly check if a CPU is running a process with that page.

Well, either we explicitly require the task to register its shared pages, which
could be error-prone in terms of API, or simply consider all pages that are
shared between the current process and every process running on other CPUs. That
would be much simpler to use from a user-level perspective I think. The
downside is that it may generate a few IPIs to processes that happen not to need
them, but we are talking of a relatively small overhead to processes that we are
interacting with anyway. It's not like we would be interrupting completely
unrelated RT threads. I'm just not sure if it would be valid to exclude COW and
RO shared pages from that check. For instance, if a page is mapped as RO on one
process and RW on another, then we have to synchronize these processes. Similar
weird cases could happen if a memory map is changed from RW to RO right after
the content is modified, and then we need to execute sys_membarrier: we might
miss a memory map that actually needs to be synchronized.
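
To illustrate the "consider all shared pages" option, here is a rough
user-space model of the target selection. It is only a sketch, not kernel code:
struct shared_map, struct proc_model, shares_any_mapping() and ipi_targets() are
names I am making up here, each process is reduced to a list of shared-object
ids, and at most 64 CPUs are modelled. Protection is ignored on purpose, given
the RO/RW concern above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_prot { MODEL_RO, MODEL_RW };

struct shared_map {
	uint64_t object_id;	/* identifies the backing shared object */
	enum model_prot prot;	/* protection of this process's mapping */
};

struct proc_model {
	const struct shared_map *maps;
	size_t nr_maps;
};

/*
 * True if the two processes map at least one common shared object.
 * The prot field is deliberately not consulted: a mapping that is RO
 * in one process may be RW in the other.
 */
static bool shares_any_mapping(const struct proc_model *a,
			       const struct proc_model *b)
{
	for (size_t i = 0; i < a->nr_maps; i++)
		for (size_t j = 0; j < b->nr_maps; j++)
			if (a->maps[i].object_id == b->maps[j].object_id)
				return true;
	return false;
}

/*
 * Build a bitmask of CPUs that would receive an IPI: bit n is set when
 * the process modelled as running on CPU n shares a mapping with the
 * caller. A NULL entry models a CPU with nothing relevant running.
 */
static uint64_t ipi_targets(const struct proc_model *caller,
			    const struct proc_model *const *running,
			    size_t nr_cpus)
{
	uint64_t mask = 0;

	for (size_t cpu = 0; cpu < nr_cpus && cpu < 64; cpu++)
		if (running[cpu] && shares_any_mapping(caller, running[cpu]))
			mask |= UINT64_C(1) << cpu;
	return mask;
}
```

CPUs running a process with no mapping in common with the caller stay out of
the mask, which is the property the RT people care about.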

And yes, as you say, we'd have to find a way to quickly compare shared-memory
maps from two processes. The dumb approach, O(n^2), would be to compare these
entries element by element. Assuming a relatively small number of shared mmaps,
this could make sense. Otherwise we'd have to construct a hash table to
accelerate the lookup, but that adds either a runtime overhead if we
construct it within sys_membarrier() or a memory overhead if we choose to add it
to the task struct (which I'd really like to avoid).
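
To make the trade-off concrete, here is a minimal user-space sketch of both
lookups. It assumes each shared mapping can be reduced to a 64-bit identifier,
which is an assumption of mine and not something the patch provides. The
function names are hypothetical, and the hashed variant builds its table on
every call, so it models the "construct it within sys_membarrier()" option.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Naive check: compare every entry of a against every entry of b. */
static bool maps_intersect_naive(const uint64_t *a, size_t na,
				 const uint64_t *b, size_t nb)
{
	for (size_t i = 0; i < na; i++)
		for (size_t j = 0; j < nb; j++)
			if (a[i] == b[j])
				return true;
	return false;
}

/* Multiplicative hash reduced to a power-of-two table size. */
static size_t slot_of(uint64_t id, size_t cap)
{
	return (size_t)((id * UINT64_C(0x9E3779B97F4A7C15)) >> 32) & (cap - 1);
}

/*
 * Hashed check: build a temporary open-addressing table from a, then
 * probe it with each entry of b. Falls back to the naive check if the
 * temporary table cannot be allocated.
 */
static bool maps_intersect_hashed(const uint64_t *a, size_t na,
				  const uint64_t *b, size_t nb)
{
	size_t cap = 8;
	bool found = false;

	while (cap < 2 * na)	/* keep the load factor at or below 1/2 */
		cap *= 2;

	uint64_t *slots = calloc(cap, sizeof(*slots));
	bool *used = calloc(cap, sizeof(*used));

	if (!slots || !used) {
		free(slots);
		free(used);
		return maps_intersect_naive(a, na, b, nb);
	}

	for (size_t i = 0; i < na; i++) {
		size_t h = slot_of(a[i], cap);

		while (used[h] && slots[h] != a[i])
			h = (h + 1) & (cap - 1);	/* linear probing */
		used[h] = true;
		slots[h] = a[i];
	}

	for (size_t j = 0; j < nb && !found; j++) {
		size_t h = slot_of(b[j], cap);

		while (used[h]) {
			if (slots[h] == b[j]) {
				found = true;
				break;
			}
			h = (h + 1) & (cap - 1);
		}
	}

	free(slots);
	free(used);
	return found;
}
```

For a handful of mappings per process, the naive loop avoids the allocation
entirely, which is why it may be good enough in practice.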

But... whichever way we choose, we can extend the system call flags and parameters
as needed, so I think it really should not be part of this initial
implementation.

Thanks,

Mathieu

>
> -- Steve
>
>

--
Mathieu Desnoyers
Operating System Efficiency Consultant
EfficiOS Inc.
http://www.efficios.com