Re: [RFC PATCH 0/3] Implement getcpu_cache system call

From: Josh Triplett
Date: Mon Jan 11 2016 - 21:46:09 EST


On Tue, Jan 12, 2016 at 12:49:18AM +0000, Mathieu Desnoyers wrote:
> ----- On Jan 11, 2016, at 6:03 PM, Josh Triplett josh@xxxxxxxxxxxxxxxx wrote:
>
> > On Mon, Jan 11, 2016 at 10:38:28PM +0000, Seymour, Shane M wrote:
> >> I have some concerns and suggestions for you about this.
> >>
> >> What's to stop someone in user space from requesting an arbitrarily large number
> >> of CPU # cache locations that the kernel needs to allocate memory to track and
> >> each time the task migrates to a new CPU it needs to update them all? Could you
> >> use it to dramatically slow down a system/task switching? Should there be a
> >> ulimit type value or a sysctl setting to limit the number that you're allowed
> >> to register per-task?
> >
> > The documented behavior of the syscall allows only one location per
> > thread, so the kernel can track that one and only address rather easily
> > in the task_struct. Allowing dynamic allocation definitely doesn't seem
> > like a good idea.
>
> The current implementation now allows more than one location per
> thread. Which piece of documentation states that only one location
> per thread is allowed ? This was indeed the case for the prior
> implementations, but I moved to implementing a linked-list of
> cpu_cache areas per thread to allow the getcpu_cache system call to
> be used by more than a single shared object within a given program.

Ah, I missed that change.

> Without the linked list, as soon as more than one shared object try
> to register their cache, the first one will prohibit all others from
> doing so.
>
> We could perhaps try to document that this system call should only
> ever be used by *libc, and all libraries and applications should
> then use the libc TLS cache variable, but it seems rather fragile,
> and any app/lib could try to register its own cache.

That does seem a bit fragile, true; on the other hand, the linked-list
approach would allow userspace to allocate an unbounded amount of kernel
memory, without any particular control over it. That doesn't seem
reasonable. Introducing an rlimit or similar for this seems like
massive overkill, and hardcoding a fixed limit breaks the 0-1-infinity
rule.
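
To make the concern concrete, here is a rough userspace model of a
per-thread linked list of registered locations. This is only a sketch:
the names (cache_node, task_model, model_register, model_migrate) are
hypothetical and are not taken from the patch. It just illustrates that
each registration costs one more allocation and each migration has to
touch every registered location.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct cache_node {
	int32_t *uaddr;			/* location that must be kept up to date */
	struct cache_node *next;
};

struct task_model {
	struct cache_node *head;	/* per-thread list of registered locations */
	size_t nr_nodes;		/* bookkeeping memory grows with this count */
};

/* Every registration allocates one more node; nothing bounds the count. */
static int model_register(struct task_model *t, int32_t *uaddr)
{
	struct cache_node *n;

	assert(t != NULL && uaddr != NULL);
	n = malloc(sizeof(*n));
	if (!n)
		return -ENOMEM;
	n->uaddr = uaddr;
	n->next = t->head;
	t->head = n;
	t->nr_nodes++;
	return 0;
}

/* On migration every registered location is written: O(nr_nodes) work. */
static size_t model_migrate(struct task_model *t, int32_t cpu)
{
	struct cache_node *n;
	size_t writes = 0;

	for (n = t->head; n; n = n->next) {
		*n->uaddr = cpu;
		writes++;
	}
	return writes;
}
```

In this model, a thread that registers N locations holds N nodes and
pays N writes on every migration, with no limit on N.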

Given that any registered location will always provide the same value,
allowing only a single registration doesn't seem *too* problematic;
libc-based programs can use the libc implementation, and non-libc-based
programs can register a location themselves. And users of this API will
likely already want to use some TLS mechanism, which already interacts
heavily with libc (set_thread_area/clone).

Allowing only one registration at a time seems preferable to introducing
another way to allocate kernel resources on a process's behalf.
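
For comparison, a single-registration policy could be modeled roughly
as below. Again, this is only a sketch under assumptions: the names
(task_single, single_register, single_migrate) are made up, and both
the -EBUSY return for a second, different address and the idempotent
handling of a repeated address are my guesses at reasonable semantics,
not something the patch specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one pointer per thread, no dynamic allocation. */
struct task_single {
	int32_t *cpu_cache;	/* NULL until a location is registered */
};

/*
 * Accept the first registration. Re-registering the same address is
 * treated as a no-op; a different address is refused (assumed -EBUSY).
 */
static int single_register(struct task_single *t, int32_t *uaddr)
{
	assert(t != NULL && uaddr != NULL);
	if (t->cpu_cache == uaddr)
		return 0;
	if (t->cpu_cache != NULL)
		return -EBUSY;
	t->cpu_cache = uaddr;
	return 0;
}

/* On migration there is at most one location to update: O(1) work. */
static void single_migrate(struct task_single *t, int32_t cpu)
{
	if (t->cpu_cache)
		*t->cpu_cache = cpu;
}
```

The per-thread state here is a single pointer, so the cost per thread
stays constant no matter what userspace does.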

- Josh Triplett