Re: [RFCv2 03/10] x86: Introduce userspace API to handle per-thread features

From: Thomas Gleixner
Date: Sun May 15 2022 - 05:06:08 EST


On Sat, May 14 2022 at 23:06, Edgecombe, Rick P wrote:
> On Sat, 2022-05-14 at 10:37 +0200, Thomas Gleixner wrote:
>> > "On success, arch_prctl() returns positive values; on error, -1 is
>> > returned, and errno is set to indicate the error."
>>
>> Why?
>>
>> prctl(GET, &out)
>>
>> is the pattern used all over the place.
>
> It seems better to me, but we also need to pass something in.
>
> The idea of the "enable" operation is that userspace would pass in all
> the features that it wants in one call, and then find out back what was
> successfully enabled. So unlike the other arch_prctl()s, it wants to
> pass something in AND get a result in one arch_prctl() call. It doesn't
> need to check what is supported ahead of time. Since these enabling
> operations can fail (OOM, etc), userspace has to handle unexpected per-
> feature failure anyway. So it just blindly asks for what it wants.

I'm not convinced at all, that this wholesale enabling of independent
features makes any sense. That would require:

- that all features are strict binary of/off which is alredy not the
case with LAM due to the different mask sizes.

- that user space knows at some potentially early point of process
startup which features it needs. Some of them might be requested
later when libraries are loaded, but that would be too late for
others.

Aside of that, if you lump all these things together, what's the
semantics of this feature lump in case of failure:

Try to enable the rest when one fails, skip the rest, roll back?

Also such a feature lump results in a demultiplexing prctl() which is
code wise always inferior to dedicated controls.

> Any objections to having it write back the result in the same
> structure?

Why not.

> Otherwise, the option that used to be used here was a "status"
> arch_prctl(), which was called separately to find out what actually got
> enabled after an "enable" call. That fit with the GET/SET semantics
> already in place.
>
> I guess we could also get rid of the batch enabling idea, and just have
> one "enable" call per feature too. But then it is syscall heavy.

This is not a runtime hotpath problem. Those prctls() happen once when
the process starts, so having three which are designed for the
individual purpose instead of one ill defined is definitely the better
choice.

Premature optimization is never a good idea. Keep it simple is the right
starting point.

If it really turns out to be something which matters, then you can
provide a batch interface later on if it makes sense to do so, but see
above.

Thanks,

tglx