Re: [PATCH V3 10/10] x86/pks: Add PKS test code
From: Ira Weiny
Date: Thu Dec 17 2020 - 23:06:12 EST
On Thu, Dec 17, 2020 at 12:55:39PM -0800, Dave Hansen wrote:
> On 11/6/20 3:29 PM, ira.weiny@xxxxxxxxx wrote:
> > + /* Arm for context switch test */
> > + write(fd, "1", 1);
> > +
> > + /* Context switch out... */
> > + sleep(4);
> > +
> > + /* Check msr restored */
> > + write(fd, "2", 1);
>
> These are always tricky. What you ideally want here is:
>
> 1. Switch away from this task to a non-PKS task, or
> 2. Switch from this task to a PKS-using task, but one which has a
> different PKS value
Or both...
>
> then, switch back to this task and make sure PKS maintained its value.
>
> *But*, there's no absolute guarantee that another task will run. It
> would not be totally unreasonable to have the kernel just sit in a loop
> without context switching here if no other tasks can run.
>
> The only way you *know* there is a context switch is by having two tasks
> bound to the same logical CPU and make sure they run one after another.
Ah... We do that.
...
+ CPU_ZERO(&cpuset);
+ CPU_SET(0, &cpuset);
+ /* Two processes run on CPU 0 so that they go through context switch. */
+ sched_setaffinity(getpid(), sizeof(cpu_set_t), &cpuset);
...
I think this should ensure that both the parent and the child run on CPU 0.
At least according to the man page it should:
<man>
A child created via fork(2) inherits its parent's CPU affinity mask.
</man>
Perhaps a better method would be to synchronize the two processes more tightly,
to ensure that they really are running at the 'same time' and forcing the
context switch.
> This just gets itself into a state where it *CAN* context switch and
> prays that one will happen.
Not sure what you mean by 'This'? Do you mean that running on the same CPU
will sometimes not force a context switch? Or do you mean that the sleeps
could be badly timed and the two processes could run one after the other on
the same CPU? The latter is AFAICT the most likely case.
>
> You can also run a bunch of these in parallel bound to a single CPU.
> That would also give you higher levels of assurance that *some* context
> switch happens at sleep().
I think more cycles is a good idea for sure. But I'm more comfortable with
forcing the test to be more synchronized, so that it is actually running in
the order we think/want it to be.
>
> One critical thing with these tests is to sabotage the kernel and then
> run them and make *sure* they fail. Basically, if you screw up, do they
> actually work to catch it?
I'll try to come up with a more stressful test.
Ira