Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in

From: Nick Desaulniers
Date: Tue Mar 07 2023 - 16:58:57 EST


On Tue, Mar 7, 2023 at 1:35 PM Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>
> > Ok, so here's a *totally* untested and mindless patch to maybe fix
> > what I dislike about that resctrl code.
> >
> > Does it fix the code generation issue? I have no idea. But this is
> > what I would suggest is the right answer, without actually knowing the
> > code any better, and just going on a mindless rampage.
> >
> > It seems to compile for me, fwiw.
>
> Beyond compiling it boots and passes the tools/testing/selftests/resctrl test suite.
>
> Tested-by: Tony Luck <tony.luck@xxxxxxxxx>

LGTM; reloading of current becomes irrelevant now that we're reusing
the existing variable `next_p`.

Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>

Might be nice to tag this for stable.

Cc: <stable@xxxxxxxxxxxxxxx>

And credit Stephane who did a nice job tracking this down and having a
concise reproducer.

Reported-by: Stephane Eranian <eranian@xxxxxxxxxx>

Perhaps relevant links:

Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@xxxxxxxxxx/
Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/

Consider reusing parts of Stephane's message from the initial Link above?

Thanks for the patch.

---

While this issue was specific to the usage of `current` (implemented
as a macro that uses `this_cpu_read_stable`), I wonder if we might hit
issues again in the future (at least on x86, using the "p" constraint)
in code that:
1. uses this_cpu_read_stable() to access a per-CPU variable
2. updates that per-CPU variable
3. uses this_cpu_read_stable() to access the variable again, hoping to
get the new value rather than the old.
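
To make the 1-2-3 pattern above concrete, here is a rough userspace
model in C. It is not kernel code and does not use the real
this_cpu_read_stable(); the names `pcpu_var`, `read_stable`,
`read_fresh` and `run_pattern` are made up for illustration. The
"stable" accessor emulates what the compiler is allowed to do with such
a read (reuse an earlier result) by caching the first value it sees,
while the "fresh" accessor forces a real load via a volatile access.

```c
/* Hypothetical stand-in for a per-CPU variable (e.g. the one behind `current`). */
static long pcpu_var;

/*
 * Models a "stable" read: the compiler may assume repeated reads yield
 * the same value, so we emulate that by remembering the first result.
 */
struct stable_reader {
	int have_value;
	long value;
};

static long read_stable(struct stable_reader *r)
{
	if (!r->have_value) {
		r->value = pcpu_var;
		r->have_value = 1;
	}
	return r->value;
}

/* Models a read that always goes back to memory. */
static long read_fresh(void)
{
	return *(volatile long *)&pcpu_var;
}

struct result {
	long before;	/* step 1: stable read before the update */
	long stale;	/* step 3: stable read after the update */
	long fresh;	/* what a non-stable read would have returned */
};

static struct result run_pattern(long old_value, long new_value)
{
	struct stable_reader r = { 0, 0 };
	struct result res;

	pcpu_var = old_value;
	res.before = read_stable(&r);	/* 1. stable read */
	pcpu_var = new_value;		/* 2. update the variable */
	res.stale = read_stable(&r);	/* 3. stable read again: old value */
	res.fresh = read_fresh();	/* a real load sees the update */
	return res;
}
```

In this model, run_pattern(1, 2) yields before == 1, stale == 1 and
fresh == 2. With the real macro, whether the second read is actually
reused depends on what the compiler decides to do, which is what makes
the pattern easy to miss.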

I guess that this_cpu_read should be used rather than
this_cpu_read_stable? Maybe we can beef up the comment in
arch/x86/include/asm/percpu.h to warn about this? The sentence about
get_thread_info() being a user of this_cpu_read_stable() seems
outdated/due for a refresh?

Is __switch_to the only place that should be updating current? Are
there variables other than current that might be afflicted by
the 1,2,3 pattern I mention above?

current_top_of_stack() uses this_cpu_read_stable() for instance.
Perhaps there could be a caller that measures the stack depth, grows
the stack, then rereads the wrong value?
--
Thanks,
~Nick Desaulniers