Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()

From: Uros Bizjak
Date: Wed Oct 18 2023 - 08:15:06 EST


On Wed, Oct 18, 2023 at 12:54 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>
>
>
> > On Oct 18, 2023, at 12:04 PM, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >
> > Solved.
> >
> > All that is needed is to patch cpu_init() from
> > arch/x86/kernel/cpu/common.c with:
> >
> > --cut here--
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index b14fc8c1c953..61b6fcdf6937 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -2228,7 +2232,7 @@ void cpu_init_exception_handling(void)
> > */
> > void cpu_init(void)
> > {
> > - struct task_struct *cur = current;
> > + struct task_struct *cur = this_cpu_read_stable(pcpu_hot.current_task);
> > int cpu = raw_smp_processor_id();
>
> Thanks for solving that, and sorry that I missed it.
>
> The reason I didn’t encounter it before is that in my original patch I created
> a new compilation unit which only defined the alias.
>
> Since there might be additional problems (any “current” use in common.c is
> dangerous, even in included files), I think that while there may be additional
> solutions, defining the alias in a separate compilation unit - as I did before -
> is the safest.

What happens here can be illustrated with the following testcase:

--cut here--
int init_mm;

struct task_struct
{
int *active_mm;
};

struct task_struct init_task;

struct pcpu_hot
{
struct task_struct *current_task;
};

struct pcpu_hot pcpu_hot = { .current_task = &init_task };

extern const struct pcpu_hot __seg_gs const_pcpu_hot
__attribute__((alias("pcpu_hot")));

void foo (void)
{
struct task_struct *cur = const_pcpu_hot.current_task;

cur->active_mm = &init_mm;
}
--cut here--

gcc -O2 -S:

foo:
movq $init_mm, init_task(%rip)
ret

Here, gcc optimizes the access to generic address space, which is
allowed to, since *we set the alias to pcpu_hot*, which is in the
generic address space. The compiler doesn't care that we actually
want:

foo:
movq %gs:const_pcpu_hot(%rip), %rax
movq $init_mm, (%rax)

So yes, to prevent the optimization, we have to hide the alias in another TU.

BTW: Clang creates:

foo:
movq %gs:pcpu_hot(%rip), %rax
movq $init_mm, (%rax)
retq

It is a bit more conservative and retains the address space of the
aliasing symbol.

Looks like another case of underspecified functionality where both
compilers differ. Luckily, both DTRT when aliases are hidden in
another TU.

Uros.