Re: [PATCH 4/8] x86/fpu: Remove the thread::fpu pointer

From: Ingo Molnar
Date: Thu Apr 10 2025 - 06:15:44 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Wed, Apr 09, 2025 at 11:11:23PM +0200, Ingo Molnar wrote:
>
> > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > index 5ea7e5d2c4de..b7f7c9c83409 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -514,12 +514,9 @@ struct thread_struct {
> >
> > struct thread_shstk shstk;
> > #endif
> > -
> > - /* Floating point and extended processor state */
> > - struct fpu *fpu;
> > };
> >
> > -#define x86_task_fpu(task) ((task)->thread.fpu)
> > +#define x86_task_fpu(task) ((struct fpu *)((void *)(task) + sizeof(*(task))))
>
> Doesn't our FPU state need to be cacheline aligned?

Yeah, and we do have a check for that:

+ BUILD_BUG_ON(sizeof(*dst) % SMP_CACHE_BYTES != 0);

And task_struct is allocated cache-aligned, which means when we do this
in fpu_clone():

+ struct fpu *dst_fpu = (void *)dst + sizeof(*dst);

the FPU pointer is guaranteed to be cacheline aligned as well.

'dst' in that context is the new task_struct.
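
The alignment argument above can be checked with a small userspace sketch. Everything in it is an assumption for illustration: `demo_task`, `demo_fpu`, `demo_task_alloc()` and `CACHE_BYTES` are hypothetical stand-ins, not the kernel's real definitions, and `aligned_alloc()` stands in for the cache-aligned task_struct allocation. It only shows that a cache-aligned base plus a size that is a multiple of the cacheline size gives a cache-aligned trailing pointer:

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed cacheline size, standing in for SMP_CACHE_BYTES. */
#define CACHE_BYTES 64

/* Hypothetical stand-in for the FPU state that trails the task. */
struct demo_fpu {
	unsigned char state[512];
};

/*
 * Hypothetical stand-in for task_struct. The aligned attribute pads
 * sizeof() up to a multiple of CACHE_BYTES.
 */
struct demo_task {
	long pid;
	char comm[16];
} __attribute__((aligned(CACHE_BYTES)));

/* Same idea as the BUILD_BUG_ON() quoted above: fail the build otherwise. */
_Static_assert(sizeof(struct demo_task) % CACHE_BYTES == 0,
	       "demo_task size must be a multiple of the cacheline size");

/* Mirrors the shape of the new x86_task_fpu(): state sits right after the task. */
#define demo_task_fpu(task) \
	((struct demo_fpu *)((char *)(task) + sizeof(*(task))))

/* Allocate the task and its trailing FPU area as one cache-aligned block. */
static struct demo_task *demo_task_alloc(void)
{
	size_t size = sizeof(struct demo_task) + sizeof(struct demo_fpu);

	return aligned_alloc(CACHE_BYTES, size);
}

/* Returns nonzero if p sits on a cacheline boundary. */
static int demo_is_cache_aligned(const void *p)
{
	return ((uintptr_t)p % CACHE_BYTES) == 0;
}
```

With these definitions, demo_is_cache_aligned(demo_task_fpu(t)) holds for any t returned by demo_task_alloc(), which is the same reasoning as for 'dst' in fpu_clone().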

BTW, in a previous discussion Oleg suggested that we replace the
task->thread.fpu pointer with a build-time calculation - but I'm still
not sure it's a good idea.

Thanks,

Ingo