[PATCH] x86/fpu: Clarify FPU context cacheline alignment
From: Ingo Molnar
Date: Thu Apr 10 2025 - 06:55:13 EST
* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, Apr 10, 2025 at 12:10:56PM +0200, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > > On Wed, Apr 09, 2025 at 11:11:23PM +0200, Ingo Molnar wrote:
> > >
> > > > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > > > index 5ea7e5d2c4de..b7f7c9c83409 100644
> > > > --- a/arch/x86/include/asm/processor.h
> > > > +++ b/arch/x86/include/asm/processor.h
> > > > @@ -514,12 +514,9 @@ struct thread_struct {
> > > >
> > > > struct thread_shstk shstk;
> > > > #endif
> > > > -
> > > > - /* Floating point and extended processor state */
> > > > - struct fpu *fpu;
> > > > };
> > > >
> > > > -#define x86_task_fpu(task) ((task)->thread.fpu)
> > > > +#define x86_task_fpu(task) ((struct fpu *)((void *)(task) + sizeof(*(task))))
> > >
> > > Doesn't our FPU state need to be cacheline aligned?
> >
> > Yeah, and we do have a check for that:
> >
> > + BUILD_BUG_ON(sizeof(*dst) % SMP_CACHE_BYTES != 0);
>
> Ah, missed that. Clearly I need to improve my reading skillz :-)
Admittedly it's written a bit obtusely - how about the patch below?
Thanks,
Ingo
============================>
From: Ingo Molnar <mingo@xxxxxxxxxx>
Date: Thu, 10 Apr 2025 12:52:16 +0200
Subject: [PATCH] x86/fpu: Clarify FPU context cacheline alignment
Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Fenghua Yu <fenghua.yu@xxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Uros Bizjak <ubizjak@xxxxxxxxx>
---
arch/x86/kernel/fpu/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d0a45f6492cb..3a19877a314e 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -607,7 +607,8 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal,
* We allocate the new FPU structure right after the end of the task struct.
* task allocation size already took this into account.
*
- * This is safe because task_struct size is a multiple of cacheline size.
+ * This is safe because task_struct size is a multiple of cacheline size,
+ * thus x86_task_fpu() will always be cacheline aligned as well.
*/
struct fpu *dst_fpu = (void *)dst + sizeof(*dst);
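
For illustration only, the alignment argument in the comment above can be sketched in userspace C. This is not kernel code: fake_task, fake_fpu, fake_task_fpu(), fake_fpu_is_aligned() and FAKE_CACHE_BYTES are hypothetical stand-ins, a 64-byte cacheline is assumed, and aligned_alloc() stands in for the task allocator on the assumption that it returns a cacheline-aligned task_struct. Under those assumptions, a struct whose size is a multiple of the cacheline size puts whatever follows it on a cacheline boundary too:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 64-byte cachelines; the kernel uses SMP_CACHE_BYTES instead. */
#define FAKE_CACHE_BYTES 64

/*
 * Hypothetical stand-in for task_struct. The aligned attribute pads
 * sizeof() up to a multiple of FAKE_CACHE_BYTES (200 -> 256 bytes here).
 */
struct fake_task {
	char data[200];
} __attribute__((aligned(FAKE_CACHE_BYTES)));

/* Hypothetical stand-in for struct fpu. */
struct fake_fpu {
	char state[512];
};

/* Same idea as the BUILD_BUG_ON() in fpu_clone(): fail the build otherwise. */
_Static_assert(sizeof(struct fake_task) % FAKE_CACHE_BYTES == 0,
	       "fake_task size must be a multiple of the cacheline size");

/* Mirrors the x86_task_fpu() arithmetic: the FPU area starts right after the task. */
static struct fake_fpu *fake_task_fpu(struct fake_task *task)
{
	assert(task != NULL);
	return (struct fake_fpu *)((char *)task + sizeof(*task));
}

/* Returns 1 if the trailing FPU area is cacheline aligned, 0 if not, -1 on allocation failure. */
static int fake_fpu_is_aligned(void)
{
	struct fake_task *task;
	int aligned;

	/* One allocation covers both structures, like the task allocation does. */
	task = aligned_alloc(FAKE_CACHE_BYTES,
			     sizeof(struct fake_task) + sizeof(struct fake_fpu));
	if (!task)
		return -1;

	aligned = ((uintptr_t)fake_task_fpu(task) % FAKE_CACHE_BYTES) == 0;
	free(task);
	return aligned;
}
```

Calling fake_fpu_is_aligned() from a small main() should return 1 whenever the allocation succeeds. The sketch also makes the dependency visible: the size check alone is not enough, the base allocation has to be cacheline aligned as well.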