Re: [patch 1/2] x86, fpu: split FPU state from task struct - v5

From: Suresh Siddha
Date: Tue Mar 11 2008 - 16:21:12 EST


On Tue, Mar 11, 2008 at 08:07:34AM +0300, Alexey Dobriyan wrote:
> On Mon, Mar 10, 2008 at 03:28:04PM -0700, Suresh Siddha wrote:
> > Split the FPU save area from the task struct. This allows easy migration
> > of FPU context, and it's generally cleaner. It also allows the following
> > two optimizations:
> >
> > 1) only allocate when the application actually uses FPU, so in the first
> > lazy FPU trap. This could save memory for non-fpu using apps. Next patch
> > does this lazy allocation.
> >
> > 2) allocate the right size for the actual cpu rather than 512 bytes always.
> > Patches enabling xsave/xrstor support (coming shortly) will take advantage
> > of this.
>
> Ugh, not seeing patch, but judging from description it will make
> "choose wrong CONFIG_M* and fxsave will corrupt random FPU state" situation
> likely?

No. CONFIG_M* doesn't determine the size of the state. Feature information from
the 'cpuid' instruction will dictate the size allocated/used. Anyhow, please
wait for the xsave patches.

>
> > --- linux-2.6-x86.orig/arch/x86/kernel/process_64.c
> > +++ linux-2.6-x86/arch/x86/kernel/process_64.c
> > @@ -634,7 +634,7 @@
> >
> > /* we're going to use this soon, after a few expensive things */
> > if (next_p->fpu_counter>5)
> > - prefetch(&next->i387.fxsave);
> > + prefetch(next->xstate);
>
> Can we please give it better name, like fpu_state? It's a member of
> task_struct after all.

It need not be only FPU. We can have non-math state here aswell.

selected 'xstate' for extended state. I am all open for any reasonable name,
reflecting math, extended math(fsave/fxsave/..) and future math/
non-math extensions.

> > {
> > unsigned long oldcr0 = read_cr0();
> > - extern void __bad_fxsave_alignment(void);
> > -
> > - if (offsetof(struct task_struct, thread.i387.fxsave) & 15)
> > - __bad_fxsave_alignment();
>
> I think removal of such checks needs giving necessary alignment to cache.
> Previously it worked because of __aligned((16)) and L1_CACHE_SHIFT
> combo.

alignment is now specified as part of kmem_cache_create() and checed
in the allocation routines.

thanks,
suresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/