Re: [GIT PULL] x86 fixes

From: Ingo Molnar
Date: Wed Aug 19 2015 - 01:59:41 EST



* H. Peter Anvin <hpa@xxxxxxxxx> wrote:

> On 08/17/2015 10:17 AM, Linus Torvalds wrote:
> > On Mon, Aug 17, 2015 at 9:58 AM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
> >>
> >> That is not true. It *does* work, and I have tested it fairly recently.
> >
> > Ok, so it's not too badly broken. Good.
> >
> > Also, while it's been a long time since we needed FPU emulation on the
> > i486sx, I don't recall the details of any of the (much more modern)
> > IoT small cores. I *think* the base platforms are all at a Pentium
> > level (ie not just FPU, but MMX), but maybe there's some reason to
> > keep FP emulation alive for some platforms.
> >
>
> I just went back and looked at my records... I can guarantee that it
> worked as of 743aa456c1834f76982af44e8b71d1a0b2a82e21.

So I went and built 743aa456c1834f76 with ARCH=i386 defconfig + MATH_EMULATION=y
and booted it on real hardware, with and without 'no387':

- 743aa456c1834f76: boots fine to a generic distro
- 743aa456c1834f76 + no387: early crash

The early crash is similar to one I saw (and fixed) while doing the recent FPU
changes:

[ 0.000000] Linux version 3.7.0+ (mingo@fomalhaut) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC) ) #284 SMP Wed Aug 19 07:51:05 CEST 2015
...
[ 0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] __ex_table already sorted, skipping sort
[ 0.000000] Initializing CPU#0
[ 0.000000] BUG: unable to handle kernel NULL pointer dereference at (null)
...
[ 0.000000] EIP is at kmem_cache_alloc+0x25/0x110
[ 0.000000] Call Trace:
[ 0.000000] [<c048a1fc>] ? sprintf+0x1c/0x20
[ 0.000000] [<c020a11f>] ? init_fpu+0x7f/0xc0
[ 0.000000] [<c0235aef>] ? print_prefix+0xcf/0x130
[ 0.000000] [<c08b0d20>] ? do_debug+0x160/0x160
[ 0.000000] [<c020a11f>] init_fpu+0x7f/0xc0
[ 0.000000] [<c0730df5>] math_emulate+0x6b5/0xc90
[ 0.000000] [<c08b0d58>] do_device_not_available+0x38/0x60
[ 0.000000] [<c08b0752>] error_code+0x5a/0x60
[ 0.000000] [<c08b0d20>] ? do_debug+0x160/0x160
[ 0.000000] [<c08a29e7>] ? fpu_init+0xd9/0xf7
[ 0.000000] [<c08a475e>] cpu_init+0x237/0x23f
[ 0.000000] [<c0b324f2>] trap_init+0x243/0x24b
[ 0.000000] [<c0b30759>] start_kernel+0x143/0x2d4
[ 0.000000] [<c0b302a0>] i386_start_kernel+0x76/0x7b

And I think this early crash bug was introduced 7+ years ago in March 2008, when
the FPU context area was separated from the task struct and its allocation went
dynamic:

61c4628b5386 ("x86, fpu: split FPU state from task struct - v5")

... but that's just a guess: I couldn't check it, as kernels that far back no
longer build and boot with modern tooling.
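
To make the suspected failure mode concrete, here is a minimal user-space
sketch of the pattern the call trace above suggests: an FPU context that is
allocated lazily from a cache, with the first FPU trap arriving before that
cache exists. This is not kernel code. Every name in it (struct cache,
fpu_state_cache, cache_setup(), fpu_state_init()) is hypothetical, and the
explicit NULL check is an assumption about what a guard could look like, not a
description of any actual fix.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocator cache; NULL until cache_setup() runs. */
struct cache {
    size_t obj_size;
};

static struct cache *fpu_state_cache = NULL;

/* Hypothetical task whose FPU context lives outside the task structure. */
struct task {
    void *fpu_state;
};

/* Models the point in boot where the allocator cache becomes available. */
static void cache_setup(size_t obj_size)
{
    fpu_state_cache = malloc(sizeof(*fpu_state_cache));
    if (fpu_state_cache)
        fpu_state_cache->obj_size = obj_size;
}

/*
 * Lazily allocate the FPU context on first use.
 * Returns 0 on success, -1 if the context cannot be allocated (yet).
 */
static int fpu_state_init(struct task *t)
{
    if (t->fpu_state)
        return 0;               /* already allocated */

    /*
     * Assumed guard: without this check, an emulation trap taken before
     * cache_setup() would dereference a NULL cache pointer.
     */
    if (!fpu_state_cache)
        return -1;

    t->fpu_state = calloc(1, fpu_state_cache->obj_size);
    return t->fpu_state ? 0 : -1;
}
```

In this sketch, calling fpu_state_init() before cache_setup() returns -1
instead of faulting, and the same call succeeds once the cache has been set up.
With a context embedded statically in the task structure, there would be no
ordering dependency between the first trap and the allocator at all.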

Thanks,

Ingo