Re: BUG: Global FPU corruption in 2.2

From: David Konerding (dek_ml@konerding.com)
Date: Sun Apr 22 2001 - 13:39:27 EST


Ulrich Drepper wrote:

> "Richard B. Johnson" <root@chaos.analogic.com> writes:
>
> > The kernel doesn't know if a process is going to use the FPU when
> > a new process is created. Only the user's code, i.e., the 'C' runtime
> > library knows.
>
> Maybe you should try to understand the kernel code and the features of
> the processor first. The kernel can detect when the FPU is used for
> the first time.

OK, regardless of how the Linux kernel actually manages the FPU for user-space
programs, does anybody have any comments on the original bug report?

>We have found that one of our programs can cause system-wide
>corruption of the x86 FPU under 2.2.16 and 2.2.17. That is, after we
>run this program, the FPU gives bad results to all subsequent
>processes.

>We see this problem on dual 550MHz Xeons with 1GB RAM. We have 64 of
>these things, and we see the problem on every node we try (dozens).
>We don't have other SMPs handy. Uniprocessors, including other PIIIs,
>don't seem to be affected.

>Below are two programs we use to produce the behavior. The first
>program, pi, repeatedly spawns 10 parallel computations of pi. When
>all is well, each process prints pi as it completes.
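
For readers without the original attachment, here is a minimal sketch of
what a program like pi might look like. It is not the reporters' code: the
names compute_pi and spawn_round, the Leibniz series, the term count and the
tolerance are all assumptions made for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define PI_REF 3.14159265358979

/* Hypothetical stand-in for the reporters' computation: approximate pi
 * with the Leibniz series so that each process exercises the FPU. */
double compute_pi(long terms)
{
    double sum = 0.0;
    double sign = 1.0;
    long k;

    for (k = 0; k < terms; k++) {
        sum += sign / (2.0 * (double)k + 1.0);
        sign = -sign;
    }
    return 4.0 * sum;
}

/* Fork nprocs children that each compute and print pi in parallel.
 * Returns how many children produced a value close to PI_REF. */
int spawn_round(int nprocs, long terms)
{
    int started = 0;
    int good = 0;
    int i;

    fflush(stdout);                 /* avoid duplicated buffered output */
    for (i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;
        }
        if (pid == 0) {             /* child: compute, print, report */
            double pi = compute_pi(terms);
            double err = pi - PI_REF;
            if (err < 0)
                err = -err;
            printf("%d: pi = %.10f\n", (int)getpid(), pi);
            fflush(stdout);
            _exit(err < 1e-5 ? 0 : 1);
        }
        started++;
    }
    for (i = 0; i < started; i++) { /* parent: collect exit statuses */
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            good++;
    }
    return good;
}
```

A driver would call spawn_round(10, 1000000) in a loop; on a healthy machine
every round should report 10 good children.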

>The second program, pt, repeatedly attaches to and detaches from
>another process. Run pt against the root pi process until the output
>of pi begins to look wrong. Then kill everything and run pi by itself
>again. It will no longer produce good results. We find that the FPU
>persistently gives bad results until we reboot.
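
The attach/detach loop described for pt could be sketched roughly as
follows. Again, this is an illustration and not the reporters' program: the
name attach_detach and the rounds parameter are hypothetical, and the sketch
assumes the caller is permitted to ptrace the target.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Attach to pid and detach again, up to `rounds` times.
 * Returns the number of completed attach/detach cycles. */
int attach_detach(pid_t pid, int rounds)
{
    int done = 0;
    int i;

    for (i = 0; i < rounds; i++) {
        int status;

        if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
            perror("PTRACE_ATTACH");
            break;
        }
        /* Attaching sends SIGSTOP to the target; wait until it has
         * actually stopped before touching it again. */
        if (waitpid(pid, &status, 0) == -1) {
            perror("waitpid");
            break;
        }
        if (!WIFSTOPPED(status)) {
            fprintf(stderr, "target %d did not stop\n", (int)pid);
            break;
        }
        /* Detach and let the target continue. Simplification: no
         * signal is forwarded to the target on detach. */
        if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1) {
            perror("PTRACE_DETACH");
            break;
        }
        done++;
    }
    return done;
}
```

To follow the report, one would call attach_detach() on the PID of the root
pi process with a large rounds value while pi is printing results.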

I tried this on my dual PIII-600 running 2.2.19 and got exactly the behavior
described. If it is a bug in the Linux kernel (I can see nothing wrong with the
source code provided), I would suspect problems with SMP and ptrace, somehow
causing the wrong FP registers to be returned to a process after the scheduler
restarted it. It's very interesting that the pi program works fine until you
run pt, but after you run pt, pi is screwed until reboot.




This archive was generated by hypermail 2b29 : Mon Apr 23 2001 - 21:00:43 EST