Re: BUG: Global FPU corruption in 2.2

From: Linus Torvalds (torvalds@transmeta.com)
Date: Tue Apr 24 2001 - 11:24:20 EST


[ Alan, I'm lazy and only have 2.2.14 sources on-line. Maybe this has
  been fixed already and there's something else going on. Worth a look ]

In article <cpxu23etpmc.fsf@goat.cs.wisc.edu>,
Victor Zandy <zandy@cs.wisc.edu> wrote:
>
>Someone else here traced the process flags of a FP-intensive program
>on a machine before and after it is put in the faulty FPU state. He
>periodically sampled /proc/pid/stat while the program was running.
>
>He found that PF_USEDFPU was always set before the machine was broken.
>After he found that it was set about 70% of the time.

[ Looks closer at the ptrace synchronization ]

Ahh.. This actually _does_ look like a race on "current->flags":
PTRACE_ATTACH will do a

        child->flags |= PF_PTRACED;

without waiting for the child to have stopped.

(Aside: thinking more about the stopping logic - I'm not actually sure
the ptrace synchronization is complete wrt scheduling, as there will be
a window when the process has set the task state to TASK_STOPPED but
hasn't actually yet scheduled away. Oh, well).

All other ptrace operations (not counting killing the child) will check
that the child is quiescent. But PTRACE_ATTACH will not, as we're just
setting up the stopping.

In 2.4.x, this bug doesn't happen because "flags" was split up into
"current->ptrace" and "current->flags". Exactly because of locking
concerns.

                        Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Apr 30 2001 - 21:00:12 EST