Re: Kernel oops with 2.6.26, padlock and ipsec: probably problem with fpu state changes

From: Vegard Nossum
Date: Fri Aug 08 2008 - 14:31:40 EST


On Wed, Aug 6, 2008 at 11:21 PM, Suresh Siddha
<suresh.b.siddha@xxxxxxxxx> wrote:
> BTW, in one of your oops, I see:
>
> note: cron[1207] exited with preempt_count 268435459
>
> I smell some kind of stack corruption here which is corrupting
> thread_info (in the above case preempt_count in the thread_info).
>
> Similarly, if the status field (in thread_info) gets corrupted (setting
> TS_USEDFPU) without proper math state allocated (present in thread_struct),
> we can end up with an oops in __switch_to.
>
> But you seem to say that reverting the recent fpu patches makes the problem
> go away. Hmm, just wondering if your test kernel (with the fpu patches
> reverted) is stable enough and you don't see other oopses/issues?
>
> Recently Vegard also noticed some stack corruptions (in the network stack)
> leading to similar problems. Not sure if Vegard has root-caused his issue.
> Copying him for his comments.

I don't think this is the same problem. What I see is almost certainly
a problem with netpoll, netconsole, or the 8139too driver. I see a UDP
packet in a task_struct.

There is also the fact that reverting the fpu patches makes it go away
(for Wolfgang), while for the issue I am seeing, an oops in the FP code is
just one of several different corruptions (sometimes it happens in
other slabs).
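
For reference, the preempt_count value quoted above (268435459) is easier
to read once split into fields. The sketch below is only an illustration:
the masks and the PREEMPT_ACTIVE value are assumptions about the x86
layout in kernels of this era, and the helper name decode_preempt_count is
made up for this example. It shows how the number breaks down, not what
caused it.

```python
# Decode the preempt_count value reported in the quoted oops.
# ASSUMPTION: the field layout below (8 preempt bits, 8 softirq bits,
# PREEMPT_ACTIVE at bit 28) is taken to match x86 kernels around 2.6.26.
# Check include/linux/hardirq.h and the arch headers of the kernel in
# question before relying on it.

PREEMPT_MASK = 0x000000FF    # assumed: preemption nesting depth
SOFTIRQ_MASK = 0x0000FF00    # assumed: softirq nesting count
PREEMPT_ACTIVE = 0x10000000  # assumed: x86 value of the PREEMPT_ACTIVE flag


def decode_preempt_count(value):
    """Split a raw preempt_count into the assumed fields (hypothetical helper)."""
    return {
        "hex": hex(value),
        "preempt_depth": value & PREEMPT_MASK,
        "softirq_count": (value & SOFTIRQ_MASK) >> 8,
        "preempt_active": bool(value & PREEMPT_ACTIVE),
    }


if __name__ == "__main__":
    # Value taken from the "exited with preempt_count" line in the oops.
    print(decode_preempt_count(268435459))
```

Under those assumptions the value is 0x10000003, i.e. bit 28 set plus a
nesting depth of 3 in the low byte.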

(Sorry for the slightly late reply.)

Thanks.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036