Re: run_timer_softirq gpf. tracing?

From: Dave Jones
Date: Tue Mar 21 2017 - 15:45:25 EST


On Tue, Mar 21, 2017 at 08:25:39PM +0100, Thomas Gleixner wrote:

> > RAX looks like list poison, and CR2 = 4, which is likely the ->next of a list,
> > with a NULL pointer.
>
> Certainly not on 64 bit. That would be 8. And CR2 is irrelevant here
> because that's a #GP, not a #PF.

doh!

> The timer which expires has timer->entry.next == POISON2 !
>
> This has nothing to do with tracing, it's a classic list corruption. The
> bad news is that there is no trace of the culprit because that happens when
> some other timer expires after some random amount of time.

ah! Thanks for putting me back on the right path.

> If that is reproducible, then please enable debugobjects. That should
> pinpoint the culprit.

hit it twice today so far, so hopefully it'll reproduce.
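
For anyone following along, a sketch of the config fragment that debugobjects for timers would likely need. The option names are real Kconfig symbols, but treating this set as sufficient for this particular bug is an assumption.

```
# Assumed minimal set for timer debugobjects; adjust to taste.
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
```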

thanks.

Dave