run_timer_softirq gpf. tracing?

From: Dave Jones
Date: Tue Mar 21 2017 - 14:19:16 EST


I just hit this while fuzzing..

general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc2-think+ #1
task: ffff88017f0ed440 task.stack: ffffc90000094000
RIP: 0010:run_timer_softirq+0x15f/0x700
RSP: 0018:ffff880507c03ec8 EFLAGS: 00010086
RAX: dead000000000200 RBX: ffff880507dd0d00 RCX: 0000000000000002
RDX: ffff880507c03ed0 RSI: 00000000ffffffff RDI: ffffffff8204b3a0
RBP: ffff880507c03f48 R08: ffff880507dd12d0 R09: ffff880507c03ed8
R10: ffff880507dd0db0 R11: 0000000000000000 R12: ffffffff8215cc38
R13: ffff880507c03ed0 R14: ffffffff82005188 R15: ffff8804b55491a8
FS: 0000000000000000(0000) GS:ffff880507c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000004 CR3: 0000000005011000 CR4: 00000000001406e0
Call Trace:
<IRQ>
? clockevents_program_event+0x47/0x120
__do_softirq+0xbf/0x5b1
irq_exit+0xb5/0xc0
smp_apic_timer_interrupt+0x3d/0x50
apic_timer_interrupt+0x97/0xa0
RIP: 0010:cpuidle_enter_state+0x12e/0x400
RSP: 0018:ffffc90000097e40 EFLAGS: 00000202
[CONT START] ORIG_RAX: ffffffffffffff10
RAX: ffff88017f0ed440 RBX: ffffe8ffffa03cc8 RCX: 0000000000000001
RDX: 20c49ba5e353f7cf RSI: 0000000000000001 RDI: ffff88017f0ed440
RBP: ffffc90000097e80 R08: 00000000ffffffff R09: 0000000000000008
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000005
R13: ffffffff820b9338 R14: 0000000000000005 R15: ffffffff820b9320
</IRQ>
cpuidle_enter+0x17/0x20
call_cpuidle+0x23/0x40
do_idle+0xfb/0x200
cpu_startup_entry+0x71/0x80
start_secondary+0x16a/0x210
start_cpu+0x14/0x14
Code: 8b 05 ce 1b ef 7e 83 f8 03 0f 87 4e 01 00 00 89 c0 49 0f a3 04 24 0f 82 0a 01 00 00 49 8b 07 49 8b 57 08 48 85 c0 48 89 02 74 04 <48> 89 50 08 41 f6 47 2a 20 49 c7 47 08 00 00 00 00 48 89 df 48


RAX looks like list poison, and CR2 = 4, which is likely the ->next of a list,
with a NULL pointer.

Here's the disassembly of timer.o
https://codemonkey.org.uk/junk/timer.o.txt

Sure enough, run_timer_softirq+0x15f (1ccf) is..


entry->next = LIST_POISON2;
1cc8: 48 b8 00 02 00 00 00 movabs $0xdead000000000200,%rax
1ccf: 00 ad de


This looks to be something to do with the tracing definition of timer_cancel.
I don't fully grok how the tracing macros work, is there something obvious
here before I dig deeper ?

Right before that list op, we do this..

if (timer->flags & TIMER_IRQSAFE) {

Which looks like a line from expire_timers. I'm not following how we get there
from that tracing definition.

No tracing was active when this happened.

Dave