Hi Stephen,
You may want to CC intel-gfx@xxxxxxxxxxxxxxxxxxxxx for i915 issues (even
if you are not subscribed, in which case your mail will wait for a
moderator to let it go through).
In case of Intel GPU hangs you should at least include
/sys/kernel/debug/dri/0/i915_error_state, probably by attaching it to a
bug report on bugs.freedesktop.org because of its size.
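A minimal sketch of saving the error state in compressed form for
attaching to a report. The two paths are the ones named in the kernel
messages quoted below; the function name and output filename are my own
and purely illustrative. Reading the debugfs file usually needs root and
a mounted debugfs.

```python
import gzip
import shutil
from pathlib import Path

# Both locations are taken from the kernel messages quoted in this thread.
CANDIDATES = (
    "/sys/kernel/debug/dri/0/i915_error_state",  # named by the 3.11 message
    "/sys/class/drm/card0/error",                # named by the 3.12 message
)


def save_error_state(dest, candidates=CANDIDATES):
    """Gzip the first readable candidate into dest.

    Returns the path that was read, or None if none could be opened.
    The name and signature are hypothetical, not part of any tool.
    """
    for name in candidates:
        try:
            # The source is opened first, so dest is not created
            # when the source file is missing or unreadable.
            with Path(name).open("rb") as fin, gzip.open(dest, "wb") as fout:
                shutil.copyfileobj(fin, fout)
        except OSError:
            continue  # missing file or no permission: try the next one
        return name
    return None
```

Calling save_error_state("i915_error_state.gz") right after a hang, before
rebooting, should leave a file small enough to attach.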
If you have any indication of what triggers the hang, please add it!
Bruno
On Sun, 17 November 2013 Stephen Clark <sclark46@xxxxxxxxxxxxx> wrote:
Hi List,
I am getting this in kernel 3.11 x86_64:
Nov 17 18:56:19 joker4 kernel: [drm:i915_hangcheck_elapsed] *ERROR* stuck on
render ring
Nov 17 18:56:19 joker4 kernel: [drm] capturing error event; look for more
information in /sys/kernel/debug/dri/0/i915_error_state
Nov 17 18:56:19 joker4 kernel: swapper/1: page allocation failure: order:6,
mode:0x200020
Nov 17 18:56:19 joker4 kernel: CPU: 1 PID: 0 Comm: swapper/1 Not tainted
3.11.6-1.el6.elrepo.x86_64 #1
Nov 17 18:56:19 joker4 kernel: Hardware name: To Be Filled By O.E.M. Z96F/Z96F,
BIOS 080012 08/29/2006
Nov 17 18:56:19 joker4 kernel: 0000000000000006 ffff8800b73038e0
ffffffff815f7f89 0000000000000010
Nov 17 18:56:19 joker4 kernel: 0000000000200020 ffff8800b7303970
ffffffff8114243d ffff8800b778ab28
Nov 17 18:56:19 joker4 kernel: 0000003000000001 ffff8800b7789000
0000000000000000 0000000600000002
Nov 17 18:56:19 joker4 kernel: Call Trace:
Nov 17 18:56:19 joker4 kernel:<IRQ> [<ffffffff815f7f89>] dump_stack+0x49/0x60
Nov 17 18:56:19 joker4 kernel: [<ffffffff8114243d>] warn_alloc_failed+0xfd/0x160
Nov 17 18:56:19 joker4 kernel: [<ffffffff8114e98c>] ? wakeup_kswapd+0x10c/0x140
Nov 17 18:56:19 joker4 kernel: [<ffffffff811455ae>]
__alloc_pages_slowpath+0x4ae/0x7c0
Nov 17 18:56:19 joker4 kernel: [<ffffffff81142d9d>] ?
get_page_from_freelist+0x2dd/0x710
Nov 17 18:56:19 joker4 kernel: [<ffffffff81145bce>]
__alloc_pages_nodemask+0x30e/0x330
Nov 17 18:56:19 joker4 kernel: [<ffffffff8118c437>] kmem_getpages+0x67/0x1e0
Nov 17 18:56:19 joker4 kernel: [<ffffffff8118dea9>] fallback_alloc+0x189/0x270
Nov 17 18:56:19 joker4 kernel: [<ffffffff8118dc55>] ____cache_alloc_node+0x95/0x160
Nov 17 18:56:19 joker4 kernel: [<ffffffff8118e9b7>] __kmalloc+0x177/0x2c0
Nov 17 18:56:19 joker4 kernel: [<ffffffffa0044a29>] ?
i915_capture_error_state+0x379/0x720 [i915]
Nov 17 18:56:19 joker4 kernel: [<ffffffffa0044a29>]
i915_capture_error_state+0x379/0x720 [i915]
Nov 17 18:56:19 joker4 kernel: [<ffffffffa0044dfb>] i915_handle_error+0x2b/0x80
[i915]
Nov 17 18:56:19 joker4 kernel: [<ffffffffa004511e>]
i915_hangcheck_elapsed+0x2ce/0x350 [i915]
Nov 17 18:56:19 joker4 kernel: [<ffffffff8101b019>] ? sched_clock+0x9/0x10
Nov 17 18:56:19 joker4 kernel: [<ffffffff8109d905>] ? sched_clock_local+0x25/0x90
Nov 17 18:56:19 joker4 kernel: [<ffffffff814711f0>] ? usb_add_hcd+0x3d0/0x3d0
Nov 17 18:56:19 joker4 kernel: [<ffffffffa0044e50>] ?
i915_handle_error+0x80/0x80 [i915]
Nov 17 18:56:19 joker4 kernel: [<ffffffff81073b19>] call_timer_fn+0x49/0x120
Nov 17 18:56:19 joker4 kernel: [<ffffffff8107470b>] run_timer_softirq+0x23b/0x2a0
Nov 17 18:56:19 joker4 kernel: [<ffffffff812b2660>] ? timerqueue_add+0x60/0xb0
Nov 17 18:56:19 joker4 kernel: [<ffffffffa0044e50>] ?
i915_handle_error+0x80/0x80 [i915]
Nov 17 18:56:19 joker4 kernel: [<ffffffff8106c147>] __do_softirq+0xf7/0x270
Nov 17 18:56:19 joker4 kernel: [<ffffffff8108e0c3>] ? hrtimer_interrupt+0x163/0x260
Nov 17 18:56:19 joker4 kernel: [<ffffffff81606adc>] call_softirq+0x1c/0x30
Nov 17 18:56:19 joker4 kernel: [<ffffffff81015885>] do_softirq+0x65/0xa0
Nov 17 18:56:19 joker4 kernel: [<ffffffff8106be75>] irq_exit+0xc5/0xd0
Nov 17 18:56:19 joker4 kernel: [<ffffffff8160757a>]
smp_apic_timer_interrupt+0x4a/0x5a
Nov 17 18:56:19 joker4 kernel: [<ffffffff81605e1d>] apic_timer_interrupt+0x6d/0x80
Nov 17 18:56:19 joker4 kernel:<EOI> [<ffffffff810bb1aa>] ?
cpu_idle_loop+0x10a/0x210
Nov 17 18:56:19 joker4 kernel: [<ffffffff810bb17c>] ? cpu_idle_loop+0xdc/0x210
Nov 17 18:56:19 joker4 kernel: [<ffffffff810bb320>] cpu_startup_entry+0x70/0x80
Nov 17 18:56:19 joker4 kernel: [<ffffffff810437bd>] start_secondary+0xcd/0xd0
Nov 17 18:56:19 joker4 kernel: SLAB: Unable to allocate memory on node 0 (gfp=0x20)
Nov 17 18:56:19 joker4 kernel: cache: kmalloc-262144, object size: 262144, order: 6
Nov 17 18:56:19 joker4 kernel: node 0: slabs: 0/0, objs: 0/0, free: 0
Nov 17 18:56:19 joker4 kernel: [drm:i915_set_reset_status] *ERROR* render ring
hung inside bo (0x85c000 ctx 0) at 0x85c97c
Is this fixed in 3.12?
I just checked: I get the same thing in 3.12, but with no traceback.
Nov 17 19:41:33 joker4 kernel: [drm] stuck on render ring
Nov 17 19:41:33 joker4 kernel: [drm] capturing error event; look for more
information in /sys/class/drm/card0/error
Nov 17 19:41:33 joker4 kernel: [drm:i915_set_reset_status] *ERROR* render ring
hung inside bo (0x7214000 ctx 0) at 0x72142e0
Nov 17 19:41:33 joker4 kernel: [drm:i915_reset] *ERROR* Failed to reset chip.
Thanks,
Steve