Panic with bochs_drm module on qemu-system-sparc64

From: Mark Cave-Ayland
Date: Fri Jun 30 2017 - 00:54:49 EST


Hi all,

I'm one of the QEMU SPARC maintainers and I've been investigating why
enabling the fb console via the bochs_drm module causes a panic on
startup. The reproducer with QEMU 2.9 is easy:

$ ./qemu-system-sparc64 -m 512 -kernel rel-sparc/vmlinux \
    -append 'console=ttyS0' -serial stdio

This gives the following panic on the serial console:

[ 14.759388] [drm] Found bochs VGA, ID 0xb0c5.
[ 14.760018] [drm] Framebuffer size 16384 kB @ 0x1ff01000000, mmio @ 0x1ff02000000.
[ 14.763370] [TTM] Zone kernel: Available graphics memory: 252808 kiB
[ 14.764240] [TTM] Initializing pool allocator
[ 14.894178] Unable to handle kernel paging request at virtual address 000001ff01000000
[ 14.894247] tsk->{mm,active_mm}->context = 0000000000000000
[ 14.894308] tsk->{mm,active_mm}->pgd = fffff80000402000
[ 14.894372] \|/ ____ \|/
[ 14.894372] "@'/ .. \`@"
[ 14.894372] /_| \__/ |_\
[ 14.894372] \__U_/
[ 14.894435] swapper/0(1): Oops [#1]
[ 14.895400] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc7+ #45
[ 14.895634] task: fffff8001c097800 task.stack: fffff8001c09c000
[ 14.895722] TSTATE: 0000000080001607 TPC: 00000000006f3c24 TNPC: 00000000006f3c30 Y: 00000000 Not tainted
[ 14.896697] TPC: <sys_imageblit+0x1dc/0x438>
[ 14.896916] g0: 0000000000000080 g1: fffff8001c360000 g2: 0000000000000000 g3: 0000000000000007
[ 14.896976] g4: fffff8001c097800 g5: fffff8001ecea000 g6: fffff8001c09c000 g7: 000000000090dfc0
[ 14.897036] o0: 0000000000000026 o1: 0000000000000001 o2: 000000000000000f o3: 000001ff01000000
[ 14.897094] o4: 0000000000000007 o5: 0000000000000008 sp: fffff8001c09e0a1 ret_pc: 0000000000000001
[ 14.897551] RPC: <0x1>
[ 14.897715] l0: 0000000000000020 l1: 0000000000a43800 l2: 0000000000000080 l3: 0000000000000001
[ 14.897773] l4: 0000000000b93800 l5: 0000000000000080 l6: 0000000000000030 l7: 0000000000000000
[ 14.897830] i0: fffff8001c32c000 i1: fffff8001c360000 i2: 0000000000000000 i3: 0000000000000000
[ 14.897887] i4: 0000000000000001 i5: 000001ff01000000 i6: fffff8001c09e151 i7: 00000000007381d4
[ 14.897984] I7: <drm_fb_helper_sys_imageblit+0x14/0x34>
[ 14.898166] Call Trace:
[ 14.898442] [00000000007381d4] drm_fb_helper_sys_imageblit+0x14/0x34
[ 14.898561] [00000000006e6c9c] soft_cursor+0x174/0x19c
[ 14.898601] [00000000006e6744] bit_cursor+0x45c/0x490
[ 14.898641] [00000000006e324c] fbcon_cursor+0x16c/0x17c
[ 14.898685] [0000000000710f60] hide_cursor+0x2c/0xa8
[ 14.898724] [00000000007120fc] redraw_screen+0xc4/0x208
[ 14.898765] [00000000006e2328] fbcon_prepare_logo+0x288/0x358
[ 14.898803] [00000000006e27cc] fbcon_init+0x3d4/0x448
[ 14.898844] [00000000007113e4] visual_init+0xa4/0x100
[ 14.898884] [0000000000712ba0] do_bind_con_driver+0x1c8/0x300
[ 14.898925] [000000000071304c] do_take_over_console+0x170/0x198
[ 14.898965] [00000000006e28c4] do_fbcon_takeover+0x84/0xe8
[ 14.899017] [00000000004747dc] notifier_call_chain+0x38/0x74
[ 14.899061] [0000000000474a5c] __blocking_notifier_call_chain+0x28/0x44
[ 14.899104] [00000000006ec214] register_framebuffer+0x2b8/0x2ec
[ 14.899147] [00000000007399f0] drm_fb_helper_initial_config+0x2d0/0x36c
[ 14.899294] Disabling lock debugging due to kernel taint
[ 14.899551] Caller[00000000007381d4]: drm_fb_helper_sys_imageblit+0x14/0x34
[ 14.899656] Caller[00000000006e6c9c]: soft_cursor+0x174/0x19c
[ 14.899696] Caller[00000000006e6744]: bit_cursor+0x45c/0x490
[ 14.899735] Caller[00000000006e324c]: fbcon_cursor+0x16c/0x17c
[ 14.899774] Caller[0000000000710f60]: hide_cursor+0x2c/0xa8
[ 14.899812] Caller[00000000007120fc]: redraw_screen+0xc4/0x208
[ 14.899852] Caller[00000000006e2328]: fbcon_prepare_logo+0x288/0x358
[ 14.899891] Caller[00000000006e27cc]: fbcon_init+0x3d4/0x448
[ 14.899930] Caller[00000000007113e4]: visual_init+0xa4/0x100
[ 14.899970] Caller[0000000000712ba0]: do_bind_con_driver+0x1c8/0x300
[ 14.900018] Caller[000000000071304c]: do_take_over_console+0x170/0x198
[ 14.900061] Caller[00000000006e28c4]: do_fbcon_takeover+0x84/0xe8
[ 14.900132] Caller[00000000004747dc]: notifier_call_chain+0x38/0x74
[ 14.900218] Caller[0000000000474a5c]: __blocking_notifier_call_chain+0x28/0x44
[ 14.900263] Caller[00000000006ec214]: register_framebuffer+0x2b8/0x2ec
[ 14.900306] Caller[00000000007399f0]: drm_fb_helper_initial_config+0x2d0/0x36c
[ 14.900351] Caller[0000000000761e60]: bochs_fbdev_init+0x6c/0xb0
[ 14.900389] Caller[0000000000760b2c]: bochs_load+0x84/0xa8
[ 14.900439] Caller[0000000000741c88]: drm_dev_register+0x114/0x1e8
[ 14.900628] Caller[0000000000742a10]: drm_get_pci_dev+0xa8/0x118
[ 14.900672] Caller[00000000006c8364]: pci_device_probe+0x70/0xdc
[ 14.900713] Caller[0000000000768cf0]: driver_probe_device+0x148/0x2a4
[ 14.900752] Caller[0000000000768ec4]: __driver_attach+0x78/0xa8
[ 14.900790] Caller[00000000007674ec]: bus_for_each_dev+0x58/0x7c
[ 14.900830] Caller[00000000007683bc]: bus_add_driver+0xd0/0x1fc
[ 14.900869] Caller[0000000000769960]: driver_register+0xa8/0x100
[ 14.900913] Caller[0000000000426cb0]: do_one_initcall+0x80/0x10c
[ 14.900999] Caller[0000000000ad6bdc]: kernel_init_freeable+0x1a8/0x244
[ 14.901037] Caller[00000000008be94c]: kernel_init+0x4/0xfc
[ 14.901077] Caller[0000000000406064]: ret_from_fork+0x1c/0x2c

Looking at this in more detail, we can see that the panic occurs the
first time we touch the framebuffer memory: sys_imageblit(), called via
drm_fb_helper_sys_imageblit(), dereferences a pointer in order to write
to the mapped framebuffer.
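
As a quick cross-check of the log above, the faulting address can be
compared against the framebuffer range that the driver printed. This is
only a sketch: the constants are copied from the log, while the macro
names and the in_framebuffer() helper are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the log above. */
#define FB_BASE    0x1ff01000000ULL      /* "Framebuffer size 16384 kB @ 0x1ff01000000" */
#define FB_SIZE    (16384ULL * 1024)     /* 16384 kB expressed in bytes */
#define MMIO_BASE  0x1ff02000000ULL      /* "mmio @ 0x1ff02000000" */
#define FAULT_ADDR 0x000001ff01000000ULL /* "Unable to handle kernel paging request" */

/* In this log the MMIO window starts exactly where the framebuffer ends. */
static_assert(FB_BASE + FB_SIZE == MMIO_BASE, "mmio follows framebuffer");

/* Hypothetical helper: true if addr lies inside the framebuffer's range. */
static bool in_framebuffer(uint64_t addr)
{
    return addr >= FB_BASE && addr < FB_BASE + FB_SIZE;
}
```

The faulting address is the very first byte of the framebuffer, and the
same value shows up in registers o3 and i5 of the dump.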

The bochs_drm driver itself uses a standard approach to map the
framebuffer like this in drivers/gpu/drm/bochs/bochs_hw.c (shortened for
clarity):

	addr = pci_resource_start(pdev, 0);
	size = pci_resource_len(pdev, 0);
	...
	...
	bochs->fb_map = ioremap(addr, size);
	if (bochs->fb_map == NULL) {
		DRM_ERROR("Cannot map framebuffer\n");
		return -ENOMEM;
	}

The issue with SPARC64 systems is that the address returned by ioremap()
is actually a physical address, as per this comment in
arch/sparc/include/asm/io_64.h:

/* On sparc64 we have the whole physical IO address space accessible
 * using physically addressed loads and stores, so this does nothing.
 */
static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
{
	return (void __iomem *)offset;
}

This means that unless accesses to the mapped framebuffer go through
the standard readb/writeb/readw/writew/readl/writel accessors, which
force physical accesses that bypass the MMU, we end up accessing an
invalid, unmapped address.
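
To make the distinction concrete, here is a small userspace model (not
kernel code) of the idea. Everything prefixed model_ and the
io_cookie_t type are hypothetical names invented for this sketch; the
real accessors are implemented quite differently and also deal with
byte order, which this model ignores. The only point it illustrates is
that the value handed back by the ioremap() stand-in is a physical
address that only the accessors know how to reach.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a tiny "device" that lives at a physical address. */
#define MODEL_PHYS_BASE 0x1ff01000000ULL
#define MODEL_PHYS_SIZE 64u

typedef uint64_t io_cookie_t;   /* what model_ioremap() hands back */

static uint8_t model_device_memory[MODEL_PHYS_SIZE];

/* Like the sparc64 ioremap() quoted above: return the address unchanged. */
static io_cookie_t model_ioremap(uint64_t phys)
{
    return phys;
}

/* Translate a cookie into the backing store; only accessors do this. */
static uint8_t *model_translate(io_cookie_t addr, size_t len)
{
    assert(addr >= MODEL_PHYS_BASE);
    assert(addr - MODEL_PHYS_BASE + len <= MODEL_PHYS_SIZE);
    return &model_device_memory[addr - MODEL_PHYS_BASE];
}

/* Stand-in for writel(): store 32 bits at a physical address. */
static void model_writel(uint32_t value, io_cookie_t addr)
{
    memcpy(model_translate(addr, sizeof(value)), &value, sizeof(value));
}

/* Stand-in for readl(): load 32 bits from a physical address. */
static uint32_t model_readl(io_cookie_t addr)
{
    uint32_t value;

    memcpy(&value, model_translate(addr, sizeof(value)), sizeof(value));
    return value;
}
```

A caller would obtain fb = model_ioremap(MODEL_PHYS_BASE) and then use
model_writel(value, fb + offset). Casting fb to a C pointer and storing
through it has no meaning in this model, which mirrors what goes wrong
when code dereferences the ioremap() result directly on sparc64.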

And it's evident that the code in drivers/video/fbdev/core/sysimgblt.c
doesn't use these accessor functions at all but dereferences the mapped
framebuffer pointer directly, hence causing the panic.

So I can see there are two potential issues here:

1) sys_imageblit() shouldn't be accessing ioremap()ped memory by
dereferencing a pointer

2) sys_imageblit() requires a virtual address while
drm_fb_helper_sys_imageblit() incorrectly assumes that any ioremap()ped
address is always virtual and passes it directly through

Note for LKML people: the list is high volume for me and so I'm not
subscribed, so please CC me directly on any reply.


Many thanks,

Mark.