Re: commit 444d13ff10f introduced boot failure on s390x

From: Jessica Yu
Date: Mon Aug 15 2016 - 15:13:02 EST


+++ Jessica Yu [10/08/16 18:58 -0400]:
+++ Eryu Guan [10/08/16 23:21 +0800]:
Hi,

I hit a boot failure on an s390x host starting with the 4.8-rc1 kernel;
the 4.7 kernel works fine. I bisected it to commit 444d13ff10fb:

commit 444d13ff10fb13bc3e64859c3cf9ce43dcfeb075
Author: Jessica Yu <jeyu@xxxxxxxxxx>
Date: Wed Jul 27 12:06:21 2016 +0930

modules: add ro_after_init support

Add ro_after_init support for modules by adding a new page-aligned section
in the module layout (after rodata) for ro_after_init data and enabling RO
protection for that section after module init runs.

Signed-off-by: Jessica Yu <jeyu@xxxxxxxxxx>
Acked-by: Kees Cook <keescook@xxxxxxxxxxxx>
Signed-off-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>

I've only hit this panic on s390x hosts. The console log is appended at
the end of this email.

Thanks,
Eryu

Hi Eryu, thanks for reporting this. It's a bit difficult to tell from
the stacktrace alone what's really going on, so I'll attempt to
reproduce this on a 4.8-rc1 kernel once I get my hands on an s390x
system and report back.

[ CC'ing Heiko and Martin ]

So this panic is related to some recent changes to set_memory_{ro,rw}
on s390x; see commit e8a97e42 "s390/pageattr: allow kernel page table
splitting". The new implementation of set_memory_{ro,rw} on s390 does
not handle the case where numpages is 0.

Recall the general layout of a module:
[text] [rodata] [ro-after-init] [writable data]

Normally a module's ro_after_init section sits between rodata and
writable data. When a module doesn't have a ro_after_init section,
set_memory_ro gets called with the first page-aligned address after
rodata, but with numpages = 0. Because the s390 set_memory_ro does not
handle numpages == 0, it walks the page table anyway and marks a
normally writable page read-only. Adding a simple numpages == 0 check
to set_memory_{ro,rw} that returns early fixes the panic.
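
To illustrate the failure mode, here is a minimal user-space sketch in
C. It is not the s390 code: every name in it (sketch_layout,
sketch_pgtable, sketch_set_ro_unguarded, sketch_set_ro_guarded,
sketch_protect_ro_after_init) is hypothetical, the page table is
modelled as a flat array of flags, and the do/while walk is only an
assumed stand-in for an implementation that touches the first page
before looking at the page count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_NPAGES     8

/* Hypothetical stand-in for a page table: one "writable" flag per page. */
struct sketch_pgtable {
	bool writable[SKETCH_NPAGES];
};

/*
 * Hypothetical model of the layout [text][rodata][ro-after-init][data].
 * Each field is the page-aligned offset at which that region ends.
 */
struct sketch_layout {
	unsigned long text_size;
	unsigned long ro_size;
	unsigned long ro_after_init_size;
	unsigned long size;
};

typedef int (*sketch_set_memory_fn)(struct sketch_pgtable *pt,
				    unsigned long addr, int numpages);

/* Start from a state in which every page is writable. */
static void sketch_init(struct sketch_pgtable *pt)
{
	for (size_t i = 0; i < SKETCH_NPAGES; i++)
		pt->writable[i] = true;
}

/*
 * Assumed model of the bug: the walk modifies the first page before it
 * checks the count, so numpages == 0 still changes one page.
 */
static int sketch_set_ro_unguarded(struct sketch_pgtable *pt,
				   unsigned long addr, int numpages)
{
	size_t idx = addr >> SKETCH_PAGE_SHIFT;

	do {
		assert(idx < SKETCH_NPAGES);
		pt->writable[idx++] = false;
	} while (--numpages > 0);
	return 0;
}

/* Same walk, with the early return for an empty range. */
static int sketch_set_ro_guarded(struct sketch_pgtable *pt,
				 unsigned long addr, int numpages)
{
	if (numpages == 0)
		return 0;
	return sketch_set_ro_unguarded(pt, addr, numpages);
}

/*
 * Protect the ro-after-init region. When the module has no such data,
 * ro_after_init_size == ro_size and the computed page count is 0.
 */
static void sketch_protect_ro_after_init(struct sketch_pgtable *pt,
					 const struct sketch_layout *l,
					 sketch_set_memory_fn set_ro)
{
	set_ro(pt, l->ro_size,
	       (int)((l->ro_after_init_size - l->ro_size) >> SKETCH_PAGE_SHIFT));
}
```

With a layout in which ro_after_init_size equals ro_size, passing
sketch_set_ro_unguarded flips the first page of writable data to
read-only, while sketch_set_ro_guarded leaves every page untouched.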

Jessica

[ 2.050197] device-mapper: uevent: version 1.0.3
[ 2.050370] device-mapper: ioctl: 4.34.0-ioctl (2015-10-28) initialised: dm-devel@xxxxxxxxxx
[ 2.057615] Unable to handle kernel pointer dereference in virtual kernel address space
[ 2.057619] Failing address: 000003ff8001d000 TEID: 000003ff8001d407
[ 2.057620] Fault in home space mode while using kernel ASCE.
[ 2.057622] AS:0000000000a7c007 R3:000000007c974007 S:000000007cc24800 P:000000000239b21d
[ 2.057665] Oops: 0004 ilc:3 [#1] SMP
[ 2.057667] Modules linked in: dm_mod
[ 2.057670] CPU: 0 PID: 399 Comm: modprobe Not tainted 4.7.0+ #7
[ 2.057672] Hardware name: IBM 2827 H43 400 (z/VM)
[ 2.057673] task: 000000007cccd100 ti: 0000000002324000 task.ti: 0000000002324000
[ 2.057675] Krnl PSW : 0704c00180000000 000000000043a5c8 (__list_add_rcu+0x50/0xa8)
[ 2.057683] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000006 000000000098b278 000003ff80117208 000000000098b278
[ 2.057685] 000003ff8001df08 00000000001c913c 0000000000000002 000000008001e55c
[ 2.057686] 0000000000000560 0000000002327e00 000003ff80117218 000003ff80117208
[ 2.057687] 000000000098b278 000003ff8001df08 0000000002327cc0 0000000002327c90
[ 2.057696] Krnl Code: 000000000043a5b6: e3d0b0000024 stg %r13,0(%r11)
000000000043a5bc: e3c0b0080024 stg %r12,8(%r11)
#000000000043a5c2: e3b0c0000024 stg %r11,0(%r12)
>000000000043a5c8: e3b0d0080024 stg %r11,8(%r13)
000000000043a5ce: e340f0b80004 lg %r4,184(%r15)
000000000043a5d4: ebbff0a00004 lmg %r11,%r15,160(%r15)
000000000043a5da: 07f4 bcr 15,%r4
000000000043a5dc: e34040080004 lg %r4,8(%r4)
[ 2.057706] Call Trace:
[ 2.057708] ([<0000000002327cc0>] 0x2327cc0)
[ 2.057714] ([<00000000001c98d0>] load_module+0x8e0/0x1870)
[ 2.057715] ([<00000000001caa74>] SyS_finit_module+0xb4/0xf0)
[ 2.057720] ([<00000000006678b6>] system_call+0xd6/0x264)
[ 2.057721] Last Breaking-Event-Address:
[ 2.057722] [<00000000001c98ca>] load_module+0x8da/0x1870
[ 2.057723]
[ 2.057724] Kernel panic - not syncing: Fatal exception: panic_on_oops