Re: [Xen-devel] Linux 4.4 MW: Boot under Xen fails with CONFIG_DEBUG_WX enabled: RIP: ptdump_walk_pgd_level_core

From: Sander Eikelenboom
Date: Thu Nov 05 2015 - 04:15:44 EST


On 2015-11-05 00:13, Boris Ostrovsky wrote:
On 11/04/2015 03:02 PM, Sander Eikelenboom wrote:
On 2015-11-04 19:47, Stephen Smalley wrote:
On 11/04/2015 01:28 PM, Sander Eikelenboom wrote:
On 2015-11-04 16:52, Stephen Smalley wrote:
On 11/04/2015 06:55 AM, Sander Eikelenboom wrote:
Hi All,

I just tried to boot the current Linus merge-window tree under Xen.
It fails with a kernel panic at boot when the new "CONFIG_DEBUG_WX"
option is enabled.
Disabling the option makes the kernel boot fine.

The splat:
[ 18.424241] Freeing unused kernel memory: 1104K (ffffffff822fc000 - ffffffff82410000)
[ 18.430314] Write protecting the kernel read-only data: 18432k
[ 18.441054] Freeing unused kernel memory: 1144K (ffff880001ae2000 - ffff880001c00000)
[ 18.447966] Freeing unused kernel memory: 1560K (ffff88000207a000 - ffff880002200000)
[ 18.453947] BUG: unable to handle kernel paging request at ffff88055c883000
[ 18.459943] IP: [<ffffffff8105af8e>] ptdump_walk_pgd_level_core+0x20e/0x440
[ 18.465847] PGD 2212067 PUD 0
[ 18.471564] Oops: 0000 [#1] SMP
[ 18.477248] Modules linked in:
[ 18.482918] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-mw-20151104-linus-doflr+ #1
[ 18.488804] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 18.494778] task: ffff880059b90000 ti: ffff880059b98000 task.ti: ffff880059b98000
[ 18.500852] RIP: e030:[<ffffffff8105af8e>] [<ffffffff8105af8e>] ptdump_walk_pgd_level_core+0x20e/0x440
[ 18.507102] RSP: e02b:ffff880059b9be48 EFLAGS: 00010296
[ 18.513351] RAX: ffff88055c883000 RBX: ffffffff81ae2000 RCX: ffff880000000000
[ 18.519733] RDX: 0000000000000067 RSI: ffff880059b9be98 RDI: ffff880000001000
[ 18.526129] RBP: ffff880059b9bf00 R08: 0000000000000000 R09: 0000000000000000
[ 18.532522] R10: ffff88005fd0e790 R11: 0000000000000001 R12: ffff880080000000
[ 18.538891] R13: ffffc00000000fff R14: ffff880059b9be98 R15: 0000000000000000
[ 18.545247] FS: 0000000000000000(0000) GS:ffff88005f680000(0000) knlGS:0000000000000000
[ 18.551708] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 18.558153] CR2: ffff88055c883000 CR3: 0000000002211000 CR4: 0000000000000660
[ 18.564686] Stack:
[ 18.571106] 0000000159b9be50 ffffffff82211000 ffff88055c884000 0000000000000800
[ 18.577704] 0000800000000000 ffff88055c883000 0000000000000007 ffff88005fd0e790
[ 18.584291] ffff880059b9bed8 ffffffff81156ace 0000000000000001 0000000000000000
[ 18.590916] Call Trace:
[ 18.597458] [<ffffffff81156ace>] ? free_reserved_area+0x11e/0x120
[ 18.604180] [<ffffffff8105b1e2>] ptdump_walk_pgd_level_checkwx+0x12/0x20
[ 18.611014] [<ffffffff810515b9>] mark_rodata_ro+0xe9/0xf0
[ 18.617819] [<ffffffff81ad3380>] ? rest_init+0x80/0x80
[ 18.624512] [<ffffffff81ad3398>] kernel_init+0x18/0xe0
[ 18.631095] [<ffffffff81adadcf>] ret_from_fork+0x3f/0x70
[ 18.637650] [<ffffffff81ad3380>] ? rest_init+0x80/0x80
[ 18.644178] Code: 70 ff ff ff 48 3b 85 58 ff ff ff 0f 84 c0 fe ff ff 48 8b 85 68 ff ff ff 48 c1 e0 10 48 c1 f8 10 48 89 45 b0 48 8b 85 70 ff ff ff <48> 8b 38 48 85 ff 0f 85 4e ff ff ff b9 02 00 00 00 31 d2 4c 89
[ 18.658246] RIP [<ffffffff8105af8e>] ptdump_walk_pgd_level_core+0x20e/0x440
[ 18.665211] RSP <ffff880059b9be48>
[ 18.672073] CR2: ffff88055c883000
[ 18.678852] ---[ end trace d84e34461c40637a ]---
[ 18.685641] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 18.685641]
[ 18.699520] Kernel Offset: disable


What's your .config? Does cat /sys/kernel/debug/kernel_page_tables
produce a similar fault even with CONFIG_DEBUG_WX=n?

.config is attached

Hmm, that debugfs file doesn't seem to exist then:
# cat /sys/kernel/debug/kernel_page_tables
cat: /sys/kernel/debug/kernel_page_tables: No such file or directory

That needs CONFIG_X86_PTDUMP=y.
It also assumes you have debugfs mounted at /sys/kernel/debug.

Recompiled, and the result is that it also blows up:


Can you try this:


diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 1bf417e..b534216 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -362,8 +362,13 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 				       bool checkwx)
 {
 #ifdef CONFIG_X86_64
+/* ffff800000000000 - ffff87ffffffffff is reserved for hypervisor */
+#define is_hypervisor_range(idx) (paravirt_enabled() && \
+		((idx >= pgd_index(__PAGE_OFFSET) - 16) && \
+		(idx < pgd_index(__PAGE_OFFSET))))
 	pgd_t *start = (pgd_t *) &init_level4_pgt;
 #else
+#define is_hypervisor_range(idx) 0
 	pgd_t *start = swapper_pg_dir;
 #endif
 	pgprotval_t prot;
@@ -381,7 +386,7 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 
 	for (i = 0; i < PTRS_PER_PGD; i++) {
 		st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
-		if (!pgd_none(*start)) {
+		if (!pgd_none(*start) && !is_hypervisor_range(i)) {
 			if (pgd_large(*start) || !pgd_present(*start)) {
 				prot = pgd_flags(*start);
 				note_page(m, &st, __pgprot(prot), 1);
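
To see which top-level page-table slots the check in the patch skips, the
index arithmetic can be reproduced in user space. This is only a minimal
sketch, assuming x86-64 with 4-level paging (PGDIR_SHIFT of 39, 512 PGD
entries) and __PAGE_OFFSET = 0xffff880000000000, as I believe was the case
for kernels of this era. The `paravirt` parameter and `pgd_slot_start()` are
hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants: x86-64, 4-level paging, 4.4-era direct-map base. */
#define PGDIR_SHIFT   39                      /* one PGD entry maps 2^39 bytes (512 GiB) */
#define PTRS_PER_PGD  512
#define PAGE_OFFSET   0xffff880000000000ULL   /* assumed value of __PAGE_OFFSET */

/* Same arithmetic as the kernel's pgd_index(): top-level slot for an address. */
static unsigned int pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * User-space stand-in for the macro in the patch. The hypothetical
 * 'paravirt' argument replaces the kernel's paravirt_enabled() call.
 */
static bool is_hypervisor_range(unsigned int idx, bool paravirt)
{
	unsigned int end = pgd_index(PAGE_OFFSET);

	return paravirt && idx >= end - 16 && idx < end;
}

/* Hypothetical helper: lowest canonical address covered by a PGD slot. */
static uint64_t pgd_slot_start(unsigned int idx)
{
	uint64_t addr = (uint64_t)idx << PGDIR_SHIFT;

	if (addr & (1ULL << 47))              /* sign-extend bit 47 */
		addr |= 0xffff000000000000ULL;
	return addr;
}
```

Under these assumptions pgd_index(__PAGE_OFFSET) is 272, so the skipped slots
are 256 through 271. Slot 256 starts at ffff800000000000 and each slot spans
512 GiB, so the 16 slots cover ffff800000000000 - ffff87ffffffffff, which
matches the comment in the patch.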

Hi Boris,

Thanks for your patch!
It makes "cat /sys/kernel/debug/kernel_page_tables" work and
prevents a kernel with CONFIG_DEBUG_WX=y from crashing at boot.

It now gives a warning about an insecure W+X mapping, so CONFIG_DEBUG_WX=y
seems to be working. I have no idea how to interpret it though, or whether
it's a legitimate warning.

--
Sander

[ 19.034706] Freeing unused kernel memory: 1104K (ffffffff822fc000 - ffffffff82410000)
[ 19.041339] Write protecting the kernel read-only data: 18432k
[ 19.052596] Freeing unused kernel memory: 1144K (ffff880001ae2000 - ffff880001c00000)
[ 19.060285] Freeing unused kernel memory: 1560K (ffff88000207a000 - ffff880002200000)
[ 19.067079] ------------[ cut here ]------------
[ 19.073931] WARNING: CPU: 5 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x619/0x7e0()
[ 19.081039] x86/mm: Found insecure W+X mapping at address ffff880000000000/0xffff880000000000
[ 19.088293] Modules linked in:
[ 19.095477] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.3.0-mw-20151104-linus-doflr-withdebugwx-noptdump-ptdumppatch+ #1
[ 19.102971] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 19.110594] ffffffff81f381d3 ffff880059b9bd60 ffffffff81428992 ffff880059b9bda8
[ 19.118239] ffff880059b9bd98 ffffffff810ca7ec ffff8800025d8010 0000000000000004
[ 19.125870] 0010000000000067 ffff880059b9be98 0000000000000000 ffff880059b9bdf8
[ 19.133526] Call Trace:
[ 19.141212] [<ffffffff81428992>] dump_stack+0x44/0x62
[ 19.148983] [<ffffffff810ca7ec>] warn_slowpath_common+0x7c/0xb0
[ 19.156877] [<ffffffff810ca867>] warn_slowpath_fmt+0x47/0x50
[ 19.164813] [<ffffffff8105abb9>] note_page+0x619/0x7e0
[ 19.172656] [<ffffffff8105b135>] ptdump_walk_pgd_level_core+0x3b5/0x450
[ 19.180583] [<ffffffff8105b1f2>] ptdump_walk_pgd_level_checkwx+0x12/0x20
[ 19.188427] [<ffffffff810515b9>] mark_rodata_ro+0xe9/0xf0
[ 19.196221] [<ffffffff81ad2f70>] ? rest_init+0x80/0x80
[ 19.204012] [<ffffffff81ad2f88>] kernel_init+0x18/0xe0
[ 19.211732] [<ffffffff81ada98f>] ret_from_fork+0x3f/0x70
[ 19.219467] [<ffffffff81ad2f70>] ? rest_init+0x80/0x80
[ 19.227145] ---[ end trace 5fc0f4297b911570 ]---
[ 19.244382] x86/mm: Checked W+X mappings: FAILED, 4602 W+X pages found.
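
One way to follow up on a warning like this is to look for the offending
ranges in /sys/kernel/debug/kernel_page_tables. The following is a minimal
sketch, assuming each mapping line in that file starts with "0x", shows "RW"
for writable ranges, and shows "NX" only for non-executable ones. That is my
understanding of the x86 dump format and is not confirmed anywhere in this
thread. `has_token()` and `line_is_wx()` are hypothetical helper names.

```c
#include <stdbool.h>
#include <string.h>

/* True if 'tok' occurs in 'line' as a whole whitespace-delimited token. */
static bool has_token(const char *line, const char *tok)
{
	size_t n = strlen(tok);
	const char *p = line;

	while ((p = strstr(p, tok)) != NULL) {
		bool start_ok = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		bool end_ok = p[n] == '\0' || p[n] == ' ' ||
			      p[n] == '\t' || p[n] == '\n';

		if (start_ok && end_ok)
			return true;
		p += n;
	}
	return false;
}

/*
 * Assumed format: a mapping line is W+X if it is writable ("RW")
 * and does not carry the "NX" flag.
 */
static bool line_is_wx(const char *line)
{
	if (strncmp(line, "0x", 2) != 0)   /* skip "---[ ... ]---" section markers */
		return false;
	return has_token(line, "RW") && !has_token(line, "NX");
}
```

Calling line_is_wx() on each line read with fgets() from the debugfs file (as
root, with CONFIG_X86_PTDUMP=y) would list the candidate W+X ranges, which
could then be compared against the address reported in the warning.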
