Re: [tip:x86/security] x86: Add NX protection for kernel data

From: matthieu castet
Date: Thu Jan 20 2011 - 15:23:31 EST


Konrad Rzeszutek Wilk wrote:
On Thu, Jan 20, 2011 at 03:37:36PM +0000, Ian Campbell wrote:
On Thu, 2011-01-20 at 15:06 +0000, Konrad Rzeszutek Wilk wrote:
On Thu, Jan 20, 2011 at 12:18:26PM +0100, castet.matthieu@xxxxxxx wrote:
Quoting Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>:

On Wed, Jan 19, 2011 at 11:59:57PM +0100, matthieu castet wrote:
On Wed, 19 Jan 2011 16:14:32 -0500,
Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
I was just shown this[1] on Xen from an Ubuntu bug report[2].

[ 1.230382] NX-protecting the kernel data: 3884k
[ 1.231002] BUG: unable to handle kernel paging request at c1782ae0
...
[ 1.231145] Call Trace:
[ 1.231152] [<c0138481>] ? __change_page_attr+0x2c1/0x370
[ 1.231161] [<c02163a1>] ? __purge_vmap_area_lazy+0xc1/0x180
[ 1.231169] [<c013857c>] ? __change_page_attr_set_clr+0x4c/0xb0
[ 1.231176] [<c0138838>] ? change_page_attr_set_clr+0x128/0x300
[ 1.231183] [<c010798e>] ? __raw_callee_save_xen_restore_fl+0x6/0x8
[ 1.231192] [<c0159ca1>] ? vprintk+0x171/0x3f0
[ 1.231198] [<c0138bdf>] ? set_memory_nx+0x5f/0x70
If you run it with Xen debugging enabled:

[ 7.753329] NX-protecting the kernel data: 2400k
(XEN) mm.c:2389:d0 Bad type (saw 3c000003 != exp 70000000) for mfn
This happens if ((x & (PGT_type_mask|PGT_pae_xen_l2)) != type)

but
#define PGT_type_mask (7U<<29) /* Bits 29-31. */
#define _PGT_pae_xen_l2 26
#define PGT_pae_xen_l2 (1U<<_PGT_pae_xen_l2)

but (exp type = 0x70000000) & (PGT_type_mask|PGT_pae_xen_l2) = 0x60000000

So the exp type looks strange.
#define _PGT_pinned 28
#define PGT_pinned (1U<<_PGT_pinned)
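
As a worked check of the arithmetic above, here is a small C sketch that applies the comparison mask to the two values from the Xen message. The helper names type_cmp_bits and show_bad_type are hypothetical, made up for illustration; only the PGT_* constants are taken from the definitions quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Constants copied from the Xen definitions quoted in this mail. */
#define PGT_type_mask   (7U << 29)              /* bits 29-31 */
#define _PGT_pae_xen_l2 26
#define PGT_pae_xen_l2  (1U << _PGT_pae_xen_l2)
#define _PGT_pinned     28
#define PGT_pinned      (1U << _PGT_pinned)

/* Hypothetical helper: keep only the bits used in the type comparison. */
static uint32_t type_cmp_bits(uint32_t x)
{
    return x & (PGT_type_mask | PGT_pae_xen_l2);
}

/* Hypothetical helper: print how "saw" and "exp" look after masking. */
static void show_bad_type(uint32_t saw, uint32_t exp)
{
    /* The mask covers bits 29-31 plus bit 26. */
    assert((PGT_type_mask | PGT_pae_xen_l2) == 0xE4000000u);

    printf("saw %08x -> compared bits %08x\n",
           (unsigned)saw, (unsigned)type_cmp_bits(saw));
    printf("exp %08x -> compared bits %08x, pinned bit set: %d\n",
           (unsigned)exp, (unsigned)type_cmp_bits(exp),
           (exp & PGT_pinned) != 0);
}
```

Calling show_bad_type(0x3c000003, 0x70000000) gives compared bits 0x24000000 and 0x60000000. Bit 28 (PGT_pinned) is set in the exp value, which may be where the extra bit outside the mask comes from.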

1355a5 (pfn 15a5)
(XEN) mm.c:889:d0 Error getting mfn 1355a5 (pfn 15a5) from L1 entry 80000001355a5063 for l1e_owner=0, pg_owner=0
(XEN) mm.c:4958:d0 ptwr_emulate: could not get_page_from_l1e()
[ 7.759087] BUG: unable to handle kernel paging request at c82a4d28
[ 7.759087] IP: [<c100608c>] xen_set_pte_atomic+0x21/0x2f
[ 7.759087] *pdpt = 0000000001663001 *pde = 00000000082db067 *pte = 80000000082a4061
.. and the same stack trace.


Does Xen have different size page table allocations or something
weird?
The same page size. Not sure actually why it is being triggered.
Let me copy Keir on this. Keir, the region that is being marked as
_NX is the .bss one, and _past_ __init_end it dies. Any ideas?

Does this happen if you add ". = ALIGN(HPAGE_SIZE);" before the bss section
in arch/x86/kernel/vmlinux.lds.S?
Like this?
Yes
yeeeey...That made it boot.

What's the output of the kernel_page_tables debugfs file?
Shees.. I get

[ 73.723105] BUG: unable to handle kernel paging request at 15555000
[...]
with the patch and if I revert 5bd5a452662bc37c54fb6828db1a3faf87e6511c..

That looks to be another bug to hunt down.

No, that's the same bug: that's the root cause.

For some reason, with Xen, accessing some page tables (bss and after) makes the
system crash.
I think I know the failure in the first case - the swapper_pg_dir is marked as _RO
and you are not supposed to make it _RW (unless you first do a bit of a dance and switch
over to another pagetable). The reason being that Xen has a symbiotic relationship
with PV domains where pagetables are marked _RO so that any update to
them will go through Xen so it can validate that we aren't doing anything stupid.

But accessing the page table should be OK, not sure why it crashed - we
aren't writing anything to it - just reading.

Let me copy Ian on this - he might have better ideas.
It's pretty hard to follow the quoted context above but it certainly
seems plausible that set_memory_nx could inadvertently end up trying to
make a page which Xen made RO into a RW again.

For example the callchain appears to pass through static_protections()
which explicitly makes .data and .bss writeable, I think these regions
can potentially contain page table pages -- e.g. allocated from BRK
perhaps?

They definitely do - it has the level1_ident_pgt, which is definitely used
during bootup.

OK, that makes sense.
Perhaps the fix is when marking NX, just do NX, don't try to set RW if they
are RO.
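
To illustrate the suggestion above (this is a user-space sketch, not the patch mentioned below, and not kernel code): the difference is between setting NX while also forcing the writable bit, and setting NX alone. The names pte_model_t, mark_nx_forcing_rw and mark_nx_only are hypothetical; the bit positions (RW = bit 1, NX = bit 63) are the standard x86 PAE ones. The sample value in the note after the code is the *pte from the oops above with the NX bit cleared, used only as plausible input.

```c
#include <assert.h>
#include <stdint.h>

/* x86 PAE page table entry flag bits. */
#define PAGE_RW (1ULL << 1)    /* writable */
#define PAGE_NX (1ULL << 63)   /* no-execute */

/* Hypothetical stand-in for a 64-bit PAE pte value. */
typedef uint64_t pte_model_t;

/*
 * Simplified model of the problematic behaviour: NX is set, but the
 * entry is also forced writable, which would turn a page that Xen
 * made RO back into RW.
 */
static pte_model_t mark_nx_forcing_rw(pte_model_t pte)
{
    return pte | PAGE_NX | PAGE_RW;
}

/* Model of the suggestion: set NX and leave the RW bit as it was. */
static pte_model_t mark_nx_only(pte_model_t pte)
{
    pte_model_t out = pte | PAGE_NX;

    /* The writable bit must be unchanged by this operation. */
    assert((out & PAGE_RW) == (pte & PAGE_RW));
    return out;
}
```

For the RO entry 0x00000000082a4061, mark_nx_only() yields 0x80000000082a4061 (still RO), whereas mark_nx_forcing_rw() yields 0x80000000082a4063, an RW mapping of the kind Xen refuses for page table pages.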

What do you think of this patch?


Matthieu