Re: [PATCH RFC] x86: check for and defend against BIOS memorycorruption

From: Hugh Dickins
Date: Mon Sep 08 2008 - 07:36:21 EST


On Fri, 29 Aug 2008, Jeremy Fitzhardinge wrote:
> Hugh Dickins wrote:
>
> > Is this the right moment for me to mention again that I'm not sure
> > your reuse of existing pagetables was quite right anyway: NX being
> > excluded from level2_ident_pgt, but wanted in the direct map?
>
> We could add NX. What's the behaviour of setting NX in a
> non-NX-supporting CPU? I don't think it would trigger a "reserved bit"
> exception (the other high pte flags don't). Or failing that, we could
> mask out NX once we've worked out the CPU doesn't support it (at the
> same time it relocates the pagetables to the kernel's load-time address).

I've no experience of what happens if NX is set to a non-NX-supporting
CPU, so can't advise on that at all. But I think you're looking at
it the wrong way round - or else I am.

Here's the declaration and comment on level2_ident_pgt in head_64.S:

NEXT_PAGE(level2_ident_pgt)
/* Since I easily can, map the first 1G.
* Don't set NX because code runs from these pages.
*/
PMDS(0, __PAGE_KERNEL_LARGE_EXEC, PTRS_PER_PMD)

(The "Since I easily can" comment is there because it used to map
only a subset needed for early kernel startup, not the whole 1G.)

So it's very intentionally leaving NX out there. I believe the
level2_ident_pgt page appears twice or more in the pagetable layout,
used to map two or more areas of virtual address space - once to
provide the direct map at ffff880000000000 and once to provide
the kernel image virtual mapping at ffffffff80200000.

I think we don't want NX on Linux kernel text ;-?

Before your 2.6.27-rc changes, init_memory_mapping subsequently
replaced the direct map usage by a separately constructed pagetable,
similar to it but with NX set throughout. After your 2.6.27-rc
changes, level2_ident_pgt is found there already so left untouched -
leaving NX out of that first 1G of the direct map forever (when CPA
splits it up, the smaller pages inherit the lack of NX too).

I noticed when changing /proc/meminfo to show DirectMap in kB,
and tried to fix it, but only gave myself a non-booting system
(I probably shrank my direct map to 0 while implicitly using it).
And at that stage I didn't realize at all that it was a recent
regression arising from your mods.

I'm reluctant to delve in there again at present, and unsure
what the right fix should be: perhaps the code which checks if
an entry is already there, should check if it has the desired
flags?

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/