Re: [Bug #11035] System hangs on 2.6.26-rc8

From: Vegard Nossum
Date: Fri Jul 18 2008 - 03:28:43 EST


On Fri, Jul 18, 2008 at 9:11 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
> Vegard - would it be possible to make DEBUG_PAGEALLOC faults single-shot
> and non-fatal, just like kmemcheck does it? That way people would see a
> nice kernel message instead of an immediate crash. That means we'd have
> to find a reliable filter for DEBUG_PAGEALLOC-provoked pagefaults though
> ...

Hm.. Yes, we could do it in a similar fashion using single-stepping.
It should take little effort; we already have most of the code to do
it; mmiotrace does the same thing too, after all.
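
As a rough illustration of the single-shot, single-stepping flow, here
is a user-space simulation. It is only a sketch: the struct and function
names (sim_page, sim_cpu, sim_page_fault, sim_debug_trap) are made up
for this example and are not kernel APIs. In a real x86 handler the
"single_step" flag would correspond to setting the trap flag in the
saved registers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a page and a CPU; not kernel data structures. */
struct sim_page {
    bool present;        /* mapped in the page tables right now */
    bool debug_unmapped; /* unmapped on purpose by the debug allocator */
};

struct sim_cpu {
    bool single_step;          /* stands in for the x86 trap flag */
    struct sim_page *stepping; /* page to hide again after one instruction */
};

static int reports_printed;

/* Returns true if the fault was ours and has been handled non-fatally. */
static bool sim_page_fault(struct sim_cpu *cpu, struct sim_page *page)
{
    if (page->present || !page->debug_unmapped)
        return false; /* not a debug-provoked fault; take the normal path */

    if (reports_printed == 0) { /* single-shot: only the first hit is logged */
        reports_printed++;
        fprintf(stderr, "access to debug-unmapped page detected\n");
    }

    page->present = true;    /* let the faulting instruction complete */
    cpu->single_step = true; /* ...but trap right after it */
    cpu->stepping = page;
    return true;
}

/* Runs after the single-stepped instruction: hide the page again. */
static void sim_debug_trap(struct sim_cpu *cpu)
{
    if (cpu->stepping == NULL)
        return;
    cpu->stepping->present = false;
    cpu->stepping = NULL;
    cpu->single_step = false;
}
```

The point of the design is that the page is only mapped for the duration
of one instruction, so later accesses still fault and can still be
caught, while the machine keeps running.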

These are some considerations:

1. If the page is kernel space but currently unmapped, does it point
to a valid page of RAM even though it is non-present?
2. Should we allow reading/writing of the underlying physical page (if
it exists), or should we prevent writes (i.e. allow the instruction to
proceed, but don't really write anything) and reads (i.e. let the
instruction read 0 or another magic number)?
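
To make the two options in point 2 concrete, here is a small user-space
sketch. Everything in it is an assumption for illustration: the policy
names, the helper functions, and the choice of 0 as the magic read
value. A NULL backing pointer models the case from point 1 where no
valid page of RAM sits behind the address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policies for an access to a debug-unmapped page. */
enum access_policy {
    ACCESS_PASS_THROUGH, /* touch the underlying physical page */
    ACCESS_SUPPRESS      /* drop writes, return a magic value on reads */
};

#define MAGIC_READ_VALUE 0u /* assumed; "0 or another magic number" */

/* backing may be NULL when there is no RAM behind the address. */
static uint8_t sim_read(const uint8_t *backing, enum access_policy policy)
{
    if (policy == ACCESS_PASS_THROUGH && backing != NULL)
        return *backing;
    return MAGIC_READ_VALUE;
}

static void sim_write(uint8_t *backing, uint8_t value,
                      enum access_policy policy)
{
    /* Under ACCESS_SUPPRESS the instruction "completes" but stores nothing. */
    if (policy == ACCESS_PASS_THROUGH && backing != NULL)
        *backing = value;
}
```

Passing accesses through keeps the buggy code behaving as it would
without the debug option, while suppressing them protects whoever owns
the page now; which is preferable is exactly the open question.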

For the filter you mentioned, we could perhaps use one more bit in the
PTE. This is what we do for kmemcheck, and IIRC DEBUG_PAGEALLOC is
incompatible with kmemcheck anyway (I don't remember why exactly), so
we could reuse the same bit.
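
A minimal sketch of such a filter, written as plain user-space C rather
than kernel code. The names (PTE_HIDDEN, pte_hide, pte_show,
fault_is_debug_provoked) and the choice of bit 11 are assumptions for
this example; the only hardware fact relied on is that bit 0 of an x86
PTE is the present bit.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pte_t;

#define PTE_PRESENT (UINT64_C(1) << 0)  /* x86 hardware present bit */
#define PTE_HIDDEN  (UINT64_C(1) << 11) /* assumed software-available bit */

/* Unmap the page for debugging, but remember that we did it on purpose. */
static pte_t pte_hide(pte_t pte)
{
    return (pte & ~PTE_PRESENT) | PTE_HIDDEN;
}

/* Make the page present again and clear the marker. */
static pte_t pte_show(pte_t pte)
{
    return (pte & ~PTE_HIDDEN) | PTE_PRESENT;
}

/* The filter: non-present AND marked means the fault was provoked by us. */
static bool fault_is_debug_provoked(pte_t pte)
{
    return !(pte & PTE_PRESENT) && (pte & PTE_HIDDEN);
}
```

Because the marker lives in the PTE itself, the fault handler can tell a
deliberately hidden page from a genuinely unmapped address with a single
page-table lookup.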

BTW, I hadn't considered that argument (of continuing as far as
possible) before, but it's a good one; if we don't crash completely,
the user can still copy the log and we get a better report of it. I
guess kerneloops.org is currently missing out on a great many reports
from bugs that shut down the machine immediately, without a chance for
anything to go into the log.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036