Re: [kernel-hardening] [PATCH 0/2] introduce post-init read-only memory
From: Andy Lutomirski
Date: Thu Nov 26 2015 - 11:12:09 EST
On Thu, Nov 26, 2015 at 12:54 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * PaX Team <pageexec@xxxxxxxxxxx> wrote:
>
>> On 25 Nov 2015 at 10:13, Mathias Krause wrote:
>>
>> > I myself had some educating experience seeing my machine triple fault
>> > when resuming from a S3 sleep. The root cause was a variable that was
>> > annotated __read_only but that was (unnecessarily) modified during CPU
>> > bring-up phase. Debugging that kind of problem is sort of a PITA, you
>> > could imagine.
>
> ( Sidenote: I don't think ro-faults typically result in triple faults, but yeah,
> even having a regular oops (followed by a hang or reboot) during such an
> undebuggable state of the system is a major PITA. )
>
>> actually the kernel could silently recover from this given how the page fault
>> handler could easily determine that the fault address fell into the
>> data..read_only section and just silently undo the read-only property, log the
>> event to dmesg and retry the faulting access.
>
> So a safer method would be to decode the faulting instruction, to skip it by
> fixing up the return RIP and to log the event. It would be mostly equivalent to
> trying to write to ROM (which gets ignored as well), so it's a recoverable (and
> debuggable) event.
>
> We have all the necessary code in place in the kprobes code, see
> arch/x86/lib/insn.c, it's a simplified x86 decoder that knows about instruction
> length (but not about semantics).
>
> Simple skipping plus setting arithmetic flags to init value should be enough I
> think: I don't think we use fancy instructions to write to ro variables, such as
> PUSH/POP with other side effects. If such instructions exist we could minimally
> extend the decoder to do those fixups as well - in addition to double checking
> that we skip simple instructions only with no side effects.
>
> Can you see any fragility in such a technique?
>
After Linus shot down my rdmsr/wrmsr decoding patch, good luck...

More seriously, though, I think this is mostly just like any other
in-kernel fault. We failed, we might be under attack, let's oops. In
the particular case of suspend/resume, we could consider a debug flag
to allow writes to these variables during suspend/resume. In fact,
that might even be a reasonable default. We might want to allow
writes during module unload as well.
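
To make that policy concrete, here is a rough user-space sketch of the
decision only. It is not kernel code: every name in it (ro_policy,
sys_phase, ro_write_fault_action and so on) is made up for
illustration, and it assumes the fault handler already knows the
bounds of the read-only section and which phase the system is in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical system phases; these are not real kernel symbols. */
enum sys_phase {
	PHASE_RUNNING,
	PHASE_SUSPEND_RESUME,
	PHASE_MODULE_UNLOAD,
};

/* What to do with a write fault. */
enum ro_fault_action {
	RO_FAULT_OOPS,		/* treat it like any other in-kernel fault */
	RO_FAULT_ALLOW_AND_LOG,	/* let the write through and log it */
};

/* Assumed inputs: section bounds plus the two proposed knobs. */
struct ro_policy {
	uintptr_t ro_start;		/* first byte of the section */
	uintptr_t ro_end;		/* one past the last byte */
	bool allow_in_suspend_resume;	/* the proposed debug flag */
	bool allow_in_module_unload;	/* the possible second exception */
};

static bool addr_in_ro_section(const struct ro_policy *p, uintptr_t addr)
{
	return addr >= p->ro_start && addr < p->ro_end;
}

/* Decide how to handle a write fault at addr during the given phase. */
static enum ro_fault_action
ro_write_fault_action(const struct ro_policy *p, enum sys_phase phase,
		      uintptr_t addr)
{
	/* Faults outside the section get no special treatment. */
	if (!addr_in_ro_section(p, addr))
		return RO_FAULT_OOPS;

	if (phase == PHASE_SUSPEND_RESUME && p->allow_in_suspend_resume)
		return RO_FAULT_ALLOW_AND_LOG;

	if (phase == PHASE_MODULE_UNLOAD && p->allow_in_module_unload)
		return RO_FAULT_ALLOW_AND_LOG;

	/* We failed, we might be under attack: oops. */
	return RO_FAULT_OOPS;
}
```

With allow_in_suspend_resume set, a write into the section during
resume would be allowed and logged, while the same write during
normal operation would still oops.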

For everything else, we should probably focus more on getting OOPSes
to display reliably, which is supposed to work but, on my shiny new
i915-based laptop, is clearly not ready yet (I oopsed it yesterday due
to my own bug and all I had to show for it was a blinking capslock
key, and yes, modesetting works).

--Andy