Re: [RFC][PATCH 7/7] kref: Implement using refcount_t

From: Kees Cook
Date: Wed Nov 16 2016 - 13:55:30 EST

On Wed, Nov 16, 2016 at 2:15 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, Nov 16, 2016 at 09:31:55AM +0100, Ingo Molnar wrote:
>> * Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> > On Tue, Nov 15, 2016 at 11:16 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > >
>> > >
>> > > On 15 November 2016 19:06:28 CET, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> > >
>> > >>I'll want to modify this in the future; I have a config already doing
>> > >>"Bug on data structure corruption" that makes the warn/bug choice.
>> > >>It'll need some massaging to fit into the new refcount_t checks, but
>> > >>it should be okay -- there needs to be a way to complete the
>> > >>saturation, etc, but still kill the offending process group.
>> > >
>> > > Ideally we'd create a new WARN like construct that continues in kernel space
>> > > and terminates the process on return to user. That way there would be minimal
>> > > kernel state corruption.
>>
>> Yeah, so the problem is that sometimes you are p0wned the moment you return to a
>> corrupted stack, and some of these checks only detect corruption after the fact.
>
> So the case here is about refcounts, with the saturation semantics we
> avoid the use-after-free case which is all this is about. So actually
> continuation of execution is harmless vs the attack vector in question.
>
> Corrupting the stack is another attack vector, one that refcount
> overflow is entirely unrelated to and not one I think we should consider
> here.
>
> The problem with BUG and insta killing the task is that refcounts are
> typically done under locks, if you kill the task before the unlock,
> you've wrecked kernel state in unrecoverable ways.

My intention with what I'm designing is to couple the "panic_on_oops"
sysctl logic with a "kernel structure corruption has been detected"
warning. That way, one can select, at runtime, whether the kernel should
panic instantly on hitting this, or just do its best to clean things
up and kill the process. There basically isn't a use-case for BUG in
this situation. Either you're risk-averse enough to want to take the
entire machine down, or you want to kill the offending process and
clean up to continue running.
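
As a rough illustration of that runtime choice (a userspace sketch only,
not code from any posted patch): every name below is hypothetical, and
the boolean parameter merely stands in for the value of the
panic_on_oops sysctl. The kill is recorded as a flag rather than
performed on the spot, so the caller can still drop its locks first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of a "kernel structure corruption detected" report. */
enum corruption_action {
	CORRUPTION_PANIC,	/* risk-averse: take the whole machine down */
	CORRUPTION_KILL_TASK,	/* clean up, kill the offender, keep running */
};

/* Hypothetical stand-in for per-task state; not the kernel's task_struct. */
struct task_sketch {
	bool kill_on_return;	/* terminate once it is safe to do so */
};

/*
 * Decide what to do when corruption is detected. The caller passes the
 * current panic_on_oops setting. There is deliberately no BUG-style
 * "kill right here" outcome, since that could strand held locks.
 */
static enum corruption_action
report_corruption(struct task_sketch *task, bool panic_on_oops_set)
{
	if (panic_on_oops_set)
		return CORRUPTION_PANIC;

	if (task != NULL)
		task->kill_on_return = true;	/* deferred, not immediate */

	return CORRUPTION_KILL_TASK;
}
```

A caller would finish its critical section, unlock, and only then act on
kill_on_return.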

I'm still working out how best to do it, and right now it's a rather
large hammer (currently controlled by a config option,
CONFIG_BUG_ON_DATA_CORRUPTION, in -next, though that option will likely
disappear entirely as the design evolves). I intend to improve it first and
then expand its coverage in the kernel. It requires extracting some of
the per-arch BUG logic into a real kernel API, and combining it with
existing pieces of the WARN API.
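
For reference, the saturation semantics Peter describes above can be
sketched in userspace C11 roughly as follows. This is an assumption-laden
illustration, not the refcount_t code from the patch series: the
sat_ref_* names are made up, and the real implementation also warns when
it hits these cases. The point is only that a counter which sticks at its
maximum can never wrap to zero, so an overflow becomes a leak instead of
a use-after-free.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical saturating reference count (not the kernel's refcount_t). */
struct sat_ref {
	atomic_uint count;
};

static void sat_ref_init(struct sat_ref *r, unsigned int n)
{
	atomic_init(&r->count, n);
}

/* Take a reference; returns false if the object is already dead. */
static bool sat_ref_inc(struct sat_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	for (;;) {
		if (old == 0)
			return false;	/* freed object: refuse to revive it */
		if (old == UINT_MAX)
			return true;	/* saturated: stays pinned forever */
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
}

/* Drop a reference; returns true only when the caller should free. */
static bool sat_ref_dec_and_test(struct sat_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	for (;;) {
		if (old == UINT_MAX)
			return false;	/* saturated: leak rather than free */
		if (old == 0)
			return false;	/* underflow: refuse to wrap around */
		if (atomic_compare_exchange_weak(&r->count, &old, old - 1))
			return old == 1;
	}
}
```

Since execution simply continues after saturation, whatever lock the
caller holds is released normally.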


Kees Cook
Nexus Security