Re: [PATCH v3 2/6] exit: Put an upper limit on how often we can oops

From: SeongJae Park
Date: Thu Jan 19 2023 - 23:29:04 EST


Hello,

On Thu, 17 Nov 2022 15:43:22 -0800 Kees Cook <keescook@xxxxxxxxxxxx> wrote:

> From: Jann Horn <jannh@xxxxxxxxxx>
>
> Many Linux systems are configured to not panic on oops; but allowing an
> attacker to oops the system **really** often can make even bugs that look
> completely unexploitable exploitable (like NULL dereferences and such) if
> each crash elevates a refcount by one or a lock is taken in read mode, and
> this causes a counter to eventually overflow.
>
> The most interesting counters for this are 32 bits wide (like open-coded
> refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
> platforms is just 16 bits, but probably nobody cares about 32-bit platforms
> that much nowadays.)
>
> So let's panic the system if the kernel is constantly oopsing.
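
For readers skimming the thread, here is a toy userspace model of the idea: count every oops globally and panic once a threshold is reached. This is only a sketch, not the patch's code. The names `OopsLimiter`, `KernelPanic` and `record_oops`, and the limit used in the usage note, are hypothetical and chosen for illustration.

```python
class KernelPanic(Exception):
    """Stand-in for panic(); hypothetical, for illustration only."""


class OopsLimiter:
    """Toy model of a global oops counter with a panic threshold."""

    def __init__(self, limit):
        self.limit = limit  # assumed threshold, not a value from the patch
        self.count = 0

    def record_oops(self):
        # Called once per oops. Below the limit the "system" keeps running;
        # once the count reaches the limit we give up and panic.
        self.count += 1
        if self.count >= self.limit:
            raise KernelPanic(f"oopsed too often ({self.count} times)")
```

With `OopsLimiter(limit=3)`, the first two `record_oops()` calls return normally and the third raises `KernelPanic`, so no counter that moves once per oops can get anywhere near wrapping as long as the limit is far below 2^32.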
>
> The speed of oopsing 2^32 times probably depends on several factors, like
> how long the stack trace is and which unwinder you're using; an empirically
> important one is whether your console is showing a graphical environment or
> a text console that oopses will be printed to.
> In a quick single-threaded benchmark, it looks like oopsing in a vfork()
> child with a very short stack trace only takes ~510 microseconds per run
> when a graphical console is active; but switching to a text console that
> oopses are printed to slows it down around 87x, to ~45 milliseconds per
> run.
> (Adding more threads makes this faster, but the actual oops printing
> happens under &die_lock on x86, so you can maybe speed this up by a factor
> of around 2 and then any further improvement gets eaten up by lock
> contention.)
>
> It looks like it would take around 8-12 days to overflow a 32-bit counter
> with repeated oopsing on a multi-core X86 system running a graphical
> environment; both me (in an X86 VM) and Seth (with a distro kernel on
> normal hardware in a standard configuration) got numbers in that ballpark.
>
> 12 days aren't *that* short on a desktop system, and you'd likely need much
> longer on a typical server system (assuming that people don't run graphical
> desktop environments on their servers), and this is a *very* noisy and
> violent approach to exploiting the kernel; and it also seems to take orders
> of magnitude longer on some machines, probably because stuff like EFI
> pstore will slow it down a ton if that's active.
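
As a rough sanity check of the numbers quoted above, the arithmetic can be sketched as follows. It assumes the per-oops costs from the commit message (~510 microseconds with a graphical console, ~45 milliseconds with a text console) and treats the "factor of around 2" from extra threads as an exact 2x speedup, which is a simplification. The function name `days_to_overflow` is made up for this example.

```python
SECONDS_PER_DAY = 86_400


def days_to_overflow(seconds_per_oops, counter_bits=32, speedup=1.0):
    """Days needed to oops 2**counter_bits times at the given cost per oops."""
    oopses = 2 ** counter_bits
    return oopses * seconds_per_oops / speedup / SECONDS_PER_DAY


# Single-threaded, graphical console: roughly 25 days.
graphical = days_to_overflow(510e-6)
# Same, assuming threads give an exact 2x speedup: roughly 12.7 days.
graphical_mt = days_to_overflow(510e-6, speedup=2.0)
# Single-threaded, text console: roughly 2237 days, i.e. about six years.
text_console = days_to_overflow(45e-3)

print(f"{graphical:.1f} {graphical_mt:.1f} {text_console:.0f}")
```

Under these assumptions the multi-threaded graphical case lands close to the upper end of the 8-12 day range reported above, while the text-console case is far slower.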

I found a blog article[1] that recommends backporting this patch to LTS
kernels. It says:

    "While this patch is already upstream, it is important that distributed
    kernels also inherit this oops limit and backport it to LTS releases if we
    want to avoid treating such null-dereference bugs as full-fledged security
    issues in the future."

Do you plan to backport this to the upstream LTS kernels?

[1] https://googleprojectzero.blogspot.com/2023/01/exploiting-null-dereferences-in-linux.html


Thanks,
SJ

>
> Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
> Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@xxxxxxxxxx
> Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>