Re: [PATCH v6 15/18] khwasan, arm64: add brk handler for inline instrumentation
From: Dmitry Vyukov
Date: Thu Sep 13 2018 - 04:37:25 EST
On Wed, Sep 12, 2018 at 7:39 PM, Jann Horn <jannh@xxxxxxxxxx> wrote:
> On Wed, Sep 12, 2018 at 7:16 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> On Wed, Aug 29, 2018 at 1:35 PM, Andrey Konovalov <andreyknvl@xxxxxxxxxx> wrote:
> [...]
>> > +static int khwasan_handler(struct pt_regs *regs, unsigned int esr)
>> > +{
>> > + bool recover = esr & KHWASAN_ESR_RECOVER;
>> > + bool write = esr & KHWASAN_ESR_WRITE;
>> > + size_t size = KHWASAN_ESR_SIZE(esr);
>> > + u64 addr = regs->regs[0];
>> > + u64 pc = regs->pc;
>> > +
>> > + if (user_mode(regs))
>> > + return DBG_HOOK_ERROR;
>> > +
>> > + kasan_report(addr, size, write, pc);
>> > +
>> > + /*
>> > + * The instrumentation allows to control whether we can proceed after
>> > + * a crash was detected. This is done by passing the -recover flag to
>> > + * the compiler. Disabling recovery allows to generate more compact
>> > + * code.
>> > + *
>> > + * Unfortunately disabling recovery doesn't work for the kernel right
>> > + * now. KHWASAN reporting is disabled in some contexts (for example when
>> > + * the allocator accesses slab object metadata; same is true for KASAN;
>> > + * this is controlled by current->kasan_depth). All these accesses are
>> > + * detected by the tool, even though the reports for them are not
>> > + * printed.
>> > + *
>> > + * This is something that might be fixed at some point in the future.
>> > + */
>> > + if (!recover)
>> > + die("Oops - KHWASAN", regs, 0);
>>
>> Why die and not panic? Die seems to be much less used function, and it
>> calls panic anyway, and we call panic in kasan_report if panic_on_warn
>> is set.
>
> die() is vaguely equivalent to BUG(); die() and BUG() normally only
> terminate the current process, which may or may not leave the system
> somewhat usable, while panic() always brings down the whole system.
> AFAIK panic() shouldn't be used unless you're in some very low-level
> code where you know that trying to just kill the current process can't
> work and the entire system is broken beyond repair.
>
> If KASAN traps on some random memory access, there's a good chance
> that just killing the current process will allow at least parts of the
> system to continue. I'm not sure whether BUG() or die() is more
> appropriate here, but I think it definitely should not be a panic().
Nick, do you know if die() will be enough to catch problems on Android
phones? panic_on_warn would turn this into panic, but I guess one does
not want panic_on_warn on a canary phone.