Re: [PATCH] mm, kasan: introduce a special shadow value for allocator metadata

From: Andrey Ryabinin
Date: Wed Jun 01 2016 - 11:23:29 EST


On 05/31/2016 08:49 PM, Alexander Potapenko wrote:
> On Tue, May 31, 2016 at 1:52 PM, Andrey Ryabinin
> <aryabinin@xxxxxxxxxxxxx> wrote:
>>
>>
>> On 05/31/2016 01:44 PM, Alexander Potapenko wrote:
>>> Add a special shadow value to distinguish accesses to KASAN-specific
>>> allocator metadata.
>>>
>>> Unlike AddressSanitizer in userspace, KASAN lets the kernel proceed
>>> after a memory error. However, a write to the kmalloc metadata may cause
>>> memory corruptions that will make the tool itself unreliable and induce
>>> crashes later on. Warning about such corruptions will ease the
>>> debugging.
>>
>> It will not. Whether out-of-bounds hits metadata or not is absolutely irrelevant
>> to the bug itself. This information doesn't help to understand, analyze or fix the bug.
>>
> Here's the example that made me think the opposite.
>
> I've been reworking KASAN hooks for mempool and added a test that did
> a write-after-free to an object allocated from a mempool.
> This resulted in flaky kernel crashes somewhere in quarantine
> shrinking after several attempts to `insmod test_kasan.ko`.
> Because there already were numerous KASAN errors in the test, it
> wasn't evident that the crashes were related to the new test, so I
> thought the problem was in the buggy quarantine implementation.
> However, the problem was in fact in the new test, which corrupted the
> quarantine pointer in the object and caused a crash while traversing
> the quarantine list.
>
> My previous experience with userspace ASan shows that crashes in the
> tool code itself puzzle the developers.
> As a result, the users think that the tool is broken and don't believe
> its reports.
>
> I first thought about hardening the quarantine list by checksumming
> the pointers and validating them on each traversal.
> This prevents the crashes, but doesn't give the users any idea about
> what went wrong.
> On the other hand, reporting the pointer corruption right when it happens does.
> Distinguishing between a regular UAF and a quarantine corruption
> (which is what the patch in question is about) helps to prioritize the
> KASAN reports and gives developers a better understanding of the
> consequences.
>

After the first report, memory is in a corrupted state, so we are done here.
Anything that happens after the first report can't be trusted, since it may be an after-effect,
just like in your case. Such crashes are not worth looking at.
An out-of-bounds access that doesn't hit metadata can, like any other memory corruption, also lead to after-effect crashes,
so distinguishing such bugs doesn't make much sense.

The test_kasan module is just a quick hack, written only to make sure that KASAN works.
It does some crappy things and may well lead to a crash itself. So I would recommend an immediate
reboot even after a single attempt to load it.
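
For reference, the pointer-checksumming idea mentioned in the quoted text
could look roughly like the userspace sketch below. It is only an
illustration under stated assumptions: struct qnode, QNODE_SECRET and the
helper names are hypothetical and are not the kernel's quarantine code.
Each node stores a checksum of its next pointer, and the walk stops at the
first node whose checksum no longer matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node; not the kernel's quarantine node type. */
struct qnode {
	struct qnode *next;
	uintptr_t check;	/* checksum covering 'next' */
};

/* Hypothetical constant; a real implementation would pick a random secret. */
static const uintptr_t QNODE_SECRET = (uintptr_t)0x5bd1e995u;

/* Mix in the node's own address so a node copied elsewhere also fails. */
static uintptr_t qnode_checksum(const struct qnode *node)
{
	return (uintptr_t)node->next ^ (uintptr_t)node ^ QNODE_SECRET;
}

/* Every legitimate update of 'next' must go through this helper. */
static void qnode_set_next(struct qnode *node, struct qnode *next)
{
	assert(node != NULL);
	node->next = next;
	node->check = qnode_checksum(node);
}

/*
 * Walk the list, validating each node before following its pointer.
 * Returns the number of intact nodes visited; *corrupted is set to the
 * first node that fails validation, or NULL if the whole list is intact.
 */
static size_t qlist_walk(struct qnode *head, struct qnode **corrupted)
{
	size_t visited = 0;

	*corrupted = NULL;
	for (struct qnode *cur = head; cur != NULL; cur = cur->next) {
		if (cur->check != qnode_checksum(cur)) {
			*corrupted = cur;
			break;
		}
		visited++;
	}
	return visited;
}
```

Note that the check fires only during traversal, not at the time of the
stray write, so it can avoid following a bad pointer but cannot say which
access corrupted the node.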