Re: [PATCH] lib/strscpy: avoid KASAN false positive
From: Dmitry Vyukov
Date: Wed Jul 19 2017 - 03:46:42 EST
On Wed, Jul 19, 2017 at 12:35 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Jul 18, 2017 at 3:04 PM, Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> We could just remove all that word-at-a-time logic. Do we have any
>> evidence that this would harm anything?
>
> One option would be to simply make reads more permissive, and only
> check the first byte of the read (unlike the last, which is what it
> does now). Obviously within the normal KASAN granularity level (which
> I think is 8 byte aligned anyway).
>
> Checking *writes* more carefully is obviously a good idea regardless.
> But the whole "do big reads" is pretty common for a lot of memory and
> string copies.
>
> So maybe the whole KASAN_SHADOW_MASK thing could be limited to just
> the write side?
>
> That would certainly simplify things. How much it would hurt coverage
> would be something that the people who have analyzed KASAN reports so
> far would have to judge. Have any of the existing KASAN reports been
> due to the KASAN_SHADOW_SCALE level read tests?
We've seen a bunch of such out-of-bounds KASAN reports -- on strings,
network packets, other binary data. I would be very much against
relaxing the general C rules for the two cases of formally incorrect C
code.
I see 3 ways out:
1. What Andrew mentioned -- don't do out of bounds reads. Solves the
problem with perfectly clean code.
I guess now people may ask for benchmarks for the change that removes
the optimization. But looking at the change that introduced it, it was
never benchmarked in the first place. And the description actually
explicitly says that we do this just-in-case and that this is not
performance critical:
- The implementation should be reasonably performant on all
platforms since it uses the asm/word-at-a-time.h API rather than
simple byte copy. Kernel-to-kernel string copy is not considered
to be performance critical in any case.
2. Assuming we want to have both out-of-bounds reads and KASAN, deal
with it on a case-by-base basis.
For strscpy, not checking accesses at all looks like the worst option
because we will miss off-by-page accesses and use-after-free's (as
Andrey already mentioned).
Introducing read_partial_nocheck looks like the second worst, because
it introduces a one off interface routine.
Between setting max=0 and using READ_ONCE_NOCHECK+kasan_check_read I
don't have a strong preference. Setting max=0 looks simpler and less
intrusive, though.
And, yes, what we do in dcache looks worse because (1), it's
higher-level code rather than a low-level routine (and we generally
don't want to sprinkle KASAN-anything in high-level code) and (2)
unpoisoning the trailer unpoisons it for all accesses (even the ones
that unintentionally access out-of-bounds).
3. Adapt kernel memory protection model. Currently it assumes that an
arch provides protection only on page granularity. But KASAN is
effectively an arch with ultimate segmentation where each heap block
gets an own segment with precise size (equivalent to the abstract C
machine as well). So we could introduce PROTECTION_GRANULARITY
constant which is usually PAGE_SIZE, but 1 for KASAN. Then we can
write clean code based on this model. But I am not sure we want to go
this route while we have only 2 cases. And effectively we will still
have different code executed under KASAN.