Re: [PATCH v3 05/12] arm64: csum: Disable KASAN for do_csum()

From: Arnd Bergmann
Date: Wed Apr 15 2020 - 15:13:13 EST


On Wed, Apr 15, 2020 at 7:28 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> Hi Will,
>
> On Wed, Apr 15, 2020 at 05:52:11PM +0100, Will Deacon wrote:
> > do_csum() over-reads the source buffer and therefore abuses
> > READ_ONCE_NOCHECK() to avoid tripping up KASAN. In preparation for
> > READ_ONCE_NOCHECK() becoming a macro, and therefore losing its
> > '__no_sanitize_address' annotation, just annotate do_csum() explicitly
> > and fall back to normal loads.
>
> I'm confused by this. The whole point of READ_ONCE_NOCHECK() is that it
> isn't checked by KASAN, so if that semantic is removed it has no reason
> to exist.
>
> Changing that will break the unwind/stacktrace code across multiple
> architectures. IIRC they use READ_ONCE_NOCHECK() for two reasons:
>
> 1. Races with concurrent modification, as might happen when a thread's
> stack is corrupted. Allowing the unwinder to bail out after a sanity
> check means the resulting report is more useful than a KASAN splat in
> the unwinder. I made the arm64 unwinder robust to this case.
>
> 2. I believe that the frame record itself /might/ be poisoned by KASAN,
> since it's not meant to be an accessible object at the C language
> level. I could be wrong about this, and would have to check.

I thought the main reason was deadlocks when a READ_ONCE()
is called inside of code that is part of the KASAN handling. If
READ_ONCE() ends up recursively calling itself, the kernel
tends to crash once it overflows its stack.
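
To make the recursion concern concrete, here is a toy user-space model.
It is not kernel code, and every name in it (read_checked, read_nocheck,
check_access, probe, DEPTH_LIMIT) is made up for illustration. The
"checker" stands in for the KASAN handling code, and DEPTH_LIMIT stands
in for the point where a real stack would overflow:

```c
#include <assert.h>

#define DEPTH_LIMIT 8  /* stand-in for "the stack overflows here" */

static int checker_uses_checked_read; /* 1: checker re-enters itself */
static int depth, max_depth;          /* current / deepest checker nesting */
static unsigned long checker_state = 1;

static unsigned long read_checked(const unsigned long *p);

/* Plain volatile load that never calls into the checker. */
static unsigned long read_nocheck(const unsigned long *p)
{
	return *(const volatile unsigned long *)p;
}

/* Hypothetical checker that has to read some state of its own. */
static void check_access(const void *addr)
{
	(void)addr;
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	if (depth < DEPTH_LIMIT) {
		if (checker_uses_checked_read)
			(void)read_checked(&checker_state); /* recurses */
		else
			(void)read_nocheck(&checker_state); /* stops here */
	}
	depth--;
}

/* Instrumented load: runs the checker first, then loads. */
static unsigned long read_checked(const unsigned long *p)
{
	check_access(p);
	return *(const volatile unsigned long *)p;
}

/* Returns the deepest checker nesting seen for one instrumented read. */
int probe(int uses_checked)
{
	unsigned long v = 42;
	unsigned long got;

	checker_uses_checked_read = uses_checked;
	depth = max_depth = 0;
	got = read_checked(&v);
	assert(got == 42);
	return max_depth;
}
```

With the unchecked read inside the checker, the nesting stays at one
level. With the checked read, it runs until the artificial limit, which
is the behaviour I had in mind above.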

> I would like to keep the unwinding robust in the first case, even if the
> second case doesn't apply, and I'd prefer to not mark the entirety of
> the unwinding code as unchecked as that's sufficiently large and subtle
> that it could have nasty bugs.
>
> Is there any way we keep something like READ_ONCE_NOCHECK() around even
> if we have to give it reduced functionality relative to READ_ONCE()?
>
> I'm not entirely sure why READ_ONCE_NOCHECK() had to go, so if there's a
> particular pain point I'm happy to take a look.

As I understood it, only this particular instance was removed, not all of them.
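
For reference, the shape of that one change, as I read the patch
description, is roughly the following. This is a user-space sketch only:
sum_bytes_wordwise() is a hypothetical stand-in that computes a plain
byte sum, not the real do_csum() or the Internet checksum, and the
NO_ASAN macro is my own spelling of the compiler attribute. The demo
keeps the over-read inside a padded array so that it stays well-defined
outside the kernel:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
/* Ask the compiler not to add address-sanitizer checks in this function. */
#define NO_ASAN __attribute__((no_sanitize_address))
#else
#define NO_ASAN
#endif

/*
 * Hypothetical stand-in for do_csum(): walks the buffer in aligned
 * 8-byte chunks, so the first and last chunk may cover bytes outside
 * [buf, buf + len). Those extra bytes are read but never summed.
 */
NO_ASAN unsigned long sum_bytes_wordwise(const unsigned char *buf, size_t len)
{
	uintptr_t start = (uintptr_t)buf;
	uintptr_t end = start + len;
	uintptr_t p = start & ~(uintptr_t)7; /* round down to 8 bytes */
	unsigned long sum = 0;

	for (; p < end; p += 8) {
		unsigned char w[8];

		memcpy(w, (const void *)p, sizeof(w)); /* may over-read */
		for (int i = 0; i < 8; i++)
			if (p + i >= start && p + i < end)
				sum += w[i];
	}
	return sum;
}

/* Returns 1 if the chunked sum matches a plain byte-by-byte sum. */
int overread_demo_ok(void)
{
	unsigned char storage[32];
	const size_t off = 11, len = 13; /* at least 7 pad bytes each side */
	unsigned long want = 0;

	memset(storage, 0xff, sizeof(storage)); /* padding must not leak in */
	for (size_t i = 0; i < len; i++) {
		storage[off + i] = (unsigned char)(i + 1);
		want += storage[off + i];
	}
	return sum_bytes_wordwise(storage + off, len) == want;
}
```

The point is only that the annotation sits on the one function that
over-reads, while every other READ_ONCE_NOCHECK() user is left alone.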

Arnd