Re: [PATCH 21/35] arm64: mte: Add in-kernel tag fault handler

From: Catalin Marinas
Date: Thu Aug 27 2020 - 10:56:58 EST


On Thu, Aug 27, 2020 at 03:34:42PM +0200, Andrey Konovalov wrote:
> On Thu, Aug 27, 2020 at 3:10 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > On Thu, Aug 27, 2020 at 02:31:23PM +0200, Andrey Konovalov wrote:
> > > On Thu, Aug 27, 2020 at 11:54 AM Catalin Marinas
> > > <catalin.marinas@xxxxxxx> wrote:
> > > > On Fri, Aug 14, 2020 at 07:27:03PM +0200, Andrey Konovalov wrote:
> > > > > +static int do_tag_recovery(unsigned long addr, unsigned int esr,
> > > > > + struct pt_regs *regs)
> > > > > +{
> > > > > + report_tag_fault(addr, esr, regs);
> > > > > +
> > > > > + /* Skip over the faulting instruction and continue: */
> > > > > + arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
> > > >
> > > > Ooooh, do we expect the kernel to still behave correctly after this? I
> > > > thought the recovery means disabling tag checking altogether and
> > > > restarting the instruction rather than skipping over it.
[...]
> > > Can we disable MTE, reexecute the instruction, and then reenable MTE,
> > > or something like that?
> >
> > If you want to preserve the MTE enabled, you could single-step the
> > instruction or execute it out of line, though it's a bit more convoluted
> > (we have a similar mechanism for kprobes/uprobes).
> >
> > Another option would be to attempt to set the matching tag in memory,
> > under the assumption that it is writable (if it's not, maybe it's fine
> > to panic). Not sure how this interacts with the slub allocator since,
> > presumably, the logical tag in the pointer is wrong rather than the
> > allocation one.
> >
> > Yet another option would be to change the tag in the register and
> > re-execute but this may confuse the compiler.
>
> Which one of these would be simpler to implement?

Either the second or the third option would be simpler (re-tag the memory
location or the pointer), with the caveats I mentioned. Also, does the slab allocator
need to touch the memory on free with a tagged pointer? Otherwise slab
may hit an MTE fault itself.

> Perhaps we could somehow only skip faulting instructions that happen
> in the KASAN test module?.. Decoding stack trace would be an option,
> but that's a bit weird.

If you want to restrict this to the KASAN tests, just add some
MTE-specific accessors with a fixup entry similar to get_user/put_user.
__do_kernel_fault() (if actually called) will invoke the fixup code,
which skips the access and returns an error. This way the KASAN tests can
actually verify that tag checking works, which I'd find a lot more
useful.

--
Catalin