Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled

From: Catalin Marinas
Date: Mon Mar 10 2025 - 14:40:52 EST


On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > this is incompatible with MTE which will fault on a partially invalid
> > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > bug for now by disabling the optimizations if the kernel is built
> > > > with HW tag-based KASAN and consider improvements for followup changes.
> > >
> > > Why is faulting on a partially invalid read a problem? It's still
> > > invalid, so ... it should fault, yes? What am I missing?
> >
> > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > beyond the end of string. The has_zero() function is then used to check
> > where the string ends. For this use, I think we can go with
> > load_unaligned_zeropad() which handles a potential fault and pads the
> > rest with zeroes.
>
> If we only care about synchronous and asymmetric modes, that should be
> possible, but that won't work in asynchronous mode. In asynchronous mode
> the fault will accumulate into TFSR and will be detected later
> asynchronously where it cannot be related to its source and fixed up.
>
> That means that both read_word_at_a_time() and load_unaligned_zeropad()
> are dodgy in async mode.

load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
indeed busted and I've had Vincenzo's patches for a couple of years
already; they just never made it to the list.
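
For anyone following along, here is a rough userspace model of the
value load_unaligned_zeropad() is meant to produce when part of the
8-byte read is inaccessible. It is only a sketch: model_load_zeropad()
and its 'valid' parameter are made up for illustration, and nothing
here touches PSTATE.TCO or the exception fixup that the real arm64
implementation relies on.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace model only. 'valid' is a hypothetical parameter standing in
 * for the number of bytes that can be read before the access would
 * fault; the kernel discovers this via the fault itself.
 */
static uint64_t model_load_zeropad(const void *addr, size_t valid)
{
	unsigned char bytes[sizeof(uint64_t)] = { 0 };
	uint64_t w;

	if (valid > sizeof(bytes))
		valid = sizeof(bytes);

	memcpy(bytes, addr, valid);	/* the bytes that were readable */
	memcpy(&w, bytes, sizeof(w));	/* everything after them stays zero */
	return w;
}
```

Because the inaccessible tail reads as zero, a has_zero()-style check
on the result still finds a terminator instead of the caller taking a
fault.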

> Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?

We could, though that was mostly for user-space faults, while in-kernel
we'd only need something similar if KASAN_HW_TAGS is enabled.

> ... and is there anything else that deliberately makes accesses that
> could straddle objects?

So far we've only come across load_unaligned_zeropad() and
read_word_at_a_time(). I'm not aware of anything else.
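
To illustrate why these helpers end up straddling objects, below is a
minimal userspace sketch of the word-at-a-time pattern discussed
earlier in the thread: load 8 bytes, then test for a zero byte. The
names sketch_has_zero() and sketch_strlen_wordwise() are hypothetical
and this is not the kernel code; it assumes the caller's buffer is
readable up to the next multiple of 8 bytes, which is exactly the
assumption MTE tag checks break.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ONES	0x0101010101010101ULL
#define SKETCH_HIGHS	0x8080808080808080ULL

/* Nonzero if and only if at least one byte of 'w' is zero. */
static uint64_t sketch_has_zero(uint64_t w)
{
	return (w - SKETCH_ONES) & ~w & SKETCH_HIGHS;
}

/*
 * Hypothetical helper: string length, scanning 8 bytes at a time.
 * Assumes 's' is readable up to the next multiple of 8 bytes.
 */
static size_t sketch_strlen_wordwise(const char *s)
{
	size_t n = 0;

	for (;;) {
		uint64_t w;

		/* Unaligned 8-byte load; may read bytes past the NUL. */
		memcpy(&w, s + n, sizeof(w));
		if (sketch_has_zero(w))
			break;
		n += sizeof(w);
	}

	/* Locate the terminator inside the final word byte by byte. */
	while (s[n])
		n++;
	return n;
}
```

The final 8-byte load is where the trouble comes from: for a short
string it covers bytes after the terminator, and those may belong to a
neighbouring object with a different tag.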

--
Catalin