Re: [PATCH] string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled

From: Mark Rutland
Date: Mon Mar 10 2025 - 15:38:00 EST


On Mon, Mar 10, 2025 at 06:40:11PM +0000, Catalin Marinas wrote:
> On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> > On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > > this is incompatible with MTE which will fault on a partially invalid
> > > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > > bug for now by disabling the optimizations if the kernel is built
> > > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > >
> > > > Why is faulting on a partially invalid read a problem? It's still
> > > > invalid, so ... it should fault, yes? What am I missing?
> > >
> > > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > > beyond the end of string. The has_zero() function is then used to check
> > > where the string ends. For this uses, I think we can go with
> > > load_unaligned_zeropad() which handles a potential fault and pads the
> > > rest with zeroes.
> >
> > If we only care about synchronous and asymmetric modes, that should be
> > possible, but that won't work in asynchronous mode. In asynchronous mode
> > the fault will accumulate into TFSR and will be detected later
> > asynchronously where it cannot be related to its source and fixed up.
> >
> > That means that both read_word_at_a_time() and load_unaligned_zeropad()
> > are dodgy in async mode.
>
> load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
> PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
> indeed busted and I've had Vincezo's patches for a couple of years
> already, they just never made it to the list.

Sorry, I missed the __mte_{enable,disable}_tco_async() calls. So long as
we're happy to omit the check in that case, that's fine.

I was worried that ex_handler_load_unaligned_zeropad() might not do the
right thing in response to a tag check fault (e.g. access the wrong 8
bytes), but it looks as though that's ok due to the way it generates the
offset and the aligned pointer.

If load_unaligned_zeropad() is handed a string that starts with an
unexpected tag (and even if that starts off aligned),
ex_handler_load_unaligned_zeropad() will access that and cause another
tag check fault, which will be reported.

Mark.