Re: [PATCH v3 26/39] arm64: mte: Add in-kernel tag fault handler

From: Catalin Marinas
Date: Fri Sep 25 2020 - 07:47:11 EST


On Fri, Sep 25, 2020 at 01:26:02PM +0200, Andrey Konovalov wrote:
> On Fri, Sep 25, 2020 at 12:49 PM Catalin Marinas
> <catalin.marinas@xxxxxxx> wrote:
> > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > index a3bd189602df..d110f382dacf 100644
> > > --- a/arch/arm64/mm/fault.c
> > > +++ b/arch/arm64/mm/fault.c
> > > @@ -33,6 +33,7 @@
> > > #include <asm/debug-monitors.h>
> > > #include <asm/esr.h>
> > > #include <asm/kprobes.h>
> > > +#include <asm/mte.h>
> > > #include <asm/processor.h>
> > > #include <asm/sysreg.h>
> > > #include <asm/system_misc.h>
> > > @@ -294,6 +295,11 @@ static void die_kernel_fault(const char *msg, unsigned long addr,
> > > do_exit(SIGKILL);
> > > }
> > >
> > > +static void report_tag_fault(unsigned long addr, unsigned int esr,
> > > + struct pt_regs *regs)
> > > +{
> > > +}
> >
> > Do we need to introduce report_tag_fault() in this patch? It's fine but
> > add a note in the commit log that it will be populated in a subsequent
> > patch.
>
> I did, see the last line of the commit description.

Sorry, I missed that.

> > > +
> > > static void __do_kernel_fault(unsigned long addr, unsigned int esr,
> > > struct pt_regs *regs)
> > > {
> > > @@ -641,10 +647,40 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
> > > return 0;
> > > }
> > >
> > > +static void do_tag_recovery(unsigned long addr, unsigned int esr,
> > > + struct pt_regs *regs)
> > > +{
> > > + static bool reported = false;
> > > +
> > > + if (!READ_ONCE(reported)) {
> > > + report_tag_fault(addr, esr, regs);
> > > + WRITE_ONCE(reported, true);
> > > + }
> >
> > I don't mind the READ_ONCE/WRITE_ONCE here but not sure what they help
> > with.
>
> The fault can happen on multiple cores at the same time, right? In
> that case without READ/WRITE_ONCE() we'll have a data-race here.

READ/WRITE_ONCE won't magically solve such races. If two CPUs enter
simultaneously in do_tag_recovery(), they'd both read 'reported' as
false and both print the fault info.

If you really care about this race, you need to atomically both read and
update the variable with an xchg() or cmpxchg().

--
Catalin