Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other readers are possible

From: Geert Uytterhoeven
Date: Tue Sep 12 2023 - 04:35:21 EST


Hi Paul,

On Tue, Sep 12, 2023 at 10:30 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> On Tue, Sep 12, 2023 at 10:23:37AM +0200, Geert Uytterhoeven wrote:
> > On Tue, Sep 12, 2023 at 10:14 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > > On Mon, Sep 11, 2023 at 07:54:52PM -0400, Liam R. Howlett wrote:
> > > > * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 14:03]:
> > > > > On Wed, Sep 06, 2023 at 01:29:54PM -0400, Liam R. Howlett wrote:
> > > > > > * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 13:24]:
> > > > > > > On Wed, Sep 06, 2023 at 11:23:25AM -0400, Liam R. Howlett wrote:
> > > > > > > > (Adding Paul & Shanker to Cc list.. please see below for why)
> > > > > > > >
> > > > > > > > Apologies on the late response, I was away and have been struggling to
> > > > > > > > get a working PPC32 test environment.
> > > > > > > >
> > > > > > > > * Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> [230829 12:42]:
> > > > > > > > > Hi Liam,
> > > > > > > > >
> > > > > > > > > On Fri, 18 Aug 2023, Liam R. Howlett wrote:
> > > > > > > > > > The current implementation of append may cause duplicate data and/or
> > > > > > > > > > incorrect ranges to be returned to a reader during an update. Although
> > > > > > > > > > this has not been reported or seen, disable the append write operation
> > > > > > > > > > while the tree is in rcu mode out of an abundance of caution.
> > > > > > > >
> > > > > > > > ...
> > > > > > > > > >
> > > >
> > > > ...
> > > >
> > > > > > > > > RCU-related configs:
> > > > > > > > >
> > > > > > > > > $ grep RCU .config
> > > > > > > > > # RCU Subsystem
> > > > > > > > > CONFIG_TINY_RCU=y
> > >
> > > I must have been asleep last time I looked at this. I was looking at
> > > Tree RCU. Please accept my apologies for my lapse. :-/
> > >
> > > However, Tiny RCU's call_rcu() also avoids enabling IRQs, so I would
> > > have said the same thing, albeit after looking at a lot less RCU code.
> > >
> > > TL;DR:
> > >
> > > 1.      Try making the __setup_irq() function's call to mutex_lock()
> > >         instead be as follows:
> > >
> > >                 if (!mutex_trylock(&desc->request_mutex))
> > >                         mutex_lock(&desc->request_mutex);
> > >
> > >         This might fail if __setup_irq() has other dependencies on a
> > >         fully operational scheduler.
> > >
> > > 2.      Move that ppc32 call to __setup_irq() much later, most definitely
> > >         after interrupts have been enabled and the scheduler is fully
> > >         operational.  Invoking mutex_lock() before that time is not a
> > >         good idea.  ;-)
> >
> > There is no call to __setup_irq() from arch/powerpc/?
>
> Glad it is not just me, given that I didn't see a direct call, either. So
> later in this email, I asked Liam to put a WARN_ON_ONCE(irqs_disabled())
> just before that mutex_lock() in __setup_irq().
>
> Either way, invoking mutex_lock() early in boot before interrupts have
> been enabled is a bad idea. ;-)

I'll add that WARN_ON_ONCE() too, and will report back later today...
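
For anyone following along, here is a rough user-space sketch of what such
a check does. It is only an illustration, not the kernel code:
warn_on_once(), fake_irqs_disabled, warn_count and setup_irq_sketch() are
hypothetical stand-ins for WARN_ON_ONCE(), irqs_disabled() and
__setup_irq(). The point is that the warning fires on the first offending
call only, so an early-boot caller is reported without flooding the log.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's irqs_disabled(). */
static bool fake_irqs_disabled = true;

/* Counts how many warnings were actually printed. */
static int warn_count;

/*
 * Simplified imitation of WARN_ON_ONCE(): report the condition the
 * first time it is true, stay silent afterwards.  The caller supplies
 * the per-call-site "already warned" flag.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING (once): %s\n", what);
	}
	return cond;
}

/* Hypothetical stand-in for the locking step in __setup_irq(). */
static void setup_irq_sketch(void)
{
	static bool warned;

	warn_on_once(fake_irqs_disabled, &warned,
		     "about to take a mutex with IRQs disabled");
	/* The real code would call mutex_lock(&desc->request_mutex) here. */
}
```

Calling setup_irq_sketch() repeatedly while fake_irqs_disabled is true
prints the warning once; in the kernel, the resulting backtrace would
identify the early caller.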

> > Note that there are (possibly different) issues seen on ppc32 and on arm32
> > (Renesas RZ/A in particular, but not on other Renesas ARM systems).
> >
> > I saw an issue on arm32 with cfeb6ae8bcb96ccf, but not with cfeb6ae8bcb96ccf^.
> > Other people saw an issue on ppc32 with both cfeb6ae8bcb96ccf and
> > cfeb6ae8bcb96ccf^.
>
> I look forward to hearing what the issue is in both cases.

For RZ/A, my problem report is
https://lore.kernel.org/all/3f86d58e-7f36-c6b4-c43a-2a7bcffd3bd@xxxxxxxxxxxxxx/

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds