Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other readers are possible

From: Geert Uytterhoeven
Date: Wed Sep 13 2023 - 09:15:14 EST


Hi Paul,

On Tue, Sep 12, 2023 at 12:00 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> On Tue, Sep 12, 2023 at 10:34:44AM +0200, Geert Uytterhoeven wrote:
> > On Tue, Sep 12, 2023 at 10:30 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > > On Tue, Sep 12, 2023 at 10:23:37AM +0200, Geert Uytterhoeven wrote:
> > > > On Tue, Sep 12, 2023 at 10:14 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > > > > On Mon, Sep 11, 2023 at 07:54:52PM -0400, Liam R. Howlett wrote:
> > > > > > * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 14:03]:
> > > > > > > On Wed, Sep 06, 2023 at 01:29:54PM -0400, Liam R. Howlett wrote:
> > > > > > > > * Paul E. McKenney <paulmck@xxxxxxxxxx> [230906 13:24]:
> > > > > > > > > On Wed, Sep 06, 2023 at 11:23:25AM -0400, Liam R. Howlett wrote:
> > > > > > > > > > (Adding Paul & Shanker to Cc list.. please see below for why)
> > > > > > > > > >
> > > > > > > > > > Apologies on the late response, I was away and have been struggling to
> > > > > > > > > > get a working PPC32 test environment.
> > > > > > > > > >
> > > > > > > > > > * Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> [230829 12:42]:
> > > > > > > > > > > Hi Liam,
> > > > > > > > > > >
> > > > > > > > > > > On Fri, 18 Aug 2023, Liam R. Howlett wrote:
> > > > > > > > > > > > The current implementation of append may cause duplicate data and/or
> > > > > > > > > > > > incorrect ranges to be returned to a reader during an update. Although
> > > > > > > > > > > > this has not been reported or seen, disable the append write operation
> > > > > > > > > > > > while the tree is in rcu mode out of an abundance of caution.
> > > > > > > > > >
> > > > > > > > > > ...
> > > > > > > > > > > >
> > > > > >
> > > > > > ...
> > > > > >
> > > > > > > > > > > RCU-related configs:
> > > > > > > > > > >
> > > > > > > > > > > $ grep RCU .config
> > > > > > > > > > > # RCU Subsystem
> > > > > > > > > > > CONFIG_TINY_RCU=y
> > > > >
> > > > > I must have been asleep last time I looked at this. I was looking at
> > > > > Tree RCU. Please accept my apologies for my lapse. :-/
> > > > >
> > > > > However, Tiny RCU's call_rcu() also avoids enabling IRQs, so I would
> > > > > have said the same thing, albeit after looking at a lot less RCU code.
> > > > >
> > > > > TL;DR:
> > > > >
> > > > > 1. Try making the __setup_irq() function's call to mutex_lock()
> > > > > instead be as follows:
> > > > >
> > > > > if (!mutex_trylock(&desc->request_mutex))
> > > > > mutex_lock(&desc->request_mutex);
> > > > >
> > > > > This might fail if __setup_irq() has other dependencies on a
> > > > > fully operational scheduler.
> > > > >
> > > > > 2. Move that ppc32 call to __setup_irq() much later, most definitely
> > > > > after interrupts have been enabled and the scheduler is fully
> > > > > operational. Invoking mutex_lock() before that time is not a
> > > > > good idea. ;-)
> > > >
> > > > There is no call to __setup_irq() from arch/powerpc/?
> > >
> > > Glad it is not just me, given that I didn't see a direct call, either. So
> > > later in this email, I asked Liam to put a WARN_ON_ONCE(irqs_disabled())
> > > just before that mutex_lock() in __setup_irq().
> > >
> > > Either way, invoking mutex_lock() early in boot before interrupts have
> > > been enabled is a bad idea. ;-)
> >
> > I'll add that WARN_ON_ONCE() too, and will report back later today...
>
> Thank you, looking forward to hearing the outcome!

On a typical arm32 system with global or architectured timer, I get:

WARNING: CPU: 0 PID: 0 at kernel/irq/manage.c:1587 __setup_irq+0xcc/0x6d0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted
6.6.0-rc1-shmobile-02354-g24e058b77f5a #1655
Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x40/0x4c
dump_stack_lvl from __warn+0x78/0x10c
__warn from warn_slowpath_fmt+0x90/0x11c
warn_slowpath_fmt from __setup_irq+0xcc/0x6d0
__setup_irq from __request_percpu_irq+0xb8/0xd0
__request_percpu_irq from set_smp_ipi_range+0x88/0xdc
set_smp_ipi_range from gic_of_init+0x1a4/0x4d8
gic_of_init from of_irq_init+0x1f0/0x320
of_irq_init from init_IRQ+0x74/0x104
init_IRQ from start_kernel+0x360/0x5d0
start_kernel from 0x0

Likewise on arm64 with architectured timer.
But on these systems I do not see the issue I reported.

On RISC-V:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/irq/manage.c:1587 __setup_irq+0x4e6/0x5ee
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc1-02355-g63165363c6a3 #63
Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
epc : __setup_irq+0x4e6/0x5ee
ra : __setup_irq+0x38/0x5ee
epc : ffffffff80049472 ra : ffffffff80048fc4 sp : ffffffff81203cd0
gp : ffffffff812ee760 tp : ffffffff8120d5c0 t0 : ffffffd801854300
t1 : 0000000000000000 t2 : ffffffff81000ad8 s0 : ffffffff81203d20
s1 : ffffffd801855000 a0 : 0000000000000001 a1 : ffffffd801855000
a2 : ffffffd801854280 a3 : 0000000000000001 a4 : 0000000000000000
a5 : 0000000000000000 a6 : ffffffd801852108 a7 : ffffffd801852100
s2 : ffffffd801854280 s3 : 0000000000000005 s4 : ffffffff812c54c0
s5 : 0000000000000005 s6 : ffffffff80dd83a0 s7 : ffffffff805c0cc0
s8 : ffffffd801855018 s9 : 0000000000000000 s10: 0000000000000000
s11: 000000007bf638a0 t3 : 0000000000000000 t4 : 0000000000000002
t5 : ffffffff812882a0 t6 : 0000000000000001
status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff80049472>] __setup_irq+0x4e6/0x5ee
[<ffffffff800497a8>] __request_percpu_irq+0x98/0xcc
[<ffffffff8082501e>] riscv_timer_init_dt+0x186/0x22e
[<ffffffff80824b62>] timer_probe+0x62/0xd2
[<ffffffff80803c36>] time_init+0x86/0xa6
[<ffffffff80800ae2>] start_kernel+0x436/0x618
---[ end trace 0000000000000000 ]---

Also, no issue here.

On the affected systems (RZ/A1 and RZ/A2), the WARN_ON_ONCE() did
not trigger, until I applied Liam's patch ("init/main: Clear boot task
idle flag"), which got rid of the "Interrupts were enabled early" warning,
and now tells me, as expected:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/irq/manage.c:1587 __setup_irq+0xc8/0x654
CPU: 0 PID: 0 Comm: swapper Not tainted
6.6.0-rc1-rskrza1-02357-g237e09fd64b8-dirty #829
Hardware name: Generic R7S72100 (Flattened Device Tree)
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x24/0x3c
dump_stack_lvl from __warn+0x74/0xb8
__warn from warn_slowpath_fmt+0x78/0xb0
warn_slowpath_fmt from __setup_irq+0xc8/0x654
__setup_irq from request_threaded_irq+0xac/0x13c
request_threaded_irq from timer_of_init+0x238/0x2c8
timer_of_init from ostm_init+0x98/0x208
ostm_init from timer_probe+0x90/0xe4
timer_probe from start_kernel+0x2c0/0x488
start_kernel from 0x0
---[ end trace 0000000000000000 ]---

However, Liam's patch causes lots of warnings on the other systems...

> > > > Note that there are (possibly different) issues seen on ppc32 and on arm32
> > > > (Renesas RZ/A in particular, but not on other Renesas ARM systems).
> > > >
> > > > I saw an issue on arm32 with cfeb6ae8bcb96ccf, but not with cfeb6ae8bcb96ccf^.
> > > > Other people saw an issue on ppc32 with both cfeb6ae8bcb96ccf and
> > > > cfeb6ae8bcb96ccf^.
> > >
> > > I look forward to hearing what is the issue in both cases.
> >
> > For RZ/A, my problem report is
> > https://lore.kernel.org/all/3f86d58e-7f36-c6b4-c43a-2a7bcffd3bd@xxxxxxxxxxxxxx/
>
> Thank you, Geert!
>
> Huh. Is that patch you reverted causing Maple Tree or related code
> to attempt to acquire mutexes in early boot before interrupts have
> been enabled?
>
> If that added WARN_ON_ONCE() doesn't trigger early, another approach
> would be to put it at the beginning of mutex_lock(). Or for that matter
> at the beginning of might_sleep().

With the WARN_ON_ONCE() moved from __setup_irq() to mutex_lock(),
it does trigger on RZ/A1:

Dentry cache hash table entries: 4096 (order: 2, 16384 bytes, linear)
Inode-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
Built 1 zonelists, mobility grouping off. Total pages: 8128
+------------[ cut here ]------------
+WARNING: CPU: 0 PID: 0 at kernel/locking/mutex.c:283 mutex_lock+0x3c/0x68
+CPU: 0 PID: 0 Comm: swapper Not tainted
6.6.0-rc1-rskrza1-02360-gd762084b1737-dirty #831
+Hardware name: Generic R7S72100 (Flattened Device Tree)
+ unwind_backtrace from show_stack+0x10/0x14
+ show_stack from dump_stack_lvl+0x24/0x3c
+ dump_stack_lvl from __warn+0x74/0xb8
+ __warn from warn_slowpath_fmt+0x78/0xb0
+ warn_slowpath_fmt from mutex_lock+0x3c/0x68
+ mutex_lock from __cpuhp_setup_state_cpuslocked+0x44/0x1c0
+ __cpuhp_setup_state_cpuslocked from page_alloc_init_cpuhp+0x28/0x64
+ page_alloc_init_cpuhp from mm_core_init+0x18/0x2a4
+ mm_core_init from start_kernel+0x250/0x47c
+ start_kernel from 0x0
+---[ end trace 0000000000000000 ]---
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 22876K/32768K available (5120K kernel code, 951K rwdata,
1208K rodata, 1024K init, 321K bss, 9892K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-1, MinObjects=0, CPUs=1, Nodes=1

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds