Re: [PATCH v2] memcg: simple cleanup of stats update functions

From: Vlastimil Babka (SUSE)
Date: Tue May 28 2024 - 04:13:10 EST


On 5/28/24 9:56 AM, Sebastian Andrzej Siewior wrote:
> On 2024-05-27 22:16:41 [-0700], Shakeel Butt wrote:
>> On Mon, May 27, 2024 at 06:34:24PM GMT, Vlastimil Babka (SUSE) wrote:
>> > On 5/27/24 5:22 PM, Sebastian Andrzej Siewior wrote:
>> > > On 2024-04-20 16:25:05 [-0700], Shakeel Butt wrote:
>> > >> mod_memcg_lruvec_state() is never called from outside of memcontrol.c
>> > >> and always with irq disabled. So, replace it with the irq disabled
>> > >> version and add an assert that irq is disabled in the caller.
>> > >
>> > > unless PREEMPT_RT is enabled. In that case IRQs are not disabled as part
>> > > of local_lock_irqsave(&memcg_stock.stock_lock, …) leading to:
>>
>> Sorry about that and thanks for the report.
>
> no worries.
>
>> >
>> > But then the "interrupts are handled by a kernel thread that can sleep" part
>> > of RT also means it's ok to just have the stock_lock taken with no
>> > interrupts disabled as no actual raw interrupt handler will interrupt the
>> > holder and deadlock, right?
>
> I *don't* know why the interrupts-disabled check is here. The
> memcg_stock.stock_lock is acquired on RT with interrupts enabled and
> never disables interrupts. The lock is never acquired in a hard
> interrupt (not threaded interrupt) context so there is never a deadlock.
>
> Originally the interrupts were disabled in mod_memcg_lruvec_state()
> because the counter it operates on is per-CPU and relies on disabled
> interrupts: the operation is not atomic and the code can run in
> interrupt context (on !RT). The __mod_memcg_lruvec_state() variant
> of it relied on interrupts being disabled by the caller. This "rely on"
> was part of a spinlock_t lock (or invoked from an interrupt handler, the
> memory is fading slowly away) which does not disable interrupts on
> PREEMPT_RT.
> So for that reason we ended up with __memcg_stats_lock() which disables
> preemption only on PREEMPT_RT to achieve the same level of "atomic"
> update.
>
>> Thanks Vlastimil for jolting my memory on RT reasoning.
>>
>> > > suggestions?
>> >
>> > So in that case the appropriate thing would be to replace the assert with
>> > lockdep_assert_held(&memcg_stock.stock_lock);
>> > ?
>> >
>> > It seems all the code paths leading here have that one.
>> >
>>
>> Yeah this seems right and reasonable. Should I send a fix or you want to
>> send it?
>
> I don't mind sending a patch. I'm just not sure if the lock is the right
> thing to do. However it should ensure that interrupts are disabled on
> !RT for the sake of the counter update (if observed in IRQ context).

Looks like some places there use VM_WARN_ON_IRQS_ENABLED(), which is turned
off for PREEMPT_RT, so maybe that's what should replace the current
lockdep_assert, perhaps together with
lockdep_assert_held(this_cpu_ptr(&memcg_stock.stock_lock));

But also __mod_memcg_lruvec_state() already has that VM_WARN_ON.
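
To make the combination above concrete, here is a rough userspace model, not
kernel code. Every model_* name is made up for illustration, and the RT switch
is an ordinary compile-time define. It only mirrors what was said in this
thread: taking the lock disables interrupts on !RT but not on RT, the
IRQs-enabled warning is turned off on RT, and the lock-held check applies in
both configurations.

```c
/* Userspace model only; every model_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Build with -DMODEL_PREEMPT_RT to model a PREEMPT_RT kernel. */
#ifdef MODEL_PREEMPT_RT
#define MODEL_RT 1
#else
#define MODEL_RT 0
#endif

static bool model_irqs_disabled;   /* stand-in for the CPU's IRQ state */
static bool model_stock_lock_held; /* stand-in for lockdep's "held" state */
static int model_warnings;         /* number of checks that fired */
static long model_counter;         /* stand-in for the per-CPU counter */

/* Models taking stock_lock: IRQs are disabled only on !RT. */
static void model_lock(void)
{
	model_stock_lock_held = true;
	if (!MODEL_RT)
		model_irqs_disabled = true;
}

/* Models releasing it; assumes IRQs were enabled before model_lock(). */
static void model_unlock(void)
{
	assert(model_stock_lock_held);
	model_stock_lock_held = false;
	if (!MODEL_RT)
		model_irqs_disabled = false;
}

/* Models an IRQs-enabled warning that is turned off on RT. */
static void model_warn_on_irqs_enabled(void)
{
	if (!MODEL_RT && !model_irqs_disabled)
		model_warnings++;
}

/* Models a lock-held assertion, active on both RT and !RT. */
static void model_assert_held(void)
{
	if (!model_stock_lock_held)
		model_warnings++;
}

/* Models the stats update: both checks, then the non-atomic add. */
static void model_mod_state(long val)
{
	model_warn_on_irqs_enabled();
	model_assert_held();
	model_counter += val;
}
```

In this model a call made between model_lock() and model_unlock() passes both
checks in either configuration, while a call made without the lock trips the
lock-held check everywhere and the IRQ check only on !RT.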

> Yeah, let me prepare something.
>
> Sebastian