Re: [PATCH] cgroup: use irqsave in cgroup_rstat_flush_locked()

From: Sebastian Andrzej Siewior
Date: Tue Jul 03 2018 - 17:35:46 EST


On 2018-07-03 13:24:24 [-0700], Tejun Heo wrote:
> (cc'ing Peter and Ingo for lockdep)
>
> Hello, Sebastian.
Hi Tejun,

> On Tue, Jul 03, 2018 at 06:45:44PM +0200, Sebastian Andrzej Siewior wrote:
> > All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock
> > either with spin_lock_irq() or spin_lock_irqsave().
>
> So, irq is always disabled in cgroup_rstat_flush_locked().

Only on non-RT kernels. On RT-enabled kernels spin_lock_irq.*() is
turned into a sleeping lock which does not disable interrupts.

> > cgroup_rstat_flush_locked() itself acquires cgroup_rstat_cpu_lock which
> > is a raw_spin_lock. This lock is also acquired in cgroup_rstat_updated()
> > in IRQ context and therefore requires _irqsave() locking suffix in
> > cgroup_rstat_flush_locked().
>
> Yes, the cpu locks should be irqsafe too; however, as irq is always
> disabled in that function, save/restore is redundant, no?

As I pointed out above, only the _irq variants on a raw_spinlock_t
really disable interrupts on -RT. That is the difference between the two
lock types.

> > Since there is no difference between spin_lock_t and raw_spin_lock_t
> > on !RT lockdep does not complain here. On RT lockdep complains because
> > the interrupts were not disabled here and a deadlock is possible.
>
> We at least used to do this in the kernel - manipulating irqsafe locks
> with spin_lock/unlock() if the irq state is known, whether enabled or
> disabled, and ISTR lockdep being smart enough to track actual irq
> state to determine irq safety. Am I misremembering or is this
> different on RT kernels?

No, this is correct. On !RT kernels spin_lock_irq() disables interrupts,
so raw_spin_lock() is taken with interrupts already disabled and
everything is good. On RT kernels spin_lock_irq() does not disable
interrupts, so raw_spin_lock() acquires the lock with interrupts enabled
and lockdep rightly complains.
Lockdep sees the hardirq path via:

{IN-HARDIRQ-W} state was registered at:
  lock_acquire+0x9e/0x250
  _raw_spin_lock_irqsave+0x38/0x50
  cgroup_rstat_updated+0x57/0x100
  cgroup_base_stat_cputime_account_end.isra.6+0x17/0x60
  __cgroup_account_cputime_field+0x49/0x60
  account_system_index_time+0xdb/0x1f0
  account_system_time+0x3f/0x70
  account_process_tick+0x59/0x80
  update_process_times+0x1d/0x50
  tick_sched_handle+0x20/0x60
  tick_sched_timer+0x37/0x80
  __hrtimer_run_queues+0x12c/0x6d0
  hrtimer_interrupt+0xed/0x240
  smp_apic_timer_interrupt+0x89/0x3c0
  apic_timer_interrupt+0xf/0x20
  pin_current_cpu+0xa/0x120
  migrate_disable+0x9a/0x200
  rt_spin_lock+0x1d/0x60
  put_unused_fd+0x2c/0x50
  do_sys_open+0x23a/0x250
  __x64_sys_openat+0x1b/0x20
  do_syscall_64+0x50/0x190
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
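
To make the difference concrete, here is a toy userspace model of the
flush path. It is only a sketch and not kernel code: struct cpu_model,
the model_*() helpers and flush_is_unsafe() are made-up names, and one
bool stands in for the hardirq state of a single CPU. It only shows why
the inner raw lock is fine on !RT, exposed on RT, and fine again with
the irqsave variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): one CPU, one irq flag. */
struct cpu_model {
	bool rt;		/* true models an RT kernel */
	bool irqs_enabled;	/* modelled hardirq state of this CPU */
};

/* Models spin_lock_irq() on a spinlock_t. */
static void model_spin_lock_irq(struct cpu_model *cpu)
{
	if (!cpu->rt)
		cpu->irqs_enabled = false;	/* !RT: irqs really go off */
	/* RT: sleeping lock, the hardirq state is left untouched */
}

/*
 * Models raw_spin_lock(): the irq state is not changed. Returns true if
 * a hardirq could still interrupt the lock holder on this CPU.
 */
static bool model_raw_spin_lock(const struct cpu_model *cpu)
{
	return cpu->irqs_enabled;
}

/* Models raw_spin_lock_irqsave(): save the state, then disable irqs. */
static bool model_raw_spin_lock_irqsave(struct cpu_model *cpu, bool *flags)
{
	*flags = cpu->irqs_enabled;
	cpu->irqs_enabled = false;
	return cpu->irqs_enabled;	/* always false: holder is safe */
}

/* Models raw_spin_unlock_irqrestore(): put the saved state back. */
static void model_raw_spin_unlock_irqrestore(struct cpu_model *cpu, bool flags)
{
	assert(!cpu->irqs_enabled);	/* irqs were off while holding it */
	cpu->irqs_enabled = flags;
}

/*
 * Models the flush path: outer spin_lock_irq(), then the inner raw lock.
 * Returns true if the raw lock is held with interrupts enabled, i.e. the
 * hardirq path could try to take the same lock and deadlock.
 */
static bool flush_is_unsafe(bool rt, bool use_irqsave)
{
	struct cpu_model cpu = { .rt = rt, .irqs_enabled = true };
	bool flags, unsafe;

	model_spin_lock_irq(&cpu);	/* stands in for cgroup_rstat_lock */
	if (use_irqsave) {
		unsafe = model_raw_spin_lock_irqsave(&cpu, &flags);
		model_raw_spin_unlock_irqrestore(&cpu, flags);
	} else {
		unsafe = model_raw_spin_lock(&cpu);
	}
	return unsafe;
}
```

In this model flush_is_unsafe(false, false) is false, which is the !RT
case where the plain raw_spin_lock() is fine. flush_is_unsafe(true,
false) is true, which is the case lockdep reports on RT, and
flush_is_unsafe(true, true) is false again.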

> Thanks.

Sebastian