Re: [PATCH v2] kmemleak: Turn kmemleak_lock to raw spinlock on RT

From: Sebastian Andrzej Siewior
Date: Fri Nov 23 2018 - 04:53:26 EST


On 2018-11-22 17:04:19 [+0800], zhe.he@xxxxxxxxxxxxx wrote:
> From: He Zhe <zhe.he@xxxxxxxxxxxxx>
>
> kmemleak_lock, as a rwlock on RT, can possibly be held in atomic context and
> causes the following BUG.
>
> BUG: scheduling while atomic: migration/15/132/0x00000002
…
> Preemption disabled at:
> [<ffffffff8c927c11>] cpu_stopper_thread+0x71/0x100
> CPU: 15 PID: 132 Comm: migration/15 Not tainted 4.19.0-rt1-preempt-rt #1
> Hardware name: Intel Corp. Harcuvar/Server, BIOS HAVLCRB1.X64.0015.D62.1708310404 08/31/2017
> Call Trace:
> dump_stack+0x4f/0x6a
> ? cpu_stopper_thread+0x71/0x100
> __schedule_bug.cold.16+0x38/0x55
> __schedule+0x484/0x6c0
> schedule+0x3d/0xe0
> rt_spin_lock_slowlock_locked+0x118/0x2a0
> rt_spin_lock_slowlock+0x57/0x90
> __rt_spin_lock+0x26/0x30
> __write_rt_lock+0x23/0x1a0
> ? intel_pmu_cpu_dying+0x67/0x70
> rt_write_lock+0x2a/0x30
> find_and_remove_object+0x1e/0x80
> delete_object_full+0x10/0x20
> kmemleak_free+0x32/0x50
> kfree+0x104/0x1f0
> ? x86_pmu_starting_cpu+0x30/0x30
> intel_pmu_cpu_dying+0x67/0x70
> x86_pmu_dying_cpu+0x1a/0x30
> cpuhp_invoke_callback+0x92/0x700
> take_cpu_down+0x70/0xa0
> multi_cpu_stop+0x62/0xc0
> ? cpu_stop_queue_work+0x130/0x130
> cpu_stopper_thread+0x79/0x100
> smpboot_thread_fn+0x20f/0x2d0
> kthread+0x121/0x140
> ? sort_range+0x30/0x30
> ? kthread_park+0x90/0x90
> ret_from_fork+0x35/0x40

Is this the only problem? kfree() from a preempt-disabled section
should trigger a warning even without kmemleak.

> And on v4.18 stable tree the following call trace, caused by grabbing
> kmemleak_lock again, is also observed.
>
> kernel BUG at kernel/locking/rtmutex.c:1048!
> invalid opcode: 0000 [#1] PREEMPT SMP PTI
> CPU: 5 PID: 689 Comm: mkfs.ext4 Not tainted 4.18.16-rt9-preempt-rt #1
…
> Call Trace:
> ? preempt_count_add+0x74/0xc0
> rt_spin_lock_slowlock+0x57/0x90
> ? __kernel_text_address+0x12/0x40
> ? __save_stack_trace+0x75/0x100
> __rt_spin_lock+0x26/0x30
> __write_rt_lock+0x23/0x1a0
> rt_write_lock+0x2a/0x30
> create_object+0x17d/0x2b0
…

Is this an RT-only problem? Mainline should not allow read->read
locking or read->write locking for reader-writer locks. If this happens
only on v4.18 and not on v4.19, then something must have fixed it.


Sebastian