Re: [PATCH] mm/damon/tests/vaddr-kunit: don't use mas_lock for MM_MT_FLAGS-initialized maple tree

From: Guenter Roeck
Date: Wed Sep 04 2024 - 15:56:45 EST


On 9/4/24 12:26, Liam R. Howlett wrote:
* Guenter Roeck <linux@xxxxxxxxxxxx> [240904 00:27]:
On 9/3/24 20:36, Liam R. Howlett wrote:
* Guenter Roeck <linux@xxxxxxxxxxxx> [240903 22:38]:
On 9/3/24 19:31, Liam R. Howlett wrote:
* SeongJae Park <sj@xxxxxxxxxx> [240903 21:18]:
On Tue, 3 Sep 2024 17:58:15 -0700 SeongJae Park <sj@xxxxxxxxxx> wrote:

On Tue, 3 Sep 2024 20:48:53 -0400 "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx> wrote:

* SeongJae Park <sj@xxxxxxxxxx> [240903 20:45]:
damon_test_three_regions_in_vmas() initializes a maple tree with
MM_MT_FLAGS. The flags contain MT_FLAGS_LOCK_EXTERN, which means
mt_lock of the maple tree will not be used. And therefore the maple
tree initialization code skips initialization of the mt_lock. However,
__link_vmas(), which adds vmas for test to the maple tree, uses the
mt_lock. In other words, the uninitialized spinlock is used. The
problem becomes clear when spinlock debugging is turned on, since it
reports a spinlock bad magic bug. Fix the issue by not using the mt_lock
as promised.

You can't do this, lockdep will tell you this is wrong.

Hmm, but lockdep was silent on my setup?

We need a lock and to use the lock for writes.

This code is executed by single-threaded test code. Do we still need the lock?


I'd suggest using different flags so the spinlock is used.

The reporter mentioned that simply dropping MT_FLAGS_LOCK_EXTERN from the flags
causes a suspicious RCU usage message. May I ask if you have a suggestion for
better flags?

That would be the lockdep complaining, so that's good.


I was actually thinking of replacing mt_init_flags() with mt_init(), which is
the same as mt_init_flags() with zero flags, like below.

Yes. This will use the spinlock which should fix your issue, but it
will use a different style of maple tree.

Perhaps use MT_FLAGS_ALLOC_RANGE to use the same type of maple tree. If
you ever add threading, you will want the RCU flag as well
(MT_FLAGS_USE_RCU).

I would recommend those two and just use the spinlock.
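
To make the flag discussion above concrete, here is a minimal userspace
model (not kernel code) of how the lock setup depends on the init flags.
The flag values and the external-lock check are written from memory of
include/linux/maple_tree.h and should be treated as assumptions; every
model_* name is hypothetical and exists only for this sketch.

```c
#include <stdbool.h>

/* Flag values assumed to mirror include/linux/maple_tree.h. */
#define MT_FLAGS_ALLOC_RANGE	0x01
#define MT_FLAGS_USE_RCU	0x02
#define MT_FLAGS_LOCK_MASK	0x300
#define MT_FLAGS_LOCK_EXTERN	0x300
#define MM_MT_FLAGS	(MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN | \
			 MT_FLAGS_USE_RCU)

/* Hypothetical stand-in for struct maple_tree. */
struct model_tree {
	unsigned int ma_flags;
	bool lock_initialized;	/* models "spin_lock_init() has run" */
};

/* True when the caller promised to provide its own lock. */
static bool model_external_lock(const struct model_tree *mt)
{
	return (mt->ma_flags & MT_FLAGS_LOCK_MASK) == MT_FLAGS_LOCK_EXTERN;
}

/* Models the init step: the internal lock is set up only when needed. */
static void model_init_flags(struct model_tree *mt, unsigned int flags)
{
	mt->ma_flags = flags;
	mt->lock_initialized = !model_external_lock(mt);
}

/* Would taking the tree's own lock be valid after init with these flags? */
static bool model_lock_usable(unsigned int flags)
{
	struct model_tree mt;

	model_init_flags(&mt, flags);
	return mt.lock_initialized;
}
```

In this model, model_lock_usable(MM_MT_FLAGS) is false, which corresponds
to the bad magic report, while MT_FLAGS_ALLOC_RANGE | MT_FLAGS_USE_RCU and
plain zero flags both leave the internal lock initialized and usable.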


I tried that (MT_FLAGS_ALLOC_RANGE | MT_FLAGS_USE_RCU). It also triggers
the suspicious RCU usage message.


I am running ./tools/testing/kunit/kunit.py run '*damon*' --arch x86_64 --raw
with:
CONFIG_LOCKDEP=y
CONFIG_DEBUG_SPINLOCK=y

and I don't have any issue with locking in the existing code. How do I
recreate this issue?


I tested again, and I still see


[ 6.233483] ok 4 damon
[ 6.234190] KTAP version 1
[ 6.234263] # Subtest: damon-operations
[ 6.234335] # module: vaddr
[ 6.234384] 1..6
[ 6.235726]
[ 6.235931] =============================
[ 6.236018] WARNING: suspicious RCU usage
[ 6.236280] 6.11.0-rc6-00029-gda66250b210f-dirty #1 Tainted: G N
[ 6.236398] -----------------------------
[ 6.236474] lib/maple_tree.c:832 suspicious rcu_dereference_check() usage!
[ 6.236579]
[ 6.236579] other info that might help us debug this:
[ 6.236579]
[ 6.236738]
[ 6.236738] rcu_scheduler_active = 2, debug_locks = 1
[ 6.237039] no locks held by kunit_try_catch/208.
[ 6.237166]
[ 6.237166] stack backtrace:
[ 6.237385] CPU: 0 UID: 0 PID: 208 Comm: kunit_try_catch Tainted: G N 6.11.0-rc6-00029-gda66250b210f-dirty #1
[ 6.237629] Tainted: [N]=TEST
[ 6.237714] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 6.238065] Call Trace:
[ 6.238233] <TASK>
[ 6.238547] dump_stack_lvl+0x9e/0xe0
[ 6.239473] lockdep_rcu_suspicious+0x145/0x1b0
[ 6.239621] mas_walk+0x19f/0x1d0
[ 6.239765] mas_find+0xb5/0x150
[ 6.239873] __damon_va_three_regions+0x7e/0x130

This function isn't taking the rcu read lock while iterating the tree.

Try this:

diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index b0e8b361891d..08cfd22b5249 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -126,6 +126,7 @@ static int __damon_va_three_regions(struct mm_struct *mm,
 	 * If this is too slow, it can be optimised to examine the maple
 	 * tree gaps.
 	 */
+	rcu_read_lock();
 	for_each_vma(vmi, vma) {
 		unsigned long gap;

@@ -146,6 +147,7 @@ static int __damon_va_three_regions(struct mm_struct *mm,
 next:
 		prev = vma;
 	}
+	rcu_read_unlock();

 	if (!sz_range(&second_gap) || !sz_range(&first_gap))
 		return -EINVAL;



Yes, that fixes the problem for me.

Thanks,
Guenter