[PATCH] test_xarray: fix soft lockup for advanced-api tests

From: Luis Chamberlain
Date: Fri Feb 16 2024 - 14:43:42 EST


The new adanced API tests want to vet the xarray API is doing what it
promises by manually iterating over a set of possible indexes on its
own, and using a query operation which holds the RCU lock and then
releases it. So it is not using the helper loop options which xarray
provides on purpose. Any loop which iterates over 1 million entries
(which is possible with order 20, so emulating say a 4 GiB block size)
to just to rcu lock and unlock will eventually end up triggering a soft
lockup on systems which don't preempt, and have lock provin and RCU
prooving enabled.

xarray users already use XA_CHECK_SCHED for loops which may take a long
time, in our case we don't want to RCU unlock and lock as the caller
does that already, but rather just force a schedule every XA_CHECK_SCHED
iterations since the test is trying to not trust and rather test that
xarray is doing the right thing.

[0] https://lkml.kernel.org/r/202402071613.70f28243-lkp@xxxxxxxxx

Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---
lib/test_xarray.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index d4e55b4867dc..ac162025cc59 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -781,6 +781,7 @@ static noinline void *test_get_entry(struct xarray *xa, unsigned long index)
{
XA_STATE(xas, xa, index);
void *p;
+ static unsigned int i = 0;

rcu_read_lock();
repeat:
@@ -790,6 +791,17 @@ static noinline void *test_get_entry(struct xarray *xa, unsigned long index)
goto repeat;
rcu_read_unlock();

+ /*
+ * This is not part of the page cache, this selftest is pretty
+ * aggressive and does not want to trust the xarray API but rather
+ * test it, and for order 20 (4 GiB block size) we can loop over
+ * over a million entries which can cause a soft lockup. Page cache
+ * APIs won't be stupid, proper page cache APIs loop over the proper
+ * order so when using a larger order we skip shared entries.
+ */
+ if (++i % XA_CHECK_SCHED == 0)
+ schedule();
+
return p;
}

--
2.42.0