Re: ARM board lockups/hangs triggered by locks and mutexes

From: Rafał Miłecki
Date: Mon Aug 07 2023 - 07:10:31 EST


On 4.08.2023 13:07, Rafał Miłecki wrote:
> I triple checked that. Dropping a single unused function breaks kernel /
> device stability on BCM53573!
>
> AFAIK the only thing below diff actually affects is location of symbols
> (I actually verified that by comparing System.map before and after -
> over 22'000 of relocated symbols).
>
> Can some unfortunate location of symbols cause those hangs/lockups?

I performed another experiment. First I dropped mtd_check_of_node() to
bring kernel back to the stable state.

Then I started adding useless code to mtdchar_unlocked_ioctl(). I
ended up adding just enough that all post-mtd symbols in System.map
got the same offsets as they do when mtd_check_of_node() is
backported.

I started experiencing lockups/hangs again.

I repeated the same test by adding dummy code to brcm_nvram_probe()
and verifying the offsets of the symbols that follow brcm_nvram_probe.

I believe this confirms that this problem is about the offset or
alignment of some specific symbol(s). The remaining question is which
symbols, and how to fix or work around that.
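
For anyone who wants to repeat the System.map comparison, here is a
minimal sketch in C of the per-symbol check described above. It is not
the tool used for these tests. The helper names (parse_sym, sym_shift,
is_aligned) are hypothetical, and the boundary passed to is_aligned() is
an arbitrary parameter, not a claim about what matters on BCM53573. It
assumes the usual "address type name" layout of System.map lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One System.map entry, e.g. "c0008000 T _stext". */
struct sym {
	unsigned long addr;
	char type;
	char name[128];
};

/* Parse one System.map line; returns 0 on success, -1 if malformed. */
static int parse_sym(const char *line, struct sym *out)
{
	if (sscanf(line, "%lx %c %127s",
		   &out->addr, &out->type, out->name) != 3)
		return -1;
	return 0;
}

/* Signed distance a symbol moved between two builds. */
static long sym_shift(const struct sym *before, const struct sym *after)
{
	return (long)(after->addr - before->addr);
}

/* Returns 1 if addr sits on an 'align'-byte boundary, else 0. */
static int is_aligned(unsigned long addr, unsigned long align)
{
	/* 'align' must be a non-zero power of two. */
	assert(align && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}
```

Feeding every line of the "before" and "after" System.map through
parse_sym(), matching entries by name, and printing sym_shift() together
with is_aligned() for a few candidate boundaries should show which
symbols changed alignment between a stable and an unstable build.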

The following dumb change brings back the lockups/hangs:

diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
index ee437af41..0a24dec55 100644
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -1028,6 +1028,22 @@ static long mtdchar_unlocked_ioctl(struct file *file, u_int cmd, u_long arg)
 {
 	int ret;
 
+	if (!file)
+		pr_info("Missing\n");
+	WARN_ON(!file);
+	WARN_ON(cmd == 1234);
+	WARN_ON(cmd == 5678);
+	WARN_ON(cmd == 1234);
+	WARN_ON(cmd == 5678);
+	WARN_ON(cmd == 1234);
+	WARN_ON(cmd == 5678);
+	WARN_ON(cmd == 1234);
+	WARN_ON(cmd == 5678);
+	WARN_ON(cmd == 1234);
+	WARN_ON(cmd == 5678);
+	WARN_ON(cmd == 1234);
+	WARN_ON(cmd == 5678);
+
 	mutex_lock(&mtd_mutex);
 	ret = mtdchar_ioctl(file, cmd, arg);
 	mutex_unlock(&mtd_mutex);