-----Original Message-----
From: Phillip Lougher <phillip@xxxxxxxxxxxxxxx>
Sent: Friday, November 18, 2022 12:11 AM
To: Mirsad Goran Todorovac <mirsad.todorovac@xxxxxxxxxxxx>;
LKML <linux-kernel@xxxxxxxxxxxxxxx>; Paul E. McKenney <paulmck@xxxxxxxxxx>
Cc: phillip.lougher@xxxxxxxxx; Thorsten Leemhuis
<regressions@xxxxxxxxxxxxx>
Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in
squashfs_readahead())
On 17/11/2022 23:05, Mirsad Goran Todorovac wrote:
Hi,
While trying to bisect, I've found another bug that predated the
introduction of squashfs_readahead(), but it has
a common denominator in squashfs_decompress() and
squashfs_xz_uncompress().
Wrong, the stall is happening in the XZ decompressor library, which
is *not* in Squashfs.
This reported stall in the decompressor code is likely a symptom of you
deliberately thrashing your system. When the system thrashes everything
starts to happen very slowly, and the system will spend a lot of
its time doing page I/O, and the CPU will spend a lot of time in
any CPU intensive code like the XZ decompressor library.
So the fact the stall is being hit here is a symptom and not
a cause. The decompressor code is likely running slowly due to
thrashing and waiting on paged-out buffers. This is not indicative
of any bug, merely a system running slowly due to overload.
As I said, this is not a Squashfs issue, because the code where the
stall takes place isn't in Squashfs.
The people responsible for the rcu code should have a lot more insight
about what happens when the system is thrashing, and how this will
throw up false positives. In this case I believe this is an instance of
perfectly correct code running slowly due to thrashing incorrectly
being flagged as looping.
CC'ing Paul E. McKenney <paulmck@xxxxxxxxxx>
Phillip
How big can these readahead sizes be? Should one of the loops include
cond_resched() calls?
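
To make the question concrete, here is a minimal userspace sketch of the
pattern being asked about: a CPU-bound loop that offers to give up the CPU
at a fixed interval. Everything in it is an assumption for illustration.
process_buffer(), maybe_yield() and YIELD_INTERVAL are hypothetical names,
the byte-summing loop only stands in for real decompression work, and
sched_yield() is used as a userspace stand-in for the kernel's
cond_resched(). It is not taken from the XZ library or from Squashfs.

```c
#include <stddef.h>
#include <sched.h>

/* Hypothetical interval: offer to yield after this many bytes of work. */
#define YIELD_INTERVAL 4096

/* Counts how often the loop offered to yield (for illustration only). */
static unsigned long yield_count;

/* Userspace stand-in for cond_resched(); kernel code would call that instead. */
static void maybe_yield(void)
{
	yield_count++;
	sched_yield();
}

/*
 * Hypothetical CPU-bound loop standing in for a decompressor's inner loop.
 * It sums the input bytes and calls maybe_yield() every YIELD_INTERVAL bytes,
 * so a long input cannot monopolise the CPU without a scheduling point.
 */
static unsigned long process_buffer(const unsigned char *in, size_t len)
{
	unsigned long sum = 0;
	size_t since_yield = 0;

	for (size_t i = 0; i < len; i++) {
		sum += in[i];
		if (++since_yield == YIELD_INTERVAL) {
			since_yield = 0;
			maybe_yield();
		}
	}
	return sum;
}
```

Whether such a scheduling point belongs in the decompressor library, in the
Squashfs read path, or nowhere at all is exactly what the question above
leaves open; the sketch only shows the shape of the change.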