Re: [BUG] Re: Linux 6.4.4

From: Joel Fernandes
Date: Sun Jul 23 2023 - 10:56:41 EST




On 7/22/23 13:27, Paul E. McKenney wrote:
[..]
>
> OK, if this kernel is non-preemptible, you are not running TREE03,
> correct?
>
>> Next plan of action is to get sched_waking stack traces since I have a
>> very reliable repro of this now.
>
> Too much fun! ;-)

For the TREE07 issue, it is actually the schedule_timeout_interruptible(1)
in stutter_wait() that is beating up CPU0 for 4 seconds.

This is very similar to the issue I fixed around New Year in d52d3a2bf408
("torture: Fix hang during kthread shutdown phase").

Adding a cond_resched() there also did not help.

I think the issue is that the stutter thread fails to move spt forward
because it does not get CPU time. But spt == 1 should be very brief
AFAIU. I was wondering if we could set that thread to RT.

But also maybe the following will cure it like it did for the shutdown
issue, giving the stutter thread just enough CPU time to move spt forward.

Now I am trying the following and will let it run while I go do other
family-related things. ;)

+++ b/kernel/torture.c
@@ -733,6 +733,6 @@ bool stutter_wait(const char *title)
 			ret = true;
 		}
 		if (spt == 1) {
-			schedule_timeout_interruptible(1);
+			schedule_timeout_interruptible(HZ / 20);
 			cond_resched();
 		} else if (spt == 2) {
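
For a rough sense of what the change above means in wall-clock terms, here is a minimal userspace sketch (not kernel code) that converts the old and new timeout arguments to milliseconds for a few common HZ settings. The helper names jiffies_to_ms_sketch() and compare_sleeps() are made up for illustration, and the numbers are nominal timeouts only; they say nothing about how long the task really sleeps under load.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel API): nominal milliseconds covered
 * by `jiffies` ticks at a tick rate of `hz` ticks per second.
 * Integer division truncates, as it would for HZ / 20 in the kernel.
 */
unsigned int jiffies_to_ms_sketch(unsigned int jiffies, unsigned int hz)
{
	assert(hz > 0);
	return jiffies * 1000u / hz;
}

/* Print the old (1 jiffy) and new (HZ / 20 jiffies) timeouts for one HZ. */
void compare_sleeps(unsigned int hz)
{
	unsigned int old_jiffies = 1;
	unsigned int new_jiffies = hz / 20;

	printf("HZ=%u: old=%u jiffy (~%u ms), new=%u jiffies (~%u ms)\n",
	       hz, old_jiffies, jiffies_to_ms_sketch(old_jiffies, hz),
	       new_jiffies, jiffies_to_ms_sketch(new_jiffies, hz));
}
```

Calling compare_sleeps() with 100, 250, 300 and 1000 shows the old timeout is a single tick (10 ms or less), while HZ / 20 comes out at roughly 50 ms regardless of HZ, so each waiter wakes up far less often while spt == 1.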