Re: Regression on todays tip/master (commit 16f70beccf43)

From: Joerg Roedel
Date: Fri Jul 24 2020 - 10:50:59 EST


On Fri, Jul 24, 2020 at 03:28:02PM +0200, Ingo Molnar wrote:
> Given that you are perf stress-testing the box, some recent perf
> commit would be the primary suspect - before doing a full bisect you
> might want to try current perf/core (2ac5413e5edc) and its upstream
> base: v5.8-rc3, to narrow it down.
>
> But in principle any other commit could be the cause as well, the
> assert suggests memory corruption - I don't think we changed anything
> in the signal code.

I tried to bisect, but it hasn't yielded anything useful yet. The outcome
was commit

commit 1abdfe706a579a702799fce465bceb9fb01d407c
Author: Alex Belits <abelits@xxxxxxxxxxx>
Date: Thu Jun 25 18:34:41 2020 -0400

lib: Restrict cpumask_local_spread to houskeeping CPUs

But it looks totally unrelated to the backtrace I am seeing, and
reverting it didn't fix the problem.

Next thing is, I can reliably reproduce it with yesterday's tip/master
(commit 16f70beccf43), but have not seen it yet with tip/master pulled
today (commit c02699cd25e8).

To trigger it, it is sufficient to run the test_syscall_vdso_32 self-test
in a loop, ideally as multiple parallel loops ($times of them, where
$times > `nproc`). It usually triggers within the first 5 minutes in my
test VMs. It turned out that a running perf is not needed to trigger it.
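
A minimal sketch of such a reproducer loop, for illustration only. It
assumes that "multiple $times" means concurrent loops, that the self-test
has already been built, and that the binary sits in the current directory.
TEST_BINARY, loop_command(), stress() and default_instances() are
hypothetical names, not part of the kernel tree, and the 300-second
duration merely mirrors the "first 5 minutes" observation above.

```python
import os
import subprocess
import threading
import time

# Assumed location of the built self-test binary; adjust to your tree.
TEST_BINARY = "./test_syscall_vdso_32"


def loop_command(cmd, deadline, counts, index):
    """Run cmd back to back until the deadline passes, counting the runs."""
    while time.monotonic() < deadline:
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        counts[index] += 1


def stress(cmd, instances, seconds):
    """Run `instances` concurrent loops of cmd; return per-loop run counts."""
    deadline = time.monotonic() + seconds
    counts = [0] * instances
    threads = [
        threading.Thread(target=loop_command, args=(cmd, deadline, counts, i))
        for i in range(instances)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts


def default_instances():
    """One more loop than the CPUs usable by this process (Linux only)."""
    return len(os.sched_getaffinity(0)) + 1
```

Calling stress([TEST_BINARY], default_instances(), 300) in the test VM
would keep more loops running than there are CPUs for five minutes. The
sketch does not detect the failure itself; watch the kernel log for the
assert while it runs.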

Regards,

Joerg