Re: general protection fault in __schedule (2)

From: Dmitry Vyukov
Date: Sat Nov 23 2019 - 00:17:27 EST

On Fri, Nov 22, 2019 at 9:54 PM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
> On Thu, Nov 21, 2019 at 11:19:00PM -0800, syzbot wrote:
> > syzbot has bisected this bug to:
> >
> > commit 8fcc4b5923af5de58b80b53a069453b135693304
> > Author: Jim Mattson <jmattson@xxxxxxxxxx>
> > Date: Tue Jul 10 09:27:20 2018 +0000
> >
> > kvm: nVMX: Introduce KVM_CAP_NESTED_STATE
> >
> > bisection log:
> > start commit: 234b69e3 ocfs2: fix ocfs2 read block panic
> > git tree: upstream
> > final crash:
> > console output:
> > kernel config:
> > dashboard link:
> > syz repro:
> > C reproducer:
> >
> > Reported-by: syzbot+7e2ab84953e4084a638d@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Fixes: 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
> >
> > For information about bisection process see:
> Is there a way to have syzbot stop processing/bisecting these things
> after a reasonable amount of time? The original crash is from August of
> last year...
> Note, the original crash is actually due to KVM's put_kvm() fd race, but
> whatever we want to blame, it's a duplicate.
> #syz dup: general protection fault in kvm_lapic_hv_timer_in_use

Hi Sean,

syzbot only sends bisection results to open bugs with no known fixes.
So what you did (marking the bug as invalid/dup, or attaching a fix)
would stop it from doing/sending bisection.

"Original crash happened a long time ago" is not necessarily a good
signal. On the syzbot dashboard you can see bugs with the
original crash 2+ years ago, but they are still pretty much relevant.
The default kernel development strategy of invalidating bug
reports by burying them in oblivion has advantages, but also
downsides. FWIW syzbot prefers explicit status tracking.

Besides implications on the mainline development, consider the
following. We regularly discover the same bugs (missed backports) on
LTS kernels:
The dashboard also shows similar crash signatures in other tested
kernels. So say you see a crash in your product kernel, and you notice
that a similar crash happened on mainline some time ago and was
presumably fixed, but then you look at the bug report thread
and there is no info whatsoever as to what happened.
Now this bug report:
is linked to "general protection fault in kvm_lapic_hv_timer_in_use":
which has a recorded fix "KVM: nVMX: Fix bad cleanup on error of
get/set nested state IOCTLs":