system locks up with CONFIG_SLS=Y; 5.17.0-rc

From: Jamie Heilman
Date: Wed Mar 16 2022 - 05:59:01 EST


I've been (somewhat unsuccessfully) trying to bisect a hard lock-up
of my workstation that occurs a few seconds after I start a kvm guest
instance when I'm running 5.17 rc kernels. There is no output to
any log; everything locks up completely, and even sysrq stops working.
Bisection progressed closer and closer to the branch where
straight-line-speculation mitigation was enabled, but landing between
9cdbeec40968 ("x86/entry_32: Fix segment exceptions") and
3411506550b1 ("x86/csum: Rewrite/optimize csum_partial()") wasn't
giving clear results: there my system definitely starts Oopsing and
gets so hosed up that I'm forced to reboot, but it isn't quite as dire,
as sysrq continues to function. So I decided to just try a build with
CONFIG_SLS disabled, and it turns out that works just fine. Sooo...

This system uses an Intel Core2 Duo E8400 processor.
Working config (CONFIG_SLS=N) and dmesg at:
http://audible.transient.net/~jamie/k/sls.config-5.17.0-rc8
http://audible.transient.net/~jamie/k/sls.dmesg

(I don't think the dmesg of CONFIG_SLS=Y is really any different.)
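
For anyone comparing their own build against the config above, here is a
minimal sketch for reading the CONFIG_SLS state out of a kernel config
file. The helper name sls_state is purely illustrative and is not part of
any kernel tooling; it assumes the usual .config conventions, where an
enabled bool option appears as "CONFIG_FOO=y" and a disabled one as
"# CONFIG_FOO is not set".

```shell
#!/bin/sh
# sls_state: hypothetical helper (not shipped with the kernel).
# Prints "y" if the given config file enables CONFIG_SLS,
# "n" if the option is explicitly marked as not set,
# and "absent" if the file does not mention it at all.
sls_state() {
    cfg=$1
    if grep -q '^CONFIG_SLS=y$' "$cfg"; then
        echo y
    elif grep -q '^# CONFIG_SLS is not set$' "$cfg"; then
        echo n
    else
        echo absent
    fi
}
```

It could be pointed at a build tree's .config, or, on distributions that
install one (Debian does), at "/boot/config-$(uname -r)" for the running
kernel.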

As far as I know the guest kernel I hand to qemu doesn't really
matter, but the gist of my qemu command line is:

qemu-system-x86_64 -m 2048 -name "$NAME" -machine pc,accel=kvm \
-nographic -no-user-config -nodefaults -boot strict=on \
-rtc base=utc -smp 1,sockets=1,cores=1,threads=1 \
-chardev pipe,id=char0,path="$DIR/monitor" \
-chardev pty,id=char1 \
-device isa-serial,chardev=char1 \
-device virtio-blk-pci,drive=blk0,bootindex=1 \
-device virtio-net-pci,netdev=net0,"mac=$IF_MAC" \
-device virtio-rng-pci,rng=rng0,max-bytes=1024,period=3000 \
-drive "id=blk0,file=/dev/S/$NAME,if=none,format=raw,cache=none" \
-mon chardev=char0,id=monitor,mode=control \
-netdev "tap,id=net0,ifname=$NAME,script=no,downscript=no" \
-object rng-random,id=rng0,filename=/dev/random


No clue what additional debugging would help to enable here, if
anything. As you can see from the dmesg, I'm using gcc 11.2.0 from
Debian unstable, 4:11.2.0-2 to be exact. Let me know what other
information would be useful.

--
Jamie Heilman http://audible.transient.net/~jamie/