Re: [PATCH 4.4 00/37] 4.4.110-stable review
From: Andy Lutomirski
Date: Fri Jan 05 2018 - 11:57:44 EST
> On Jan 5, 2018, at 7:32 AM, Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> wrote:
>
> Hi Greg,
>
> Just tested on my machine:
> [ 0.000000] Initializing cgroup subsys cpuset
> [ 0.000000] Initializing cgroup subsys cpu
> [ 0.000000] Initializing cgroup subsys cpuacct
> [ 0.000000] Linux version 4.4.110_pt_linux-v4.4.110 (ptatashi@ca-ostest441) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Fri Jan 5 07:22:34 PST 2018
> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.4.110_pt_linux-v4.4.110 root=UUID=fe908085-0117-442b-a57c-ce651cc95b38 ro crashkernel=auto console=ttyS0,115200 LANG=en_US.UTF-8
> [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
> [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
> <cut>
> [ 3.457106] hub 1-0:1.0: USB hub found
> [ 3.461298] hub 1-0:1.0: 2 ports detected
> [ 3.466173] ehci-pci 0000:00:1d.0: EHCI Host Controller
> [ 3.472111] ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 2
> [ 3.480381] ehci-pci 0000:00:1d.0: debug port 2
> [ 3.489571] ehci-pci 0000:00:1d.0: irq 18, io mem 0xc7101000
> [ 3.501393] ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
> [ 3.507855] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
> [ 3.515436] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> [ 3.523500] usb usb2: Product: EHCI Host Controller
> [ 3.528947] usb usb2: Manufacturer: Linux 4.4.110_pt_linux-v4.4.110 ehci_hcd
> [ 3.536816] usb usb2: SerialNumber: 0000:00:1d.0
> [ 3.542107] hub 2-0:1.0: USB hub found
> [ 3.546301] hub 2-0:1.0: 2 ports detected
> [ 3.550942] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> [ 3.557854] ohci-pci: OHCI PCI platform driver
> [ 3.562844] uhci_hcd: USB Universal Host Controller Interface driver
> [ 3.570032] usbcore: registered new interface driver usbserial
> [ 3.576550] usbcore: registered new interface driver usbserial_generic
> [ 3.583844] usbserial: USB Serial support registered for generic
> [ 3.590570] i8042: PNP: No PS/2 controller found. Probing ports directly.
> [ 3.995383] tsc: Refined TSC clocksource calibration: 2195.099 MHz
> [ 4.002289] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1fa41d170d9, max_idle_ns: 440795288527 ns
> [ 4.046414] usb 2-1: new high-speed USB device number 2 using ehci-pci
> [ 4.174758] usb 2-1: New USB device found, idVendor=8087, idProduct=8002
> [ 4.182245] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
> [ 4.190382] hub 2-1:1.0: USB hub found
> [ 4.194609] hub 2-1:1.0: 8 ports detected
> [ 4.637363] i8042: No controller found
> [ 4.641646] mousedev: PS/2 mouse device common for all mice
> [ 4.648117] rtc_cmos 00:00: RTC can wake from S4
> [ 4.653447] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0
> [ 4.660272] rtc_cmos 00:00: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
> [ 4.669050] Intel P-state driver initializing.
> [ 4.676630] EFI Variables Facility v0.08 2004-May-17
> <hangs here>
> Reboots after about 30 seconds.
>
This looks like the KVM RSM issue. When you manage to run a buggy configuration (KVM + OVMF with secure boot support in the host, PCID (PTI or otherwise) and SMP in the guest), the first EFI call after AP bringup dies.
The actual failure is nasty. When one CPU calls into EFI, all the other CPUs die: they enter SMM and don't come back out correctly. I think the best the guest could do is to try to generate a useful printk if this happens.
Update your host.
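
For illustration only, here is a rough sketch of checking from inside the guest whether it matches the guest half of that configuration (running under a hypervisor, PCID exposed, more than one CPU). It assumes the usual x86 /proc/cpuinfo layout with "processor" and "flags" lines; the function names are made up for this example. It cannot see the host side (OVMF, secure boot support, or whether the host has the fix), and the "hypervisor" flag does not say which hypervisor is in use, so a match is only a hint.

```python
def parse_cpuinfo(text):
    """Return (logical CPU count, flag set of the first CPU listed)."""
    cpus = 0
    flags = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "processor":
            cpus += 1
        elif key == "flags" and not flags:
            flags = set(value.split())
    return cpus, flags


def guest_matches_pattern(text):
    """True if the guest side looks like the configuration described above.

    Hypothetical helper: checks SMP, the 'pcid' flag and the generic
    'hypervisor' flag. It says nothing about the host.
    """
    cpus, flags = parse_cpuinfo(text)
    return cpus > 1 and "pcid" in flags and "hypervisor" in flags
```

To use it, pass in the contents of /proc/cpuinfo read from the guest. A True result only means the guest-side conditions are met; whether the hang actually occurs still depends on the host.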
> Boots fine with nopti option.
>
> Thank you,
> Pavel