Re: Problems with Zen under Xen and recent Linux kernel improvements

From: Adam Novak
Date: Sun Aug 05 2018 - 17:35:40 EST


OK, I pulled that commit, 74899d92e66663dc7671a8017b3146dcd4735f3b, into
the Ubuntu kernel and it seems to solve the problem. Now I just
need to get Ubuntu to ship it.

Thanks!

On Sun, Aug 5, 2018 at 12:27 PM, Adam Novak <interfect@xxxxxxxxx> wrote:
> Sorry, I am using Xen version 4.9.2, specifically 4.9.2-0ubuntu1.
>
> I am seeing the bug with *kernel* version 4.15.0, and specifically
> Ubuntu's tag Ubuntu-4.15.0-23.25. That appears to have the "x86/cpu:
> Re-apply forced caps every time CPU caps are re-read" patch, but not
> "x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths".
>
> I can try cherry-picking that commit. Are there other commits in
> particular that might need to be pulled into the Ubuntu kernel to get
> it to work?
>
> On Tue, Jul 31, 2018 at 4:58 AM, Juergen Gross <jgross@xxxxxxxx> wrote:
>> On 31/07/18 03:14, Adam Novak wrote:
>>> Hello,
>>>
>>> I was advised to take this here, and to Boris Ostrovsky and Juergen
>>> Gross, by Thomas Gleixner.
>>>
>>> I am having some trouble with the new speculation control code that
>>> has been added to the Linux kernel, for AMD Zen CPUs. I am running an
>>> AMD Ryzen 7 1700, and I am running Linux as a Xen dom0 (which is part
>>> of the problem; the code seems to work fine running outside of Xen).
>>>
>>> I started having trouble on Ubuntu's commit
>>> 3f6a3b035f91a22c0d3bd27630bf61eac9c8cf6c, "x86/speculation: Handle HT
>>> correctly on AMD", which appears to be cherry-picked from
>>> 1f50ddb4f4189243c05926b842dc1a0332195f31. Since that commit, my system
>>> hangs during boot: it begins starting services, mounting filesystems,
>>> and printing "[OK]" messages, but fairly early on the kernel complains
>>> that it is "unable to handle kernel NULL pointer dereference at
>>> 000...0008".
>>>
>>> On my Ubuntu bug:
>>>
>>> https://bugs.launchpad.net/bugs/1777338
>>>
>>> I have a "Screenshot of the null pointer dereference message". It is
>>> running into trouble during a spin lock in the new
>>> speculative_store_bypass_update().
>>>
>>> Has anyone else seen this behavior on these CPUs under Xen (I am using 4.9)?
>>
>> You want at least 4.9.112, especially due to the missing patches
>> "x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths" and
>> "x86/cpu: Re-apply forced caps every time CPU caps are re-read".
>>
>> Juergen