Re: [PATCH v9 0/8] Parallel CPU bringup for x86_64
From: David Woodhouse
Date: Mon Feb 20 2023 - 18:31:08 EST
On Mon, 2023-02-20 at 17:23 -0600, Kim Phillips wrote:
> On 2/20/23 3:39 PM, David Woodhouse wrote:
> > On 20 February 2023 21:23:38 GMT, Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx> wrote:
> > > Hello.
> > >
> > > On 20.02.2023 21:31, David Woodhouse wrote:
> > > > On Mon, 2023-02-20 at 17:40 +0100, Oleksandr Natalenko wrote:
> > > > > > On Monday, 20 February 2023 17:20:13 CET David Woodhouse wrote:
> > > > > > On Mon, 2023-02-20 at 17:08 +0100, Oleksandr Natalenko wrote:
> > > > > > >
> > > > > > > > I've applied this to the v6.2 kernel, and suspend/resume broke on
> > > > > > > > my Ryzen 5950X desktop. The machine suspends just fine, but on
> > > > > > > > resume the screen stays blank, and there's no visible disk I/O.
> > > > > > >
> > > > > > > Reverting the series brings suspend/resume back to working state.
> > > > > >
> > > > > > > Hm, thanks. What if you add 'no_parallel_bringup' on the
> > > > > > > command line?
> > > > >
> > > > > > If the `no_parallel_bringup` param is added, suspend/resume works.
> > > >
> > > > Thanks for the testing. Can I ask you to do one further test: apply the
> > > > series only as far as patch 6/8 'x86/smpboot: Support parallel startup
> > > > of secondary CPUs'.
> > > >
> > > > That will do the new startup asm sequence where each CPU finds its own
> > > > per-cpu data so it *could* work in parallel, but doesn't actually do
> > > > the bringup in parallel yet.
> > >
> > > > With patches 1 to 6 (inclusive) applied and no extra cmdline
> > > > params added, resume doesn't work.
> >
> > Hm. Kim, is there some weirdness with the way AMD CPUs get their
> > APIC ID in CPUID 0x1? Especially after resume?
>
> Not to my knowledge. Mario?

Oleksandr, please could you show the output of 'cpuid' after a
successful resume? I'm particularly looking for this part...

 $ sudo cpuid | grep -A1 1/ebx
    miscellaneous (1/ebx):
       process local APIC physical ID = 0x0 (0)
 --
    miscellaneous (1/ebx):
       process local APIC physical ID = 0x2 (2)
 ...

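
For reference, the value that 'cpuid' prints on those lines is the initial APIC ID, which CPUID leaf 0x1 returns in EBX bits 31:24. A minimal sketch of that decoding follows; the helper name is hypothetical and this is not code from the series, just an illustration of why only 8 bits of APIC ID are available from this leaf.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not taken from the patches).
 * CPUID leaf 0x1 reports the initial APIC ID in EBX bits 31:24, so a CPU
 * reading its own ID this way can only ever see an 8-bit value.
 */
static inline uint32_t apic_id_from_cpuid1_ebx(uint32_t ebx)
{
	return (ebx >> 24) & 0xffu;
}
```

If a CPU came back from resume reporting an unexpected value in that field, a lookup keyed on it would presumably pick the wrong per-cpu data, which is the kind of weirdness being asked about here.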
> > Perhaps we turn it off for any AMD CPU that doesn't have X2APIC and CPUID 0xB?
>
> Perhaps.
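
A rough sketch of what such a check could look like, operating on raw CPUID register values. The function and macro names are hypothetical, and this is an assumption about the shape of the test rather than what the series implements. It relies on the x2APIC feature flag being CPUID.1:ECX bit 21, and on leaf 0xB being usable only when the maximum basic leaf (CPUID.0:EAX) is at least 0xB and subleaf 0 reports a non-zero EBX[15:0].

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.1:ECX bit 21 advertises x2APIC support. */
#define CPUID1_ECX_X2APIC	(1u << 21)

/*
 * Hypothetical predicate, for illustration only.
 *   max_basic_leaf: EAX from CPUID leaf 0x0
 *   leaf1_ecx:      ECX from CPUID leaf 0x1
 *   leafb_ebx:      EBX from CPUID leaf 0xB, subleaf 0
 */
static inline bool has_x2apic_and_leaf_b(uint32_t max_basic_leaf,
					 uint32_t leaf1_ecx,
					 uint32_t leafb_ebx)
{
	if (!(leaf1_ecx & CPUID1_ECX_X2APIC))
		return false;
	if (max_basic_leaf < 0xb)
		return false;
	/* EBX[15:0] == 0 means the leaf is not actually implemented. */
	return (leafb_ebx & 0xffffu) != 0;
}
```

A caller would fall back to serial bringup whenever this returns false on an AMD part.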
>
> > > > Does your box have a proper serial port?
> > >
> > > No, sorry. I know it'd help with getting logs, and I do have a serial-to-USB cable that I use for another machine, but in this one the port is not routed to outside. I think I can put a header there as the motherboard does have pins, but I'd have to buy one first. In theory, I can do that, but that won't happen within the next few weeks.
> > >
> > > P.S. Piotr Gorski (in Cc) also reported this: "My friend from CachyOS can confirm bugs with smpboot patches. AMD FX 6300 only shows 1 core when using smp boot patchset". Probably, he can reply to this thread and provide more details.
> > >
>
> I ran mem/disk versions of 'sudo rtcwake --mode mem -s 60'
> on my Rome server, and multiple suspend/resumes succeeded, and
> with all CPUs, but then the NETDEV WATCHDOG fired - not sure
> if it's related:

I suspect not.