Re: [PATCH] Drivers: vmbus: Check for channel allocation before looking up relids
From: Mohammed Gamal
Date: Fri Feb 10 2023 - 04:13:24 EST
(Re-CC'ing people from the old thread)
On Fri, Feb 10, 2023 at 4:57 AM Dexuan Cui <decui@xxxxxxxxxxxxx> wrote:
>
> > From: Mohammed Gamal <mgamal@xxxxxxxxxx>
> > Sent: Thursday, February 9, 2023 1:48 AM
> > ...
> > > We saw this when triggering a crash with kdump enabled with
> > > echo 'c' > /proc/sysrq-trigger
> > >
> > > When the new kernel boots, we see this stack trace:
>
> Thanks for the details. Kdump is special in that the 'old' VMBus
> channels might still be active (from the host's perspective),
> when the new kernel starts to run.
>
> Upon crash, Linux sends a CHANNELMSG_UNLOAD message to the host,
> and the host is supposed to quiesce/reset the VMBus devices, so
> normally we should not see a crash in relid2channel().
Does this not happen in the case of kdump? Shouldn't a CHANNELMSG_UNLOAD
message be sent to the host in that case as well?
>
> > > [ 21.906679] Hardware name: Microsoft Corporation Virtual
> > > Machine/Virtual Machine, BIOS 090007 05/18/2018
>
> I guess you see the crash because you're running an old Hyper-V,
> probably Windows Server 2016 or 2019, which may be unable to
> reliably handle the guest's CHANNELMSG_UNLOAD message.
We've actually seen this on Windows Server 2016, 2019, and 2022.
>
> Can you please mention kdump in the commit message?
>
Will do.
> BTW, regarding "before vmbus_connect() is called", IMO it
> should be "before vmbus_connect() is called or before it finishes".