Re: [PATCH] Drivers: hv: vmbus: handle various crash scenarios

From: Vitaly Kuznetsov
Date: Tue Mar 22 2016 - 10:01:17 EST


KY Srinivasan <kys@xxxxxxxxxxxxx> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx]
>> Sent: Monday, March 21, 2016 12:52 AM
>> To: KY Srinivasan <kys@xxxxxxxxxxxxx>
>> Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Haiyang
>> Zhang <haiyangz@xxxxxxxxxxxxx>; Alex Ng (LIS) <alexng@xxxxxxxxxxxxx>;
>> Radim Krcmar <rkrcmar@xxxxxxxxxx>; Cathy Avery <cavery@xxxxxxxxxx>
>> Subject: Re: [PATCH] Drivers: hv: vmbus: handle various crash scenarios
>>
>> KY Srinivasan <kys@xxxxxxxxxxxxx> writes:
>>
>> >> -----Original Message-----
>> >> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx]
>> >> Sent: Friday, March 18, 2016 5:33 AM
>> >> To: devel@xxxxxxxxxxxxxxxxxxxxxx
>> >> Cc: linux-kernel@xxxxxxxxxxxxxxx; KY Srinivasan <kys@xxxxxxxxxxxxx>;
>> >> Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Alex Ng (LIS)
>> >> <alexng@xxxxxxxxxxxxx>; Radim Krcmar <rkrcmar@xxxxxxxxxx>; Cathy
>> >> Avery <cavery@xxxxxxxxxx>
>> >> Subject: [PATCH] Drivers: hv: vmbus: handle various crash scenarios
>> >>
>> >> Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always
>> >> delivered to CPU0 regardless of what CPU we're sending CHANNELMSG_UNLOAD
>> >> from. vmbus_wait_for_unload() doesn't account for the fact that in case
>> >> we're crashing on some other CPU and CPU0 is still alive and operational
>> >> CHANNELMSG_UNLOAD_RESPONSE will be delivered there completing
>> >> vmbus_connection.unload_event, our wait on the current CPU will never
>> >> end.
>> >
>> > What was the host you were testing on?
>> >
>>
>> I was testing on both 2012R2 and 2016TP4. The bug is easily reproducible
>> by forcing crash on a secondary CPU, e.g.:
>
> Prior to 2012 R2, all messages would be delivered on CPU0 and this includes CHANNELMSG_UNLOAD_RESPONSE.
> For this reason we don't support kexec on pre-2012 R2 hosts. From 2012 R2 on, all vmbus
> messages (responses) will be delivered on the CPU that we initially set up - look at the code in
> vmbus_negotiate_version(). So on post 2012 R2 hosts, the response to CHANNELMSG_UNLOAD
> will be delivered on the CPU where we initiate the contact with the
> host - CHANNELMSG_INITIATE_CONTACT message.

Unfortunately there is a discrepancy between WS2012R2 and WS2016TP4. On
WS2012R2 what you're saying is true and all messages, including
CHANNELMSG_UNLOAD_RESPONSE, are delivered to the CPU we used for initial
contact. On WS2016TP4 CHANNELMSG_UNLOAD_RESPONSE seems to be a special
case: it is always delivered to CPU0, no matter which CPU we used for
initial contact. This may be a host bug. You can use the attached patch
to see the issue.

For now I can suggest we check message pages for all CPUs from
vmbus_wait_for_unload(). We can race with other CPUs again but we don't
care as we're checking for completion_done() in the loop as well. I'll
try this approach.
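
To make the idea concrete, here is a rough user-space model of the loop
I have in mind. It is only a sketch, not the actual patch: every name
with a model_ or MODEL_ prefix is made up for illustration, the per-CPU
slot array merely stands in for the per-CPU message pages, the
unload_done flag stands in for the completion_done() check on
vmbus_connection.unload_event, and the poll limit exists only so the
model terminates.

```c
/* User-space model only; none of these types or helpers exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CPUS 64

enum model_msg_type {
	MODEL_MSG_NONE = 0,
	MODEL_MSG_OTHER,
	MODEL_MSG_UNLOAD_RESPONSE,
};

/* One pending-message slot per CPU, standing in for a per-CPU message page. */
struct model_msg_slot {
	enum model_msg_type type;
};

struct model_connection {
	bool unload_done;	/* stands in for completion_done() on unload_event */
	size_t ncpus;
	struct model_msg_slot slots[MODEL_MAX_CPUS];
};

/*
 * Poll every CPU's slot until the unload response shows up somewhere or
 * another CPU has already completed the unload.
 *
 * Returns the CPU index where the response was found, -1 if the unload
 * was already completed elsewhere, or -2 if max_polls ran out.
 */
int model_wait_for_unload(struct model_connection *conn, unsigned int max_polls)
{
	assert(conn->ncpus <= MODEL_MAX_CPUS);

	for (unsigned int poll = 0; poll < max_polls; poll++) {
		/* Someone else (e.g. a still-running CPU0) may have handled it. */
		if (conn->unload_done)
			return -1;

		for (size_t cpu = 0; cpu < conn->ncpus; cpu++) {
			struct model_msg_slot *slot = &conn->slots[cpu];

			if (slot->type == MODEL_MSG_UNLOAD_RESPONSE) {
				slot->type = MODEL_MSG_NONE;	/* consume the message */
				conn->unload_done = true;
				return (int)cpu;
			}
		}
	}

	return -2;
}
```

The point of scanning all slots is that the waiter no longer depends on
which CPU the host picks for the response; checking the done flag first
covers the case where another CPU got there before us.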

--
Vitaly