Re: [PATCH v6] vfio error recovery: kernel support

From: Cao jin
Date: Thu Apr 06 2017 - 05:06:42 EST




On 04/06/2017 06:36 AM, Michael S. Tsirkin wrote:
> On Wed, Apr 05, 2017 at 04:19:10PM -0600, Alex Williamson wrote:
>> On Thu, 6 Apr 2017 00:50:22 +0300
>> "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
>>
>>> On Wed, Apr 05, 2017 at 01:38:22PM -0600, Alex Williamson wrote:
>>>> The previous intention of trying to handle all sorts of AER faults
>>>> clearly had more value, though even there the implementation and
>>>> configuration requirements restricted the practicality. For instance,
>>>> is AER support actually useful to a customer if it requires all ports
>>>> of a multifunction device assigned to the VM? This seems more like a
>>>> feature targeting whole system partitioning rather than general VM
>>>> device assignment use cases. Maybe that's ok, but it should be a clear
>>>> design decision.
>>>
>>> Alex, what kind of testing do you expect to be necessary?
>>> Would you say testing on real hardware and making it trigger
>>> AER errors is a requirement?
>>
>> Testing various fatal, non-fatal, and corrected errors with aer-inject,
>> especially in multifunction configurations (where more than one port
>> is actually usable) would certainly be required. If we have cases where
>> the driver for a companion function can escalate a non-fatal error to a
>> bus reset, that should be tested, even if it requires temporary hacks to
>> the host driver for the companion function to trigger that case. AER
>> handling is not something that the typical user is going to experience,
>> so it should be thoroughly tested to make sure it works when needed
>> or there's little point to doing it at all. Thanks,
>>
>> Alex
>
> Some things can be tested within a VM. What would you
> say would be sufficient on a VM and what has to be
> tested on bare metal?
>

Does "bare metal" here mean something like XenServer?
--
Sincerely,
Cao jin