Re: [PATCH v2] powerpc/pseries/eeh: Fix pseries_eeh_err_inject

From: Michael Ellerman
Date: Tue Sep 10 2024 - 03:22:32 EST


Narayana Murty N <nnmlinux@xxxxxxxxxxxxx> writes:
> On 05/09/24 6:33 PM, Michael Ellerman wrote:
>> Narayana Murty N <nnmlinux@xxxxxxxxxxxxx> writes:
>>> VFIO_EEH_PE_INJECT_ERR ioctl is currently failing on pseries
>>> due to missing implementation of err_inject eeh_ops for pseries.
>>> This patch implements pseries_eeh_err_inject in the pseries
>>> eeh_ops. It adds support for injecting MMIO load/store errors
>>> for testing from user space.
>>>
>>> The check on the PCI error type code is moved to platform code,
>>> since eeh_pe_inject_err() may need to allow more error types
>>> depending on platform requirements.
>>>
>>> Signed-off-by: Narayana Murty N <nnmlinux@xxxxxxxxxxxxx>
>>> ---
>>>
>>> Testing:
>>> ========
>>> vfio-test [1] by Alex Williamson was forked and updated to add
>>> support for injecting errors on a pSeries guest, and was used to
>>> test this patch [2].
>>>
>>> References:
>>> ===========
>>> [1] https://github.com/awilliam/tests
>>> [2] https://github.com/nnmwebmin/vfio-ppc-tests/tree/vfio-ppc-ex
>>>
>>> ================
>>> Changelog:
>>> V1:https://lore.kernel.org/all/20240822082713.529982-1-nnmlinux@xxxxxxxxxxxxx/
>>> - Resolved build issues for ppc64|le_defconfig by moving the
>>> pseries_eeh_err_inject() definition outside of the CONFIG_PCI_IOV
>>> code block.
>>> - Added a new eeh_pe_inject_mmio_error() wrapper function to handle
>>> the case where CONFIG_EEH is not set.
>>
>> I don't see why that's necessary?
>>
>> It's only called from eeh_pseries.c, which is only built for
>> PPC_PSERIES, and when PPC_PSERIES=y, EEH is always enabled.
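
For reference, the header pattern under discussion looks roughly like
the sketch below: a prototype when the feature is built in, and a
static inline stub returning -ENXIO otherwise. SKETCH_CONFIG_EEH and
sketch_inject_mmio_error() are made-up names for illustration, not the
kernel's symbols, and this is plain userspace C rather than the actual
eeh.h contents.

```c
#include <errno.h>

struct pci_dev;	/* opaque stand-in for the kernel type */

/* SKETCH_CONFIG_EEH is a hypothetical switch mimicking a Kconfig option. */
#ifdef SKETCH_CONFIG_EEH
/* Feature built in: the real implementation lives in a .c file. */
int sketch_inject_mmio_error(struct pci_dev *pdev);
#else
/* Feature compiled out: callers still link, and get an error code. */
static inline int sketch_inject_mmio_error(struct pci_dev *pdev)
{
	(void)pdev;	/* unused in the stub */
	return -ENXIO;
}
#endif
```

With SKETCH_CONFIG_EEH undefined, every caller gets -ENXIO. The stub
only earns its place if some caller can be built without the feature,
which is the point being questioned above.
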
>>
>>> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>>> index 91a9fd53254f..8da6b047a4fe 100644
>>> --- a/arch/powerpc/include/asm/eeh.h
>>> +++ b/arch/powerpc/include/asm/eeh.h
>>> @@ -308,7 +308,7 @@ int eeh_pe_reset(struct eeh_pe *pe, int option, bool include_passed);
>>> int eeh_pe_configure(struct eeh_pe *pe);
>>> int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
>>> unsigned long addr, unsigned long mask);
>>> -
>>> +int eeh_pe_inject_mmio_error(struct pci_dev *pdev);
>>> /**
>>> * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
>>> *
>>> @@ -338,6 +338,10 @@ static inline int eeh_check_failure(const volatile void __iomem *token)
>>> return 0;
>>> }
>>>
>>> +static inline int eeh_pe_inject_mmio_error(struct pci_dev *pdev)
>>> +{
>>> + return -ENXIO;
>>> +}
>>> #define eeh_dev_check_failure(x) (0)
>>>
>>> static inline void eeh_addr_cache_init(void) { }
>>> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>>> index d03f17987fca..49ab11a287a3 100644
>>> --- a/arch/powerpc/kernel/eeh.c
>>> +++ b/arch/powerpc/kernel/eeh.c
>>> @@ -1537,10 +1537,6 @@ int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
>>> if (!eeh_ops || !eeh_ops->err_inject)
>>> return -ENOENT;
>>>
>>> - /* Check on PCI error type */
>>> - if (type != EEH_ERR_TYPE_32 && type != EEH_ERR_TYPE_64)
>>> - return -EINVAL;
>>> -
>>
>> The change log should mention why it's OK to remove these checks. You
>> add the same checks in pseries_eeh_err_inject(), but what about
>> pnv_eeh_err_inject() ?
>>
>> It is OK AFAICS, because pnv_eeh_err_inject() already contains
>> equivalent checks, but you should spell that out.
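
A minimal userspace sketch of that arrangement, assuming a trimmed-down
ops table: the generic entry point only dispatches, and each platform
callback validates the error type itself. All sketch_* names are
hypothetical, and the EEH_ERR_TYPE_* values are stand-ins for this
example rather than copies from asm/eeh.h.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in values for this sketch only. */
#define EEH_ERR_TYPE_32	0
#define EEH_ERR_TYPE_64	1

/* Hypothetical ops table holding just the callback of interest. */
struct sketch_eeh_ops {
	int (*err_inject)(int type, int func,
			  unsigned long addr, unsigned long mask);
};

/* Platform callback: owns the check on the PCI error type. */
static int sketch_platform_err_inject(int type, int func,
				      unsigned long addr, unsigned long mask)
{
	(void)func;
	(void)addr;
	(void)mask;

	if (type != EEH_ERR_TYPE_32 && type != EEH_ERR_TYPE_64)
		return -EINVAL;

	return 0;	/* a real backend would perform the injection here */
}

/* Generic entry point: no type check, only a presence check and dispatch. */
static int sketch_pe_inject_err(const struct sketch_eeh_ops *ops, int type,
				int func, unsigned long addr,
				unsigned long mask)
{
	if (!ops || !ops->err_inject)
		return -ENOENT;

	return ops->err_inject(type, func, addr, mask);
}
```

In this layout a callback that omits the check would accept any type
value unfiltered, which is why the changelog should state that every
err_inject implementation validates its own input.
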
>>
>> cheers
>
> Yes mpe, I agree. Your comments are addressed in V3, posted here:
> https://lore.kernel.org/all/20240909140220.529333-1-nnmlinux@xxxxxxxxxxxxx/

Thanks.

cheers