Re: [RESEND v13 23/25] CXL/PCI: Introduce CXL uncorrectable protocol error recovery

From: Dave Jiang

Date: Wed Nov 05 2025 - 11:10:40 EST




On 11/5/25 7:59 AM, Bowman, Terry wrote:
>
>
> On 11/4/2025 5:43 PM, Dave Jiang wrote:
>>
>> On 11/4/25 11:47 AM, Jonathan Cameron wrote:
>>> On Tue, 4 Nov 2025 11:03:03 -0600
>>> Terry Bowman <terry.bowman@xxxxxxx> wrote:

<snip>

>>>> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
>>>> index 5bc144cde0ee..52c6f19564b6 100644
>>>> --- a/drivers/cxl/core/ras.c
>>>> +++ b/drivers/cxl/core/ras.c
>>>> @@ -259,8 +259,138 @@ static void device_unlock_if(struct device *dev, bool take)
>>>> device_unlock(dev);
>>>> }
>>>>
>>>> +/**
>>>> + * cxl_report_error_detected
>>>> + * @dev: Device being reported
>>>> + * @data: Result
>>>> + * @err_pdev: Device with initial detected error. Is locked immediately
>>>> + * after KFIFO dequeue.
>>>> + */
>>>> +static int cxl_report_error_detected(struct device *dev, void *data, struct pci_dev *err_pdev)
>>>> +{
>>>> + bool need_lock = (dev != &err_pdev->dev);
>>> Add a comment on why this controls need for locking.
>>> The resulting code is complex enough I'd be tempted to split the whole
>>> thing into locked and unlocked variants.
>> May not be a bad idea. Terry, can you see if this would reduce the complexity?
>>
>> DJ
>
> I agree and will split into 2 functions. Do you have naming suggestions for a function copy 
> without locks? Is cxl_report_error_detected_nolock() OK to go along with existing 
> cxl_report_error_detected()? 

Maybe cxl_report_error_detected_lock() vs cxl_report_error_detected().
I think there's also precedent of __cxl_report_error_detected() with no lock and indicates a raw function vs cxl_report_error_detected() with lock.

DJ

>
> Terry