Re: [PATCH v16 02/10] PCI/CXL: Update unregistration for AER-CXL and CPER-CXL kfifos
From: Bowman, Terry
Date: Wed Mar 11 2026 - 11:07:42 EST
On 3/9/2026 7:27 AM, Jonathan Cameron wrote:
> On Mon, 2 Mar 2026 14:36:40 -0600
> Terry Bowman <terry.bowman@xxxxxxx> wrote:
>
>> The current AER-CXL kfifo unregistration does not cancel pending work after
>> clearing the work function pointer. In addition, cancel_work_sync() is
>> called on behalf of the CPER-CXL kfifo in cxl_ras_exit() and should be
>> moved into the kfifo deregistration function.
>>
>> Add logic to cancel the AER-CXL kfifo's pending work in
>> cxl_unregister_proto_err_work().
>>
>> Move the CPER-CXL kfifo cancel call from cxl_ras_exit() to
>> cxl_cper_unregister_prot_err_work(). Release the CPER-CXL spinlock
>> before calling w to avoid deadlock.
>>
>> In both kfifo unregistration cases, add the necessary synchronization
>> to enforce proper lock ordering: protect pointer updates under the
>> lock, and clear the work pointer, then cancel any outstanding work
>> after the lock is released.
>
> From that description, this feels like it's walking the edge of
> being a fix? If so should call it out as such. If not, make it
> clear there isn't a known bug being fixed up.
>
> Otherwise seems sensible to me
> Reviewed-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>
I'll add explanation detailing a call to cancel_sync_work() will prevent the
possibility of using the work routine after its unregistered.
Thanks for reviewing.
- Terry
>>
>> Link: https://lore.kernel.org/linux-cxl/6982ca54e094b_55fa1005@dwillia2-mobl4.notmuch/
>> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
>> Assisted-by: Azure:gtp-4.1-nano-key
>>
>> ----
>>
>> Changes in v16:
>> - New commit
>> ---
>> drivers/acpi/apei/ghes.c | 6 +++++-
>> drivers/cxl/core/ras.c | 1 -
>> drivers/pci/pcie/aer_cxl_vh.c | 9 ++++++++-
>> 3 files changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
>> index 8acd2742bb27..de935e0e1dcf 100644
>> --- a/drivers/acpi/apei/ghes.c
>> +++ b/drivers/acpi/apei/ghes.c
>> @@ -776,8 +776,12 @@ int cxl_cper_unregister_prot_err_work(struct work_struct *work)
>> if (cxl_cper_prot_err_work != work)
>> return -EINVAL;
>>
>> - guard(spinlock)(&cxl_cper_prot_err_work_lock);
>> + spin_lock(&cxl_cper_prot_err_work_lock);
>> cxl_cper_prot_err_work = NULL;
>> + spin_unlock(&cxl_cper_prot_err_work_lock);
>> +
>> + cancel_work_sync(work);
>> +
>> return 0;
>> }
>> EXPORT_SYMBOL_NS_GPL(cxl_cper_unregister_prot_err_work, "CXL");
>> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
>> index 006c6ffc2f56..949d8c8ecdfe 100644
>> --- a/drivers/cxl/core/ras.c
>> +++ b/drivers/cxl/core/ras.c
>> @@ -124,7 +124,6 @@ int cxl_ras_init(void)
>> void cxl_ras_exit(void)
>> {
>> cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>> - cancel_work_sync(&cxl_cper_prot_err_work);
>> }
>>
>> static void cxl_dport_map_ras(struct cxl_dport *dport)
>> diff --git a/drivers/pci/pcie/aer_cxl_vh.c b/drivers/pci/pcie/aer_cxl_vh.c
>> index 7e2bc1894395..ebca1112652a 100644
>> --- a/drivers/pci/pcie/aer_cxl_vh.c
>> +++ b/drivers/pci/pcie/aer_cxl_vh.c
>> @@ -74,8 +74,15 @@ EXPORT_SYMBOL_NS_GPL(cxl_register_proto_err_work, "CXL");
>>
>> void cxl_unregister_proto_err_work(void)
>> {
>> - guard(rwsem_write)(&cxl_proto_err_kfifo.rwsema);
>> + struct work_struct *work;
>> +
>> + down_write(&cxl_proto_err_kfifo.rwsema);
>> + work = cxl_proto_err_kfifo.work;
>> cxl_proto_err_kfifo.work = NULL;
>> + up_write(&cxl_proto_err_kfifo.rwsema);
>> +
>> + if (work)
>> + cancel_work_sync(work);
>> }
>> EXPORT_SYMBOL_NS_GPL(cxl_unregister_proto_err_work, "CXL");
>>
>