Re: [PATCH] dmaengine: idxd: Change wmb() to smp_wmb() when copying completion record to user space
From: Dave Jiang
Date: Tue Jan 30 2024 - 10:35:35 EST
On 1/29/24 19:58, Fenghua Yu wrote:
> wmb() is used to ensure status in the completion record is written
> after the rest of the completion record, making it visible to the user.
> However, on SMP systems, this may not guarantee visibility across
> different CPUs.
>
> Considering this scenario that event log handler is running on CPU1 while
> user app is polling completion record (cr) status on CPU2:
>
> CPU1 CPU2
> event log handler user app
>
> 1. cr = 0 (status = 0)
> 2. copy X to user cr except "status"
> 3. wmb()
> 4. copy Y to user cr "status"
> 5. poll status value Y
> 6. read rest cr which is still 0.
> cr handling fails
> 7. cr value X visible now
>
> Although wmb() ensure value Y is written and visible after X is written
> on CPU1, the order is not guaranteed on CPU2. So user app may see status
> value Y while cr value X is still not visible yet on CPU2. This will
> cause reading 0 from the rest of cr and cr handling fails.
>
> Changing wmb() to smp_wmb() ensures Y is written after X on both CPU1
> and CPU2. This guarantees that user app can consume cr in right order.
>
> Fixes: b022f59725f0 ("dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling")
> Suggested-by: Nikhil Rao <nikhil.rao@xxxxxxxxx>
> Tested-by: Tony Zhu <tony.zhu@xxxxxxxxx>
> Signed-off-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>
Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> ---
> drivers/dma/idxd/cdev.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
> index 77f8885cf407..9b7388a23cbe 100644
> --- a/drivers/dma/idxd/cdev.c
> +++ b/drivers/dma/idxd/cdev.c
> @@ -681,9 +681,10 @@ int idxd_copy_cr(struct idxd_wq *wq, ioasid_t pasid, unsigned long addr,
> * Ensure that the completion record's status field is written
> * after the rest of the completion record has been written.
> * This ensures that the user receives the correct completion
> - * record information once polling for a non-zero status.
> + * record information on any CPU once polling for a non-zero
> + * status.
> */
> - wmb();
> + smp_wmb();
> status = *(u8 *)cr;
> if (put_user(status, (u8 __user *)addr))
> left += status_size;