Re: [PATCH 12/12] crypto: caam - change JR device ownership scheme

From: Leonard Crestez
Date: Fri Sep 13 2019 - 15:16:22 EST


On 04.09.2019 05:35, Andrey Smirnov wrote:
> Returning -EBUSY from platform device's .remove() callback won't stop
> the removal process, so the code in caam_jr_remove() is not going to
> have the desired effect of preventing JR from being removed.
>
> In order to be able to deal with removal of the JR device, change the
> code as follows:
>
> 1. To make sure that underlying struct device remains valid for as
> long as we have a reference to it, add appropriate device
> refcount management to caam_jr_alloc() and caam_jr_free()
>
> 2. To make sure that device removal doesn't happen in parallel to
> use using the device in caam_jr_enqueue() augment the latter to
> acquire/release device lock for the duration of the subroutine
>
> 3. In order to handle the case when caam_jr_enqueue() is executed
> right after corresponding caam_jr_remove(), add code to check
> that driver data has not been set to NULL.

What happens if a driver is unbound while a transform is executing
asynchronously?

Doing get_device and put_device in caam_jr_alloc and caam_jr_free
doesn't protect you against an explicit unbind of caam_jr, that path
doesn't care about device reference counts. For example on imx6qp:

# echo 2102000.jr1 > /sys/bus/platform/drivers/caam_jr/unbind

CPU: 2 PID: 561 Comm: bash Not tainted 5.3.0-rc7-next-20190904
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Backtrace:
[<c010f8a4>] (dump_backtrace) from [<c010fc5c>] (show_stack+0x20/0x24)
[<c010fc3c>] (show_stack) from [<c0d21600>] (dump_stack+0xdc/0x114)
[<c0d21524>] (dump_stack) from [<c0955774>] (caam_jr_remove+0x24/0xf8)
[<c0955750>] (caam_jr_remove) from [<c06e2d98>]
(platform_drv_remove+0x34/0x4c)
[<c06e2d64>] (platform_drv_remove) from [<c06e0b74>]
(device_release_driver_internal+0xf8/0x1b0)
[<c06e0a7c>] (device_release_driver_internal) from [<c06e0c70>]
(device_driver_detach+0x20/0x24)
[<c06e0c50>] (device_driver_detach) from [<c06de5a4>]
(unbind_store+0x6c/0xe0)
[<c06de538>] (unbind_store) from [<c06dd950>] (drv_attr_store+0x30/0x3c)
[<c06dd920>] (drv_attr_store) from [<c0364c18>] (sysfs_kf_write+0x50/0x5c)
[<c0364bc8>] (sysfs_kf_write) from [<c0363e0c>]
(kernfs_fop_write+0x10c/0x1f8)
[<c0363d00>] (kernfs_fop_write) from [<c02c3a30>] (__vfs_write+0x44/0x1d0)
[<c02c39ec>] (__vfs_write) from [<c02c68ec>] (vfs_write+0xb0/0x194)
[<c02c683c>] (vfs_write) from [<c02c6b7c>] (ksys_write+0x64/0xd8)
[<c02c6b18>] (ksys_write) from [<c02c6c08>] (sys_write+0x18/0x1c)
[<c02c6bf0>] (sys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)

I think you need to do some form of slow wait loop in jrpriv until
jrpriv->tfm_count reaches zero. Preventing new operations from starting
would help but isn't strictly necessary for correctness.

> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index 47b389cb1c62..8a30bbd7f2aa 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -124,14 +124,6 @@ static int caam_jr_remove(struct platform_device *pdev)
> jrdev = &pdev->dev;
> jrpriv = dev_get_drvdata(jrdev);
>
> - /*
> - * Return EBUSY if job ring already allocated.
> - */
> - if (atomic_read(&jrpriv->tfm_count)) {
> - dev_err(jrdev, "Device is busy\n");
> - return -EBUSY;
> - }
> -
> /* Unregister JR-based RNG & crypto algorithms */
> unregister_algs();
>
> @@ -300,7 +292,7 @@ struct device *caam_jr_alloc(void)
>
> if (min_jrpriv) {
> atomic_inc(&min_jrpriv->tfm_count);
> - dev = min_jrpriv->dev;
> + dev = get_device(min_jrpriv->dev);
> }
> spin_unlock(&driver_data.jr_alloc_lock);
>
> @@ -318,13 +310,16 @@ void caam_jr_free(struct device *rdev)
> struct caam_drv_private_jr *jrpriv = dev_get_drvdata(rdev);
>
> atomic_dec(&jrpriv->tfm_count);
> + put_device(rdev);
> }
> EXPORT_SYMBOL(caam_jr_free);
>
> /**
> * caam_jr_enqueue() - Enqueue a job descriptor head. Returns 0 if OK,
> * -EBUSY if the queue is full, -EIO if it cannot map the caller's
> - * descriptor.
> + * descriptor, -ENODEV if given device was removed and is no longer
> + * valid
> + *
> * @dev: device of the job ring to be used. This device should have
> * been assigned prior by caam_jr_register().
> * @desc: points to a job descriptor that execute our request. All
> @@ -354,15 +349,32 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
> u32 status, void *areq),
> void *areq)
> {
> - struct caam_drv_private_jr *jrp = dev_get_drvdata(dev);
> + struct caam_drv_private_jr *jrp;
> struct caam_jrentry_info *head_entry;
> int head, tail, desc_size;
> dma_addr_t desc_dma;
>
> + /*
> + * Lock the device to prevent it from being removed while we
> + * are using it
> + */
> + device_lock(dev);
> +
> + /*
> + * If driver data is NULL, it is very likely that this device
> + * was removed already. Nothing we can do here but bail out.
> + */
> + jrp = dev_get_drvdata(dev);
> + if (!jrp) {
> + device_unlock(dev);
> + return -ENODEV;
> + }
> +
> desc_size = (caam32_to_cpu(*desc) & HDR_JD_LENGTH_MASK) * sizeof(u32);
> desc_dma = dma_map_single(dev, desc, desc_size, DMA_TO_DEVICE);
> if (dma_mapping_error(dev, desc_dma)) {
> dev_err(dev, "caam_jr_enqueue(): can't map jobdesc\n");
> + device_unlock(dev);
> return -EIO;
> }
>
> @@ -375,6 +387,7 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
> CIRC_SPACE(head, tail, JOBR_DEPTH) <= 0) {
> spin_unlock_bh(&jrp->inplock);
> dma_unmap_single(dev, desc_dma, desc_size, DMA_TO_DEVICE);
> + device_unlock(dev);
> return -EBUSY;
> }
>
> @@ -411,6 +424,7 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
> jrp->inpring_avail = rd_reg32(&jrp->rregs->inpring_avail);
>
> spin_unlock_bh(&jrp->inplock);
> + device_unlock(dev);
>
> return 0;
> }