Re: [PATCH] crypto: caam - use the same jr for rng init/exit
From: Andrey Smirnov
Date: Wed Sep 18 2019 - 02:01:27 EST
On Wed, Sep 11, 2019 at 2:35 AM Horia Geanta <horia.geanta@xxxxxxx> wrote:
>
> On 9/4/2019 5:35 AM, Andrey Smirnov wrote:
> > In order to allow caam_jr_enqueue() to lock underlying JR's
> > device (via device_lock(), see commit that follows) we need to make
> > sure that no code calls caam_jr_enqueue() as a part of caam_jr_probe()
> > to avoid a deadlock. Unfortunately, current implementation of caamrng
> > code does exactly that in caam_init_buf().
> >
I don't think your patch addresses the above, so it can probably be
cut out of the message.
> > Another big problem with original caamrng initialization is a
> > circular reference in the form of:
> >
> > 1. caam_rng_init() aquires JR via caam_jr_alloc(). Freed only by
> > caam_rng_exit()
> >
> > 2. caam_rng_exit() is only called by unregister_algs() once last JR
> > is shut down
> >
> > 3. caam_jr_remove() won't call unregister_algs() for last JR
> > until tfm_count reaches zero, which can only happen via
> > unregister_algs() -> caam_rng_exit() call chain.
> >
> > To avoid all of that, convert caamrng code to a platform device driver
> > and extend caam_probe() to create corresponding platform device.
> >
> > Additionally, this change also allows us to remove any access to
> > struct caam_drv_private in caamrng.c as well as simplify resource
> > ownership/deallocation via devres.
> >
> I would avoid adding platform devices that don't have
> corresponding DT nodes.
>
> There's some generic advice here:
> https://www.kernel.org/doc/html/latest/driver-api/driver-model/platform.html#legacy-drivers-device-probing
>
> and there's also previous experience in caam driver:
> 6b175685b4a1 crypto: caam/qi - don't allocate an extra platform device
>
Hmm, I am not sure how that experience relates to the case we have
with hwrng, but OK, I'm going to assume that platform driver approach
is a no-go.
> The issue in caamrng is actually that caam/jr driver (jr.c) tries to call
> caam_rng_exit() on the last available jr device.
> Instead, caam_rng_exit() must be called on the same jr device that
> was used during caam_rng_init().
>
> Otherwise, by the time:
>
> void caam_rng_exit(void)
> {
> if (!init_done)
> return;
>
> caam_jr_free(rng_ctx->jrdev);
> hwrng_unregister(&caam_rng);
> [...]
>
> is executed, rng_ctx->jrdev might have been removed.
>
> This will cause an oops in caam_jr_free().
> caam_cleanup() - .cleanup hwrng callback that is called when doing
> hwrng_unregister() - also needs to be executed on that jr device.
>
> The problem can be easily reproduced as follows.
>
> If caamrng was initialized on jr0:
> [...]
> caam_jr 2101000.jr0: registering rng-caam
> [...]
>
> then by manually unbinding jr0 first, then jr1 (using i.MX6SX):
> # echo -n "2101000.jr0" > /sys/bus/platform/drivers/caam_jr/
> # echo -n "2102000.jr1" > /sys/bus/platform/drivers/caam_jr/
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000040
> pgd = 572e14e7
> [00000040] *pgd=be2e8831
> Internal error: Oops: 17 [#1] SMP ARM
> Modules linked in:
> CPU: 0 PID: 629 Comm: sh Not tainted 5.3.0-rc1-00299-g8e2a2738e5d3-dirty #8
> Hardware name: Freescale i.MX6 SoloX (Device Tree)
> PC is at caam_jr_free+0xc/0x24
> LR is at caam_rng_exit+0x20/0x3c
> pc : [<c08aef20>] lr : [<c08bab1c>] psr: 200f0013
> sp : e9669e68 ip : 00001488 fp : 00000000
> r10: 00000000 r9 : e9669f80 r8 : e9284010
> r7 : e872d410 r6 : e872d410 r5 : e872d400 r4 : c1aa7cd8
> r3 : 00000000 r2 : 00000040 r1 : 00000000 r0 : e872d010
> Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> Control: 10c5387d Table: a969004a DAC: 00000051
> Process sh (pid: 629, stack limit = 0xfc1b6e94)
> Stack: (0xe9669e68 to 0xe966a000)
> 9e60: e865c940 c08af7dc e872d410 e872d410 c13a9cec c06b223c
> 9e80: c06b2218 e872d410 e81a9410 c06b08dc c13806f0 0000000b c13a9cec c06aeaf8
> 9ea0: 0000000b 00000000 0000000b e9284000 e91189c0 c0318c3c 00000000 00000000
> 9ec0: e95ddbd0 e8843500 c1308b08 c6614c9f e8843500 00438340 e9668000 00000004
> 9ee0: 00000000 c0285e00 00000001 00000000 00000000 c0288a44 00000000 00000000
> 9f00: c1308b28 00000000 00000001 c130911c 00000001 c13e81d1 c0288a44 00000000
> 9f20: e8ed9800 c019df00 e8ed9a7c c028ac08 00000001 00000000 c0288a44 c6614c9f
> 9f40: c1308b08 0000000b 00438340 e9669f80 e8843500 00438340 e9668000 c028899c
> 9f60: e95ddbc0 c6614c9f e8843500 e8843500 c1308b08 0000000b 00438340 c0288bfc
> 9f80: 00000000 00000000 00000000 c6614c9f 0000000b 00438340 b6ef1da0 00000004
> 9fa0: c01011c4 c0101000 0000000b 00438340 00000001 00438340 0000000b 00000000
> 9fc0: 0000000b 00438340 b6ef1da0 00000004 00438340 0000000b 00000000 00000000
> 9fe0: 0000006c bea7f908 b6e19e58 b6e7325c 600f0010 00000001 00000000 00000000
> [<c08aef20>] (caam_jr_free) from [<c08bab1c>] (caam_rng_exit+0x20/0x3c)
> [<c08bab1c>] (caam_rng_exit) from [<c08af7dc>] (caam_jr_remove+0x38/0xc0)
> [<c08af7dc>] (caam_jr_remove) from [<c06b223c>] (platform_drv_remove+0x24/0x3c)
> [<c06b223c>] (platform_drv_remove) from [<c06b08dc>] (device_release_driver_internal+0xdc/0x1a0)
> [<c06b08dc>] (device_release_driver_internal) from [<c06aeaf8>] (unbind_store+0x5c/0xcc)
> [<c06aeaf8>] (unbind_store) from [<c0318c3c>] (kernfs_fop_write+0xfc/0x1e0)
> [<c0318c3c>] (kernfs_fop_write) from [<c0285e00>] (__vfs_write+0x2c/0x1d0)
> [<c0285e00>] (__vfs_write) from [<c028899c>] (vfs_write+0xa0/0x180)
> [<c028899c>] (vfs_write) from [<c0288bfc>] (ksys_write+0x5c/0xdc)
> [<c0288bfc>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
> Exception stack(0xe9669fa8 to 0xe9669ff0)
> 9fa0: 0000000b 00438340 00000001 00438340 0000000b 00000000
> 9fc0: 0000000b 00438340 b6ef1da0 00000004 00438340 0000000b 00000000 00000000
> 9fe0: 0000006c bea7f908 b6e19e58 b6e7325c
> Code: eaffff49 e5903040 e2832040 f5d2f000 (e1921f9f)
>
> Thus, I'd say the following fix is needed:
>
> -- >8 --
>
> When caam_rng_exit() executes, the jr device that was used
> during caam_rng_init() must be available.
>
> This means that current scheme - where caam_rng_exit() is called
> when last jr device is removed - is incorrect.
> Instead, caam_rng_exit() has to run when the jr acquired
> during caam_rng_init() is removed.
>
> Fixes: 1b46c90c8e00 ("crypto: caam - convert top level drivers to libraries")
> Signed-off-by: Horia GeantÄ <horia.geanta@xxxxxxx>
>
> diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
> index e8baacaabe07..ec40178fa688 100644
> --- a/drivers/crypto/caam/caamrng.c
> +++ b/drivers/crypto/caam/caamrng.c
> @@ -300,9 +300,9 @@ static struct hwrng caam_rng = {
> .read = caam_read,
> };
>
> -void caam_rng_exit(void)
> +void caam_rng_exit(struct device *jrdev)
> {
> - if (!init_done)
> + if (!init_done || jrdev != rng_ctx->jrdev)
> return;
>
> caam_jr_free(rng_ctx->jrdev);
> diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h
> index 731b06becd9c..4795530203ad 100644
> --- a/drivers/crypto/caam/intern.h
> +++ b/drivers/crypto/caam/intern.h
> @@ -165,7 +165,7 @@ static inline void caam_pkc_exit(void)
> #ifdef CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API
>
> int caam_rng_init(struct device *dev);
> -void caam_rng_exit(void);
> +void caam_rng_exit(struct device *jrdev);
>
> #else
>
> @@ -174,7 +174,7 @@ static inline int caam_rng_init(struct device *dev)
> return 0;
> }
>
> -static inline void caam_rng_exit(void)
> +static inline void caam_rng_exit(struct device *jrdev)
> {
> }
>
> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index d11956bc358f..61aea11773a6 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -53,7 +53,6 @@ static void unregister_algs(void)
>
> caam_qi_algapi_exit();
>
> - caam_rng_exit();
> caam_pkc_exit();
> caam_algapi_hash_exit();
> caam_algapi_exit();
> @@ -126,6 +125,8 @@ static int caam_jr_remove(struct platform_device *pdev)
> jrdev = &pdev->dev;
> jrpriv = dev_get_drvdata(jrdev);
>
> + caam_rng_exit(jrdev);
> +
> /*
> * Return EBUSY if job ring already allocated.
> */