RE: [EXT] [RFC PATCH] crypto: caam - restore retry count after HW RNG failure

From: Gaurav Jain
Date: Mon Jan 24 2022 - 02:04:53 EST


Hello Petr

> -----Original Message-----
> From: Petr Benes <petrben@xxxxxxxxx>
> Sent: Thursday, January 20, 2022 3:29 PM
> To: Gaurav Jain <gaurav.jain@xxxxxxx>
> Cc: Horia Geanta <horia.geanta@xxxxxxx>; Pankaj Gupta
> <pankaj.gupta@xxxxxxx>; Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>; Michal
> Vokáč <michal.vokac@xxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>;
> Petr Benes <petr.benes@xxxxxxxxx>; l.stach@xxxxxxxxxxxxxx;
> robert.hancock@xxxxxxxxxx; linux-crypto@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Varun Sethi <V.Sethi@xxxxxxx>; Vabhav Sharma
> <vabhav.sharma@xxxxxxx>
> Subject: Re: [EXT] [RFC PATCH] crypto: caam - restore retry count after HW RNG
> failure
>
> Caution: EXT Email
>
> Hi Gaurav,
>
> I've tested 3 i.MX6DL devices over ~day and got these results (decimal values).
>
> RTMCTL: 12289 all cases
>
> RTSTATUS:
>
> device A
> 1 x 49152
> 10 x 32768
>
> device B
> 13 x 32768
> 1 x 49152
> 1 x 1024
> 1 x 2048
>
> device C
> 1 x 32768
>

Thank you Petr for the register dump. Looks like statistical tests are failing.
We are internally looking into TRNG configuration which could cause statistical test failure.

Can you confirm if you have reproduced the problem with the patches suggested in Uboot?
Link for the same is http://patchwork.ozlabs.org/project/uboot/list/?series=280725
With this patch series TRNG is configured in uboot.

Regards
Gaurav Jain

> Regards,
> Petr
>
> On Thu, 13 Jan 2022 at 08:23, Gaurav Jain <gaurav.jain@xxxxxxx> wrote:
> >
> > Hello Michal
> >
> > >
> > > -----Original Message-----
> > > From: Michal Vokáč <michal.vokac@xxxxxxxxx>
> > > Sent: Friday, November 12, 2021 8:57 PM
> > > To: Horia Geanta <horia.geanta@xxxxxxx>; Pankaj Gupta
> > > <pankaj.gupta@xxxxxxx>; Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> > > Cc: David S. Miller <davem@xxxxxxxxxxxxx>;
> > > linux-crypto@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > l.stach@xxxxxxxxxxxxxx; robert.hancock@xxxxxxxxxx; Petr Benes
> > > <petr.benes@xxxxxxxxx>; petrben@xxxxxxxxx; Michal Vokáč
> > > <michal.vokac@xxxxxxxxx>
> > > Subject: [EXT] [RFC PATCH] crypto: caam - restore retry count after
> > > HW RNG failure
> > >
> > > Caution: EXT Email
> > >
> > > From: Petr Benes <petr.benes@xxxxxxxxx>
> > >
> > > Each time TRNG generates entropy, statistical tests are run.
> > > If they fail, RETRY_COUNT value is decremented. Once it reaches 0,
> > > HW RNG returns an error, and needs to be reset.
> >
> > The least-significant 16 bits of the RTSTATUS register reflect the result of each
> of statistical tests.
> > Can you dump RTSTATUS to see which test has failed? Please dump RTMCTL as
> well.
> >
> > Have you tried to build this patch for armv8? For me it is failing.
> >
> > Regards
> > Gaurav Jain
> >
> > > RETRY_COUNT could be programmed in RTSCMISC register and is set to 1
> > > by default. Hence, we are left without hwrng after the first error,
> > > which could happen even under normal conditions.
> > >
> > > Cc: petrben@xxxxxxxxx
> > > Signed-off-by: Petr Benes <petr.benes@xxxxxxxxx>
> > > Signed-off-by: Michal Vokáč <michal.vokac@xxxxxxxxx>
> > > ---
> > > Hi,
> > > we are also experiencing this issue:
> > >
> > > caam_jr 2101000.jr0: 20003c5b: CCB: desc idx 60: RNG: Hardware error.
> > >
> > > It is happening on both i.MX6S and i.MX6DL SoCs we use.
> > > On Solo I can reproduce it really fast. Sometimes it happens right
> > > after the board is NFS booted, sometimes I need to stress the HWRNG
> > > for a while (generate few hundred KBs of random data). On some
> > > DualLite SoCs it is happening at least once a day.
> > >
> > > We are using the v5.10 LTS branch but I can confirm that this is
> > > happening on all kernels since v5.7 to the latest linux-next.
> > >
> > > We also tried to increase the RTSDCTL_ENT_DLY_MIN delay as suggested
> > > in this thread [1]. It helped and the issue never occurred since
> > > then but we are looking for more universal and permanent solution
> > > suitable for upstream, hence we came up with this patch.
> > >
> > > Any comments will be appreciated.
> > > Thanks, Michal
> > >
> > > [1]
> > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flk
> > >
> ml.org%2F&amp;data=04%7C01%7Cgaurav.jain%40nxp.com%7Cad793bd423a6
> 4f2
> > >
> b03d408d9dbfb842f%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6
> 3778
> > >
> 2695582123236%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQ
> IjoiV2
> > >
> luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=PlPYDNHg9%2
> BCL
> > > iZrcdmOx%2FPdbAZmFpPZFvLe1jR4YhEI%3D&amp;reserved=0
> > > %2Flkml%2F2021%2F8%2F30%2F296&amp;data=04%7C01%7Cpankaj.gupta
> %40
> > >
> nxp.com%7C5d6bf460b260415aeda008d9a5f0ffab%7C686ea1d3bc2b4c6fa92cd
> > >
> 99c5c301635%7C0%7C0%7C637723276775699794%7CUnknown%7CTWFpbGZs
> > >
> b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0
> > > %3D%7C1000&amp;sdata=VxN9MqDFbISptW3OX9XTtZ%2FwQTsEbF6dETxX
> B%2
> > > BHrywg%3D&amp;reserved=0
> > >
> > > drivers/crypto/caam/caamrng.c | 42
> ++++++++++++++++++++++++++++++++---
> > > drivers/crypto/caam/ctrl.c | 13 +++++++++++
> > > drivers/crypto/caam/ctrl.h | 2 ++
> > > 3 files changed, 54 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/crypto/caam/caamrng.c
> > > b/drivers/crypto/caam/caamrng.c index 77d048dfe5d0..2be5584ae591
> > > 100644
> > > --- a/drivers/crypto/caam/caamrng.c
> > > +++ b/drivers/crypto/caam/caamrng.c
> > > @@ -21,6 +21,7 @@
> > > #include "desc_constr.h"
> > > #include "jr.h"
> > > #include "error.h"
> > > +#include "ctrl.h"
> > >
> > > #define CAAM_RNG_MAX_FIFO_STORE_SIZE 16
> > >
> > > @@ -113,6 +114,35 @@ static int caam_rng_read_one(struct device *jrdev,
> > > return err ?: (ret ?: len);
> > > }
> > >
> > > +static void caam_rng_retry_reset(struct caam_rng_ctx *context) {
> > > + struct device *ctrldev = context->ctrldev;
> > > + struct caam_drv_private *ctrlpriv = dev_get_drvdata(ctrldev);
> > > + struct caam_ctrl __iomem *ctrl;
> > > + struct rng4tst __iomem *r4tst;
> > > + u32 __iomem *rtstatus;
> > > + u32 retry_count;
> > > +
> > > + ctrl = (struct caam_ctrl __iomem *)ctrlpriv->ctrl;
> > > + r4tst = &ctrl->r4tst[0];
> > > +
> > > + /*
> > > + * There is unfortunately no member for RTSTATUS register in
> > > + * struct rng4tst and the structure doesn't look stable
> > > + */
> > > + rtstatus = (u32 *)((char *)&ctrl->r4tst[0] + 0x3C);
> > > + retry_count = (rd_reg32(rtstatus) >> 16) & 0xf;
> > > + dev_dbg(ctrldev, "CAAM RNG retry count %d\n", retry_count);
> > > + if (retry_count == 0) {
> > > + dev_err(ctrldev, "CAAM RNG resetting retry count to 1\n");
> > > + clrsetbits_32(&r4tst->rtmctl, 0, RTMCTL_PRGM | RTMCTL_ACC);
> > > + wr_reg32(&r4tst->rtscmisc,
> > > + (rd_reg32(&r4tst->rtscmisc) & 0x7f) | (1 <<
> > > 16));
> > > + clrsetbits_32(&r4tst->rtmctl, RTMCTL_PRGM | RTMCTL_ACC,
> > > + RTMCTL_SAMP_MODE_RAW_ES_SC);
> > > + caam_reinstantiate_rng(ctrldev);
> > > + }
> > > +}
> > > +
> > > static void caam_rng_fill_async(struct caam_rng_ctx *ctx) {
> > > struct scatterlist sg[1];
> > > @@ -129,8 +159,10 @@ static void caam_rng_fill_async(struct
> > > caam_rng_ctx
> > > *ctx)
> > > sg[0].length,
> > > ctx->desc_async,
> > > &done);
> > > - if (len < 0)
> > > + if (len < 0) {
> > > + caam_rng_retry_reset(ctx);
> > > return;
> > > + }
> > >
> > > kfifo_dma_in_finish(&ctx->fifo, len); } @@ -145,13 +177,17
> > > @@ static void caam_rng_worker(struct work_struct *work) static int
> > > caam_read(struct hwrng *rng, void *dst, size_t max, bool wait) {
> > > struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
> > > - int out;
> > > + int out, ret;
> > >
> > > if (wait) {
> > > struct completion done;
> > >
> > > - return caam_rng_read_one(ctx->jrdev, dst, max,
> > > + ret = caam_rng_read_one(ctx->jrdev, dst, max,
> > > ctx->desc_sync, &done);
> > > + if (ret < 0)
> > > + caam_rng_retry_reset(ctx);
> > > +
> > > + return ret;
> > > }
> > >
> > > out = kfifo_out(&ctx->fifo, dst, max); diff --git
> > > a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c index
> > > ca0361b2dbb0..e421f8d1982b 100644
> > > --- a/drivers/crypto/caam/ctrl.c
> > > +++ b/drivers/crypto/caam/ctrl.c
> > > @@ -339,6 +339,19 @@ static int instantiate_rng(struct device
> > > *ctrldev, int state_handle_mask,
> > > return devm_add_action_or_reset(ctrldev,
> > > devm_deinstantiate_rng, ctrldev); }
> > >
> > > +/*
> > > + * caam_reinstantiate_rng - reinstantiates RNG. Intended for a case
> > > +when RNG
> > > falls into
> > > + * HW error condition. That happens if TRNG fails statistical
> > > + * check and RTY_CNT value set in RTSCMISC decrements to
> zero.
> > > + * It is exported to caamrng.c
> > > + * @ctrldev - pointer to device
> > > + */
> > > +
> > > +int caam_reinstantiate_rng(struct device *ctrldev) {
> > > + return instantiate_rng(ctrldev, 0, 0); }
> > > +
> > > /*
> > > * kick_trng - sets the various parameters for enabling the initialization
> > > * of the RNG4 block in CAAM
> > > diff --git a/drivers/crypto/caam/ctrl.h b/drivers/crypto/caam/ctrl.h
> > > index
> > > f3ecd67922a7..26ff5a49a865 100644
> > > --- a/drivers/crypto/caam/ctrl.h
> > > +++ b/drivers/crypto/caam/ctrl.h
> > > @@ -8,6 +8,8 @@
> > > #ifndef CTRL_H
> > > #define CTRL_H
> > >
> > > +int caam_reinstantiate_rng(struct device *ctrldev);
> > > +
> > > /* Prototypes for backend-level services exposed to APIs */ extern
> > > bool caam_dpaa2;
> > >
> > > --
> > > 2.25.1
> >