Re: Intel QAT on A2SDi-8C-HLN4F causes massive data corruption with dm-crypt + xfs
From: Giovanni Cabiddu
Date: Fri Mar 04 2022 - 12:50:52 EST
On Thu, Mar 03, 2022 at 09:44:53PM +0000, Eric Biggers wrote:
> On Thu, Mar 03, 2022 at 09:24:42PM +0000, Giovanni Cabiddu wrote:
> > On Thu, Mar 03, 2022 at 07:21:33PM +0000, Eric Biggers wrote:
> > > If these algorithms have critical bugs, which it appears they do, then IMO it
> > > would be better to disable them (either stop registering them, or disable the
> > > whole driver) than to leave them available with low cra_priority. Low
> > > cra_priority doesn't guarantee that they aren't used.
> > Thanks for your feedback Eric.
> >
> > Here is a patch that disables the registration of the algorithms in the
> > QAT driver by setting, at config time, the number of HW queues (aka
> > instances) to zero.
> >
> > ---8<---
> > From: Giovanni Cabiddu <giovanni.cabiddu@xxxxxxxxx>
> > Subject: [PATCH] crypto: qat - disable registration of algorithms
> > Organization: Intel Research and Development Ireland Ltd - Co. Reg. #308263 - Collinstown Industrial Park, Leixlip, County Kildare - Ireland
> >
> > The implementations of aead and skcipher in the QAT driver do not
> > properly support requests with the CRYPTO_TFM_REQ_MAY_BACKLOG flag set.
> > If the HW queue is full, the driver returns -EBUSY but does not enqueue
> > the request.
> > This can result in applications like dm-crypt waiting indefinitely for a
> > completion of a request that was never submitted to the hardware.
> >
> > To avoid this problem, disable the registration of all skcipher and aead
> > implementations in the QAT driver by setting the number of crypto
> > instances to 0 at configuration time.
> >
> > This patch deviates from the original upstream solution, which prevents
> > dm-crypt from using drivers registered with the flag
> > CRYPTO_ALG_ALLOCATES_MEMORY, since a backport of that set to stable
> > kernels may have too wide an effect.
> >
> > commit 7bcb2c99f8ed032cfb3f5596b4dccac6b1f501df upstream
> > commit 2eb27c11937ee9984c04b75d213a737291c5f58c upstream
> > commit fbb6cda44190d72aa5199d728797aabc6d2ed816 upstream
> > commit b8aa7dc5c7535f9abfca4bceb0ade9ee10cf5f54 upstream
> > commit cd74693870fb748d812867ba49af733d689a3604 upstream
> >
> > Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@xxxxxxxxx>
> > ---
> > drivers/crypto/qat/qat_common/qat_crypto.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
>
> Sounds good; is there any reason not to apply this upstream too, though?
> You could revert it later as part of the patch series that fixes the driver.
Makes sense. I'm going to send it upstream and Cc stable as documented
in https://www.kernel.org/doc/html/v4.10/process/stable-kernel-rules.html#option-1
I will then revert this change in the set that fixes the problem.
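
As a rough illustration of the failure mode described in the commit
message above, here is a toy userspace model. It is not the QAT code:
toy_dev, toy_req, RING_SLOTS, BACKLOG_SLOTS and both submit functions
are made-up names for this sketch. It assumes the usual crypto API
convention, as I understand it, that a request with
CRYPTO_TFM_REQ_MAY_BACKLOG which gets -EBUSY has been accepted onto a
backlog and will still be completed later, while a request without the
flag is simply rejected when the queue is full.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS    2	/* hypothetical HW ring depth */
#define BACKLOG_SLOTS 8	/* hypothetical software backlog depth */

/* Toy request: may_backlog stands in for CRYPTO_TFM_REQ_MAY_BACKLOG. */
struct toy_req {
	bool may_backlog;
	bool queued;	/* true once somebody owns the request */
};

/* Toy device: a counter for the HW ring plus a small backlog array. */
struct toy_dev {
	size_t in_flight;
	struct toy_req *backlog[BACKLOG_SLOTS];
	size_t backlog_len;
};

/*
 * Behaviour described in the commit message: when the ring is full the
 * caller gets -EBUSY, but the request is not stored anywhere, so no
 * completion can ever be delivered for it.
 */
static int toy_submit_buggy(struct toy_dev *dev, struct toy_req *req)
{
	if (dev->in_flight == RING_SLOTS)
		return -EBUSY;	/* request silently dropped */

	dev->in_flight++;
	req->queued = true;
	return -EINPROGRESS;
}

/*
 * Behaviour a MAY_BACKLOG caller relies on (assumption of this sketch):
 * -EBUSY is returned only after the request has been put on a backlog.
 */
static int toy_submit_backlog(struct toy_dev *dev, struct toy_req *req)
{
	if (dev->in_flight < RING_SLOTS) {
		dev->in_flight++;
		req->queued = true;
		return -EINPROGRESS;
	}

	/* Ring full: refuse unless the caller allowed backlogging. */
	if (!req->may_backlog || dev->backlog_len == BACKLOG_SLOTS)
		return -ENOSPC;

	dev->backlog[dev->backlog_len++] = req;
	req->queued = true;
	return -EBUSY;
}
```

With a full ring, toy_submit_buggy() returns -EBUSY and leaves
req->queued false, which is the state a caller like dm-crypt would wait
on forever. toy_submit_backlog() returns the same code only once the
request sits on the backlog. A real fix also needs a path that resubmits
backlogged requests when ring slots free up, which this sketch leaves
out.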
Thanks,
--
Giovanni