Re: [PATCH RFC v11 00/12] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O

From: Bartosz Golaszewski

Date: Thu Mar 05 2026 - 08:15:10 EST


On Thu, Mar 5, 2026 at 1:00 PM Stephan Gerhold
<stephan.gerhold@xxxxxxxxxx> wrote:
>
> On Tue, Mar 03, 2026 at 06:13:56PM +0530, Manivannan Sadhasivam wrote:
> > On Mon, Mar 02, 2026 at 04:57:13PM +0100, Bartosz Golaszewski wrote:
> > > NOTE: Please note that even though this is version 11, I changed the
> > > prefix to RFC as this is an entirely new approach resulting from
> > > discussions under v9. I AM AWARE of the existing memory leaks in the
> > > last patch of this series - I'm sending it because I want to first
> > > discuss the approach and get a green light from Vinod as well as Mani
> > > and Bjorn. Especially when it comes to communicating the address for the
> > > dummy writes from the client to the BAM driver.
> > > /NOTE
> > >
> > > Currently the QCE crypto driver accesses the crypto engine registers
> > > directly via CPU. TrustZone may perform crypto operations simultaneously
> > > resulting in a race condition. To remedy that, let's introduce support
> > > for BAM locking/unlocking to the driver. The BAM driver will now wrap
> > > any existing issued descriptor chains with additional descriptors
> > > performing the locking when the client starts the transaction
> > > (dmaengine_issue_pending()). The client wanting to profit from locking
> > > needs to switch to performing register I/O over DMA and communicate the
> > > address to which to perform the dummy writes via a call to
> > > dmaengine_slave_config().
> > >
> >
> > Thanks for moving the LOCK/UNLOCK bits out of client to the BAM driver. It looks
> > neat now. I understand the limitation that for LOCK/UNLOCK, BAM needs to perform
> > a dummy write to an address in the client register space. So in this case, you
> > can also use the previous metadata approach to pass the scratchpad register to
> > the BAM driver from clients. The BAM driver can use this register to perform
> > LOCK/UNLOCK.
> >
> > It may sound like I'm suggesting a part of your previous design, but it fits the
> > design more cleanly IMO. The BAM performs LOCK/UNLOCK on its own, but it gets
> > the scratchpad register address from the clients through the metadata once.
> >
> > It is very unfortunate that the IP doesn't accept '0' address for LOCK/UNLOCK or
> > some of them cannot append LOCK/UNLOCK to the actual CMD descriptors passed from
> > the clients. These would've made the code/design even cleaner.
> >
>
> I was staring at the downstream drivers for QCE (qce50.c?) [1] for a bit
> and my impression is that they manage to get along without dummy writes.
> It's a big mess, but it looks like they always have some commands
> (depending on the crypto operation) that they are sending anyway and
> they just assign the LOCK/UNLOCK flag to the command descriptor of that.
>
> It is similar for the second relevant user of the LOCK/UNLOCK flags, the
> QPIC NAND driver (msm_qpic_nand.c in downstream [2], qcom_nandc.c in
> mainline), it is assigned as part of the register programming sequence
> instead of using a dummy write. In addition, the UNLOCK flag is
> sometimes assigned to a READ command descriptor rather than a WRITE.
>
> @Bartosz: Can we get by without doing any dummy writes?
> If not, would a dummy read perhaps be less intrusive than a dummy write?
>

The HPG says that the LOCK/UNLOCK flag *must* be set on a command
descriptor, not a data descriptor. For a simple encryption we will
typically have a data descriptor and a command descriptor with
register writes. So we need a command descriptor in front of the data
and - while we could technically set the UNLOCK bit on the subsequent
command descriptor - it's unclear from the HPG whether it will unlock
before or after processing the command descriptor with the UNLOCK bit
set. Hence the additional command descriptor at the end.
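
To make the ordering above concrete, here is a rough standalone model of
the wrapping. It is only a sketch, not the code from the series: the
model_* names, the struct layout and the flag values are all made up for
illustration and do not reflect the real BAM descriptor encoding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values for this model only, NOT the hardware encoding. */
enum model_desc_flags {
	MODEL_FLAG_CMD    = 1u << 0,	/* command descriptor (register I/O) */
	MODEL_FLAG_LOCK   = 1u << 1,	/* take the pipe lock */
	MODEL_FLAG_UNLOCK = 1u << 2,	/* release the pipe lock */
};

/* Hypothetical, simplified descriptor: target address plus flags. */
struct model_desc {
	uint32_t addr;
	uint32_t flags;
};

/*
 * Wrap a client chain as: [LOCK cmd] + client descriptors + [UNLOCK cmd].
 * Both extra descriptors are dummy command descriptors targeting
 * scratch_addr. Returns the number of descriptors written to out, or 0
 * if out is too small.
 */
static size_t model_wrap_chain(const struct model_desc *in, size_t n_in,
			       uint32_t scratch_addr,
			       struct model_desc *out, size_t out_cap)
{
	size_t i, n = 0;

	if (out_cap < n_in + 2)
		return 0;

	out[n++] = (struct model_desc){ scratch_addr,
					MODEL_FLAG_CMD | MODEL_FLAG_LOCK };
	for (i = 0; i < n_in; i++)
		out[n++] = in[i];
	out[n++] = (struct model_desc){ scratch_addr,
					MODEL_FLAG_CMD | MODEL_FLAG_UNLOCK };

	return n;
}

/* Check the rule that LOCK/UNLOCK may only sit on command descriptors. */
static bool model_chain_is_valid(const struct model_desc *d, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t lk = d[i].flags &
			      (MODEL_FLAG_LOCK | MODEL_FLAG_UNLOCK);

		if (lk && !(d[i].flags & MODEL_FLAG_CMD))
			return false;
	}

	return true;
}
```

With a client chain of one data descriptor followed by one command
descriptor, this yields four descriptors: the first and last are the
dummy command descriptors carrying LOCK and UNLOCK, and the client's own
descriptors sit unchanged in between.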

The HPG also only mentions a write command and says nothing about a
read. In any case: that's the least of the problems as switching to
read doesn't solve the issue of passing the address of the scratchpad
register.

So while some of this *may* just work, I would prefer to stick to what
the documentation says *will* work. :)

Bartosz