Re: [tpmdd-devel] [PATCH] tpm: fix cacheline alignment for DMA-able buffers

From: Jarkko Sakkinen
Date: Wed Aug 10 2016 - 16:38:34 EST


On Tue, Aug 09, 2016 at 08:18:00AM -0700, Dmitry Torokhov wrote:
> On Tue, Aug 9, 2016 at 8:01 AM, Jarkko Sakkinen
> <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, Aug 09, 2016 at 12:46:10PM +0300, Jarkko Sakkinen wrote:
> > On Fri, Jul 29, 2016 at 10:30:22AM -0700, Dmitry Torokhov wrote:
> > >    On Fri, Jul 29, 2016 at 10:27 AM, Jason Gunthorpe
> > >    <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > >      On Thu, Jul 28, 2016 at 07:59:13PM -0700, Andrey Pronin wrote:
> > >      > Annotate buffers used in spi transactions as ____cacheline_aligned
> > >      > to use in DMA transfers.
> > >      >
> > >      > Signed-off-by: Andrey Pronin <apronin@xxxxxxxxxxxx>
> > >      >  drivers/char/tpm/st33zp24/spi.c | 4 ++--
> > >      >  drivers/char/tpm/tpm_tis_spi.c  | 4 ++--
> > >      >  2 files changed, 4 insertions(+), 4 deletions(-)
> > >      >
> > >      > diff --git a/drivers/char/tpm/st33zp24/spi.c b/drivers/char/tpm/st33zp24/spi.c
> > >      > index 9f5a011..0e9aad9 100644
> > >      > +++ b/drivers/char/tpm/st33zp24/spi.c
> > >      > @@ -70,8 +70,8 @@
> > >      >  struct st33zp24_spi_phy {
> > >      >       struct spi_device *spi_device;
> > >      >
> > >      > -     u8 tx_buf[ST33ZP24_SPI_BUFFER_SIZE];
> > >      > -     u8 rx_buf[ST33ZP24_SPI_BUFFER_SIZE];
> > >      > +     u8 tx_buf[ST33ZP24_SPI_BUFFER_SIZE] ____cacheline_aligned;
> > >      > +     u8 rx_buf[ST33ZP24_SPI_BUFFER_SIZE] ____cacheline_aligned;
> > >      >
> > >      >       int io_lpcpd;
> > >      >       int latency;
> > >
> > >      Hurm, this still looks wrong to me. Aligning the start of buffers is
> > >      not enough, the DMA'able space must also end on a cache line as well.
> > >
> > >      So, the buffers must also always be placed at the end of the struct.
> > >
> > >      IMHO It would be cleaner and safer to always kmalloc the DMA buffer
> > >      alone than to try and optimize like this.
> > >
> > >    In this case moving them to the end of the structure and commenting why
> > >    they have to be at the end might be less invasive change. More
> > >    performance-efficient and resilient in low memory situations too.
> >
> > kmallocs would be done in the driver initialization:
> >
> > * you rarely are in low memory situation
> > * performance gain/loss is insignificant
> >
> > I really don't see your point.
>
> I'm fine having them at the end of the structure mainly for simplicity
> reasons but those arguments just didn't hold at all.
>
> Well, the main reason was simplicity and invasiveness of the change.
> But I still maintain that doing 3 memory allocations instead of 1 is less
> performant and puts more pressure on the kernel. Yes, it is at bind time,
> but you do not have to do 3 times work when one allocation will suffice.
> Also, driver binding does not necessarily happen at boot time. I can
> always unbind and rebind the driver or reload the module.

I'm fine with either approach.
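
For reference, the layout concern raised above can be sketched in
userspace C. This is only an illustration under stated assumptions, not
the driver code: CACHELINE, BUF_SIZE, the cacheline_aligned macro and
the struct and helper names are all hypothetical stand-ins (the kernel
uses ____cacheline_aligned and ST33ZP24_SPI_BUFFER_SIZE, and the real
cache line size depends on the architecture). It shows that with the
buffers in the middle of the struct the next field lands in the same
cache line as the tail of rx_buf, while with the buffers at the end the
compiler pads the struct out to a whole cache line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define CACHELINE 64
#define BUF_SIZE  100

/* Userspace stand-in for the kernel's ____cacheline_aligned. */
#define cacheline_aligned __attribute__((aligned(CACHELINE)))

/* Layout as in the patch: aligned buffers followed by other fields. */
struct phy_mid {
	void *spi_device;
	uint8_t tx_buf[BUF_SIZE] cacheline_aligned;
	uint8_t rx_buf[BUF_SIZE] cacheline_aligned;
	int io_lpcpd;
	int latency;
};

/* Suggested layout: aligned buffers moved to the end of the struct. */
struct phy_end {
	void *spi_device;
	int io_lpcpd;
	int latency;
	uint8_t tx_buf[BUF_SIZE] cacheline_aligned;
	uint8_t rx_buf[BUF_SIZE] cacheline_aligned;
};

/* Both layouts start each buffer on a cache line boundary. */
static_assert(offsetof(struct phy_mid, rx_buf) % CACHELINE == 0, "rx start");
static_assert(offsetof(struct phy_end, rx_buf) % CACHELINE == 0, "rx start");

/* Index of the cache line that holds byte offset off. */
static size_t line_of(size_t off)
{
	return off / CACHELINE;
}

/* Returns 1 if io_lpcpd shares a cache line with the last byte of rx_buf. */
static int mid_shares_line(void)
{
	size_t rx_last = offsetof(struct phy_mid, rx_buf) + BUF_SIZE - 1;

	return line_of(offsetof(struct phy_mid, io_lpcpd)) == line_of(rx_last);
}

/* Returns 1 if the struct size is a whole number of cache lines, i.e.
 * the tail of rx_buf is followed only by padding inside the struct. */
static int end_tail_is_padded(void)
{
	return sizeof(struct phy_end) % CACHELINE == 0;
}
```

With these assumed sizes mid_shares_line() returns 1 and
end_tail_is_padded() returns 1. A separate kmalloc per buffer avoids
the question of struct layout altogether, at the cost of the extra
allocations discussed above.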

> Thanks,
> Dmitry

/Jarkko