Re: [tpmdd-devel] [PATCH] tpm: fix cacheline alignment for DMA-able buffers
From: Jarkko Sakkinen
Date: Wed Aug 10 2016 - 16:38:34 EST
On Tue, Aug 09, 2016 at 08:18:00AM -0700, Dmitry Torokhov wrote:
> On Tue, Aug 9, 2016 at 8:01 AM, Jarkko Sakkinen
> <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, Aug 09, 2016 at 12:46:10PM +0300, Jarkko Sakkinen wrote:
> > On Fri, Jul 29, 2016 at 10:30:22AM -0700, Dmitry Torokhov wrote:
> > >  On Fri, Jul 29, 2016 at 10:27 AM, Jason Gunthorpe
> > >  <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > >   On Thu, Jul 28, 2016 at 07:59:13PM -0700, Andrey Pronin wrote:
> > >   > Annotate buffers used in spi transactions as ____cacheline_aligned
> > >   > to use in DMA transfers.
> > >   >
> > >   > Signed-off-by: Andrey Pronin <apronin@xxxxxxxxxxxx>
> > >   >  drivers/char/tpm/st33zp24/spi.c | 4 ++--
> > >   >  drivers/char/tpm/tpm_tis_spi.c  | 4 ++--
> > >   >  2 files changed, 4 insertions(+), 4 deletions(-)
> > >   >
> > >   > diff --git a/drivers/char/tpm/st33zp24/spi.c b/drivers/char/tpm/st33zp24/spi.c
> > >   > index 9f5a011..0e9aad9 100644
> > >   > +++ b/drivers/char/tpm/st33zp24/spi.c
> > >   > @@ -70,8 +70,8 @@
> > >   >  struct st33zp24_spi_phy {
> > >   >      struct spi_device *spi_device;
> > >   >
> > >   > -    u8 tx_buf[ST33ZP24_SPI_BUFFER_SIZE];
> > >   > -    u8 rx_buf[ST33ZP24_SPI_BUFFER_SIZE];
> > >   > +    u8 tx_buf[ST33ZP24_SPI_BUFFER_SIZE] ____cacheline_aligned;
> > >   > +    u8 rx_buf[ST33ZP24_SPI_BUFFER_SIZE] ____cacheline_aligned;
> > >   >
> > >   >      int io_lpcpd;
> > >   >      int latency;
> > >
> > >   Hurm, this still looks wrong to me. Aligning the start of buffers is
> > >   not enough, the DMA'able space must also end on a cache line as well.
> > >
> > >   So, the buffers must also always be placed at the end of the struct.
> > >
> > >   IMHO It would be cleaner and safer to always kmalloc the DMA buffer
> > >   alone than to try and optimize like this.
> > >
> > >  In this case moving them to the end of the structure and commenting why
> > >  they have to be at the end might be less invasive change. More
> > >  performance-efficient and resilient in low memory situations too.
> >
> > kmallocs would be done in the driver initialization:
> >
> > * you rarely are in low memory situation
> > * performance gain/loss is insignificant
> >
> > I really don't see your point.
>
> I'm fine having them at the end of the structure mainly for simplicity
> reasons but those arguments just didn't hold at all.
>
> Well, the main reason was simplicity and invasiveness of the change.
> But I still maintain that doing 3 memory allocations instead of 1 is less
> performant and puts more pressure on the kernel. Yes, it is at bind time,
> but you do not have to do 3 times work when one allocation will suffice.
> Also, driver binding does not necessarily happen at boot time. I can
> always unbind and rebind the driver or reload the module.
I'm fine with either approach.
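
For reference, a minimal user-space sketch of the "buffers at the end of
the struct" layout discussed above. It is only an illustration, not the
driver code: the struct name, CACHELINE_BYTES (64) and SPI_BUFFER_SIZE
(100) are assumed values, and C11 alignas stands in for the kernel's
____cacheline_aligned annotation.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint8_t */

/* Assumed values for illustration only; a real driver would rely on the
 * kernel's cache line size and its own buffer size constant. */
#define CACHELINE_BYTES 64
#define SPI_BUFFER_SIZE 100

/* Hypothetical stand-in for the driver's phy struct. */
struct spi_phy_sketch {
	void *spi_device;
	int io_lpcpd;
	int latency;

	/*
	 * DMA buffers go last: each one starts on a cache line boundary,
	 * and only padding (no other member) follows the final buffer.
	 */
	alignas(CACHELINE_BYTES) uint8_t tx_buf[SPI_BUFFER_SIZE];
	alignas(CACHELINE_BYTES) uint8_t rx_buf[SPI_BUFFER_SIZE];
};

/* Round a byte count up to the next multiple of the cache line size. */
static inline size_t cacheline_round_up(size_t n)
{
	return (n + CACHELINE_BYTES - 1) & ~(size_t)(CACHELINE_BYTES - 1);
}

/* Both buffers start on a cache line boundary. */
static_assert(offsetof(struct spi_phy_sketch, tx_buf) % CACHELINE_BYTES == 0,
	      "tx_buf must start on a cache line");
static_assert(offsetof(struct spi_phy_sketch, rx_buf) % CACHELINE_BYTES == 0,
	      "rx_buf must start on a cache line");

/* rx_buf begins only after the cache lines that tx_buf occupies. */
static_assert(offsetof(struct spi_phy_sketch, rx_buf) >=
	      offsetof(struct spi_phy_sketch, tx_buf) + SPI_BUFFER_SIZE,
	      "rx_buf must not overlap tx_buf");

/* The struct size is a whole number of cache lines, so the tail of
 * rx_buf is followed by padding rather than by unrelated data. */
static_assert(sizeof(struct spi_phy_sketch) % CACHELINE_BYTES == 0,
	      "struct must end on a cache line");
```

The checks only cover offsets inside the struct. The object itself still
has to be allocated on a cache line boundary for the guarantee to hold,
which is one reason a separate allocation per DMA buffer is the simpler
option to reason about.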
> Thanks,
> Dmitry
/Jarkko