Re: [PATCH v2] iio: adc: ad7944: Add support for "3-wire mode"
From: Jonathan Cameron
Date: Mon Mar 18 2024 - 10:29:37 EST
On Mon, 18 Mar 2024 15:09:32 +0200
Andy Shevchenko <andy.shevchenko@xxxxxxxxx> wrote:
> On Mon, Mar 18, 2024 at 2:41 PM Jonathan Cameron
> <Jonathan.Cameron@xxxxxxxxxx> wrote:
> > > > struct ad7944_adc {
> > > > struct spi_device *spi;
> > > > + enum ad7944_spi_mode spi_mode;
> > > > /* Chip-specific timing specifications. */
> > > > const struct ad7944_timing_spec *timing_spec;
> > > > /* GPIO connected to CNV pin. */
> > > > @@ -58,6 +75,9 @@ struct ad7944_adc {
> > > > } sample __aligned(IIO_DMA_MINALIGN);
> > > > };
> > >
> > > Have you run `pahole` to see if there is a better place for a new member?
> >
> > I know this matters for structures where we see lots of them, but do we actually
> > care for one offs? Whilst it doesn't matter here I'd focus much more
> > on readability and like parameter grouping for cases like this than wasting
> > a few bytes.
>
> This is _also_ true, but think more about cache line contamination.
> Even not-so-important bytes may decrease the performance. In some
> cases it's tolerable, in some it is not (high-speed ADC). In general I
> assume that the developer has to understand many aspects of the
> software and cache line contamination may be last but definitely not
> least.
>
Not totally sure what you are covering with contamination as many aspects
around caches and that's not really a standard term for any of them (as
far as I know).
It's part of a multi cacheline allocation anyway (because it's tacked on the
end of the iio device struct, so fairly unlikely to share with other allocations
and definitely not on ARM because of the trailing __aligned(IIO_DMA_MINALIGN)
elements.
If it matters more locally, then pahole is more likely to push you to pack
things together in a fashion that makes false sharing and similar perf issues
more likely if you are grouping things for packing purposes rather than
logical groups.
If you just mean cache pressure then fair enough if we squeeze everything into
one cacheline and that doesn't cause false sharing.
'Maybe' this will fit on x86. On Arm64 it's not going to
make any difference, just moving the padding around a bit within the line.
So I'd argue premature optimization for a small, one off, structure.
Jonathan