Re: [PATCH] spi: Add FSI-attached SPI controller driver

From: Eddie James
Date: Mon Feb 10 2020 - 15:05:30 EST



On 2/7/20 4:04 PM, Andy Shevchenko wrote:
On Fri, Feb 7, 2020 at 11:04 PM Eddie James <eajames@xxxxxxxxxxxxxxxxxx> wrote:
On 2/7/20 2:34 PM, Andy Shevchenko wrote:
On Fri, Feb 7, 2020 at 10:04 PM Eddie James <eajames@xxxxxxxxxxxxxxxxxx> wrote:
On 2/7/20 1:39 PM, Andy Shevchenko wrote:
On Fri, Feb 7, 2020 at 9:28 PM Eddie James <eajames@xxxxxxxxxxxxxxxxxx> wrote:
On 2/5/20 9:51 AM, Andy Shevchenko wrote:
On Tue, Feb 4, 2020 at 6:06 PM Eddie James <eajames@xxxxxxxxxxxxx> wrote:
On 2/4/20 5:02 AM, Andy Shevchenko wrote:
On Mon, Feb 3, 2020 at 10:33 PM Eddie James <eajames@xxxxxxxxxxxxxxxxxx> wrote:
On 1/30/20 10:37 AM, Andy Shevchenko wrote:
+ for (i = 0; i < num_bytes; ++i)
+ rx[i] = (u8)((in >> (8 * ((num_bytes - 1) - i))) & 0xffULL);
Redundant & 0xffULL part.
For me it looks like

u8 tmp[8];

put_unaligned_be64(in, tmp);
memcpy(rx, tmp, num_bytes);

put_unaligned*() is just a method to unroll the value to the u8 buffer.
See, for example, linux/unaligned/be_byteshift.h implementation.
Unfortunately it is not the same. put_unaligned_be64 will take the
highest 8 bits (0xff00000000000000) and move it into tmp[0]. Then
0x00ff000000000000 into tmp[1], etc. This is only correct for this
driver IF my transfer is 8 bytes. If, for example, I transfer 5 bytes,
then I need 0x000000ff00000000 into tmp[0], 0x00000000ff000000 into
tmp[1], etc. So I think my current implementation is correct.
Yes, I missed the correction of the start address in memcpy(). Otherwise
it's still the same as what I was talking about.
I see now, yes, thanks.
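
For reference, a minimal userspace sketch of the two forms being compared, with the start address of the memcpy() corrected. store_be64() is a hypothetical stand-in for the kernel's put_unaligned_be64(), and the function names are illustrative only, not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for put_unaligned_be64(): most significant byte first. */
static void store_be64(uint64_t val, uint8_t *p)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(val >> (8 * (7 - i)));
}

/* Loop form, as in the patch: emit the low num_bytes bytes, MSB first. */
static void rx_copy_loop(uint64_t in, uint8_t *rx, size_t num_bytes)
{
	assert(num_bytes <= 8);
	for (size_t i = 0; i < num_bytes; i++)
		rx[i] = (uint8_t)(in >> (8 * ((num_bytes - 1) - i)));
}

/* memcpy form: unroll all 8 bytes, then skip the unused leading ones. */
static void rx_copy_memcpy(uint64_t in, uint8_t *rx, size_t num_bytes)
{
	uint8_t tmp[8];

	assert(num_bytes <= 8);
	store_be64(in, tmp);
	memcpy(rx, tmp + (8 - num_bytes), num_bytes);
}
```

For a 5-byte transfer, both forms copy the bytes selected by 0x000000ff00000000 through 0x00000000000000ff, in that order.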

Do you think this is worth a v3? Perhaps put_unaligned is slightly more
optimized than the loop, but it also involves an extra memory copy.
I have already forgotten the context in which this is called. Can you
summarize what sequence(s) of num_bytes are usually expected?

IIUC, if packets are small (less than 8 bytes), then num_bytes will be
that value. Otherwise it will be something like 8 + 8 + 8 ... + tail.
Is that a correct assumption?

Yes, it will typically be 8 + 8 + ... + remainder. Basically, on any RX,
the driver polls until the RX register is full. Once it is full, the
driver reads however much data is left to be transferred. Since we use
min(len, 8), we usually read 8 bytes at a time until we get to the end.
I asked that because we might have a better optimization, i.e., call
put_unaligned_be64() directly when we know that the length is 8 bytes.
For the rest, your approach might be simpler. Similarly for the TX case.
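
A rough sketch of that split, under the same assumptions as the discussion above: a full 8-byte read takes the direct path and a shorter tail falls back to the loop. store_be64() again stands in for put_unaligned_be64(), and rx_copy_fast() is a hypothetical name, not the driver's actual function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for put_unaligned_be64(): most significant byte first. */
static void store_be64(uint64_t val, uint8_t *p)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(val >> (8 * (7 - i)));
}

/* Copy num_bytes (at most 8) from the register value into rx. */
static size_t rx_copy_fast(uint64_t in, uint8_t *rx, size_t num_bytes)
{
	assert(num_bytes <= 8);

	/* Common case: a full register, written straight into rx. */
	if (num_bytes == 8) {
		store_be64(in, rx);
		return num_bytes;
	}

	/* Tail case: only the low num_bytes bytes hold data. */
	for (size_t i = 0; i < num_bytes; i++)
		rx[i] = (uint8_t)(in >> (8 * ((num_bytes - 1) - i)));

	return num_bytes;
}
```

With the 8 + 8 + ... + remainder pattern described above, only the final call of a transfer would take the loop path.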


I just tried to implement this as you suggested, but I realized
something: the value is already swapped from BE to CPU order when the
register is read in fsi_spi_read_reg. It happens to work out correctly
to use put_unaligned_be64 on a LE CPU to flip the bytes here. But on a
BE CPU, I think this wouldn't be correct. I don't anticipate this
driver running on a BE CPU, but I think it is weird to flip it twice,
and better to do it manually here.
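
To illustrate the manual approach, here is a small userspace sketch. It assumes that the register read yields a CPU-order value with the data right-aligned, as the loop in the patch implies. load_be64() is a hypothetical stand-in for that read, not the real fsi_spi_read_reg. Because the extraction uses shifts on the value rather than looking at its memory layout, the result does not depend on host endianness:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register read: build a CPU-order value from 8 big-endian bytes. */
static uint64_t load_be64(const uint8_t *p)
{
	uint64_t val = 0;

	for (int i = 0; i < 8; i++)
		val = (val << 8) | p[i];

	return val;
}

/* Manual extraction: shifts act on the value, not on its bytes in memory. */
static void rx_extract(uint64_t in, uint8_t *rx, size_t num_bytes)
{
	assert(num_bytes <= 8);
	for (size_t i = 0; i < num_bytes; i++)
		rx[i] = (uint8_t)(in >> (8 * ((num_bytes - 1) - i)));
}
```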

What do you think Andy?

Thanks,

Eddie



+ return num_bytes;
+}