Re: [PATCH v9 6/9] i3c: master: Add driver for Cadence IP

From: Boris Brezillon
Date: Fri Oct 26 2018 - 08:47:03 EST


On Fri, 26 Oct 2018 12:01:52 +0200
Arnd Bergmann <arnd@xxxxxxxx> wrote:

> On Fri, Oct 26, 2018 at 9:57 AM Boris Brezillon
> <boris.brezillon@xxxxxxxxxxx> wrote:
> > On Fri, 26 Oct 2018 09:43:25 +0200
> > Arnd Bergmann <arnd@xxxxxxxx> wrote:
> >
> > > On Thu, Oct 25, 2018 at 6:30 PM Boris Brezillon
> > > <boris.brezillon@xxxxxxxxxxx> wrote:
> > > > On Thu, 25 Oct 2018 18:13:51 +0200 Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > > On Thu, Oct 25, 2018 at 6:07 PM Boris Brezillon <boris.brezillon@xxxxxxxxxxx> wrote:
> > > > > > On Thu, 25 Oct 2018 17:30:26 +0200
> > > Ok. Is i3c_master_send_ccc_cmd_locked() what implements the public
> > > interfaces then, or is this something else?
> >
> > i3c_master_send_ccc_cmd_locked() calls master->ops->send_ccc_cmd(), so
> > it's part of the master controller interface.
> >
> > >
> > > If you place a buffer on the stack, it is not DMA capable, but
> > > it is guaranteed to be at least 32-bit word aligned, and should
> > > not cause an exception in readsl(), unless it starts with a couple of
> > > (not multiple of four) extra bytes that are not sent to the devices.
> > > Is that what happens here?
> >
> > Here is the report I received from Vitor:
> >
> > "
> > Hi Boris,
> >
> >
> > I'm trying this new patch-set version but I get some issues when using
> > the readsl() function.
> >
> > Basically the system complains about memory alignment.
> >
>
> > > +static int i3c_master_getpid_locked(struct i3c_master_controller *master,
> > > + struct i3c_device_info *info)
> > > +{
> > > + struct i3c_ccc_getpid getpid;
> >
> > at this point the getpid struct is already unaligned, with
> >
> > i3c_master_getpid_locked:1129 getpid_add=0x9a249c7a
> >
> > > + struct i3c_ccc_cmd_dest dest = {
> > > + .addr = info->dyn_addr,
> > > + .payload.len = sizeof(struct i3c_ccc_getpid),
> > > + .payload.data = &getpid,
> > > + };
>
> > > +}
> > > +
> >
> > and then when
> >
> > static void dw_i3c_master_read_rx_fifo(struct dw_i3c_master *master,
> > 				       u8 *bytes, int nbytes)
> > {
> > 	readsl(master->regs + RX_TX_DATA_PORT, bytes, nbytes / 4);
> > 	...
> > }
>
> Ok, I spent an hour chasing the ARM implementation and finding
> no way this could go wrong here. I see that 'struct i3c_ccc_getpid'
> may be misaligned on the stack (it normally won't be), and that
> the ARM readsl() has a lot of extra code to handle unaligned
> output.

I didn't have this problem on xtensa either.

> However, the dump that Vitor reports
>
> > [ECR ]: 0x00230400 => Misaligned r/w from 0x9a249c7a
> > [EFA ]: 0x9a249c7a
> > [BLINK ]: dw_i3c_master_irq_handler+0x200/0x2fc [dw_i3c_master]
>
> Is from an arch/arc kernel that uses asm-generic/io.h, and
> that stores the output using a u32 pointer:
>
> static inline void readsl(const volatile void __iomem *addr, void *buffer,
> 			  unsigned int count)
> {
> 	if (count) {
> 		u32 *buf = buffer;
>
> 		do {
> 			u32 x = __raw_readl(addr);
> 			*buf++ = x;
> 		} while (--count);
> 	}
> }
>
> This is apparently not allowed on ARC when 'buffer' is
> unaligned. I think what we need here is to use
> put_unaligned() instead of the pointer dereference.
> For architectures that can do unaligned accesses,
> the result is the same, but for ARC it will fix the problem.

Okay, so writesl()/readsl() should deal with unaligned pointers, and
default implementations should be fixed. I guess you'll send a patch to
use put/get_unaligned().
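
To make the idea concrete, here is a minimal userspace sketch of a
readsl()-style loop that stores through an unaligned-safe helper instead
of dereferencing a u32 pointer. It is not the actual asm-generic/io.h
fix: put_unaligned_u32(), struct fake_fifo, fake_raw_readl() and
readsl_unaligned() are hypothetical names, and memcpy() stands in for the
kernel's put_unaligned().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace stand-in for put_unaligned(): memcpy() makes no alignment
 * assumption about dst, so this is safe for any destination address.
 */
static inline void put_unaligned_u32(uint32_t val, void *dst)
{
	memcpy(dst, &val, sizeof(val));
}

/* Hypothetical model of a FIFO data port: each read pops the next word. */
struct fake_fifo {
	const uint32_t *words;
	size_t pos;
};

static inline uint32_t fake_raw_readl(struct fake_fifo *fifo)
{
	return fifo->words[fifo->pos++];
}

/* Same shape as the generic readsl(), minus the u32 pointer dereference. */
static void readsl_unaligned(struct fake_fifo *fifo, void *buffer,
			     unsigned int count)
{
	uint8_t *buf = buffer;

	while (count--) {
		put_unaligned_u32(fake_raw_readl(fifo), buf);
		buf += sizeof(uint32_t);
	}
}
```

On architectures that handle unaligned stores natively, compilers
typically turn the memcpy() into a plain store, so the helper should cost
nothing there.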

>
> > > One way to address this might be to always bounce any
> > > messages that are less than a cache line through a
> > > (pre-)kmallocated buffer, and require any longer messages
> > > to be cache capable. This could also solve the issue with
> > > readsl(), but it would be a rather confusing user interface.
> > >
> > > Another option might be to have separate interfaces for
> > > "short" and "long" messages at the API level and have
> > > distinct rules for those: short would always be bounced
> > > by the i3c code, and long puts restrictions on the buffer
> > > location.
> >
> > Hm, let's keep the API simple. I'll just mandate that all payload bufs
> > passed to i3c_master_send_ccc_cmd_locked() be dynamically allocated.
>
> Ok. What about i2c commands sent to the same i3c controller
> then?

Still not taken care of.

> Do we need to copy those to satisfy the requirements
> of the i3c layer?

I guess we should. The question is, should we do that unconditionally
or should we try to optimize things with something like:

	if (!virt_addr_valid(xfer->buf) ||
	    object_is_on_stack(xfer->buf))
		/* Alloc bounce buf. */
	else
		/* Use provided buf. */
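
A rough userspace model of that conditional bounce, just to show the
get/put pairing it implies. All names here (struct sketch_xfer,
sketch_get_dma_buf(), sketch_put_dma_buf()) are hypothetical, malloc()
stands in for kmalloc(), and the needs_bounce flag is assumed to be
computed by the caller from the virt_addr_valid()/object_is_on_stack()
test above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transfer descriptor, loosely modelled on an i2c message. */
struct sketch_xfer {
	void *buf;	/* buffer provided by the caller */
	size_t len;
	void *bounce;	/* non-NULL only while a bounce buffer is in use */
};

/*
 * Return a buffer the controller may use. Allocates a bounce buffer and
 * copies the payload into it when the caller says the original buffer is
 * not usable; otherwise hands back the original. Returns NULL if the
 * allocation fails.
 */
static void *sketch_get_dma_buf(struct sketch_xfer *x, bool needs_bounce)
{
	x->bounce = NULL;
	if (!needs_bounce)
		return x->buf;

	x->bounce = malloc(x->len);
	if (!x->bounce)
		return NULL;

	memcpy(x->bounce, x->buf, x->len);
	return x->bounce;
}

/* Release the bounce buffer, copying data back first for read transfers. */
static void sketch_put_dma_buf(struct sketch_xfer *x, bool is_read)
{
	if (!x->bounce)
		return;

	if (is_read)
		memcpy(x->buf, x->bounce, x->len);

	free(x->bounce);
	x->bounce = NULL;
}
```

The unconditional variant would simply pass needs_bounce = true every
time, trading an extra copy for simpler rules.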