Re: [PATCH v2] i2c: designware: Fix corrupted memory seen in the ISR

From: Serge Semin
Date: Mon Sep 25 2023 - 08:54:51 EST


On Wed, Sep 20, 2023 at 12:14:17PM -0700, Jan Bottorff wrote:
> On 9/20/2023 6:27 AM, Yann Sionneau wrote:
> > Hi,
> >
> > On 20/09/2023 11:08, Wolfram Sang wrote:
> > > > same thread." [1] Thus I'd suggest the next fix for the problem:
> > > >
> > > > --- a/drivers/i2c/busses/i2c-designware-common.c
> > > > +++ b/drivers/i2c/busses/i2c-designware-common.c
> > > > @@ -72,7 +72,10 @@ static int dw_reg_write(void *context, unsigned int reg, unsigned int val)
> > > >   {
> > > >       struct dw_i2c_dev *dev = context;
> > > > -    writel_relaxed(val, dev->base + reg);
> > > > +    if (reg == DW_IC_INTR_MASK)
> > > > +        writel(val, dev->base + reg);
> > > > +    else
> > > > +        writel_relaxed(val, dev->base + reg);
> > > >       return 0;
> > > >   }
> > > >
> > > > (and similar changes for dw_reg_write_swab() and dw_reg_write_word().)
> > > >
> > > > What do you think?
> > > To me, this looks reasonable and much more what I would have expected as
> > > a result (from a high level point of view). Let's hope it works. I am
> > > optimistic, though...
> > >
> > It works if we make sure all the other register accesses to the
> > designware i2c IP can't generate IRQ.
> >
> > Meaning that all register accesses that can trigger an IRQ are enclosed
> > in between a call to i2c_dw_disable_int() and a call to
> > regmap_write(dev->map, DW_IC_INTR_MASK, DW_IC_INTR_MASTER_MASK); or
> > equivalent.
> >
> > It seems to be the case, I'm not sure what's the best way to make sure
> > it will stay that way.
> >
> > Moreover, maybe writes to IC_ENABLE register should also use the
> > non-relaxed writel() version?
> >
> > Since one could do something like:
> >
> > [ IP is currently disabled ]
> >
> > 1/ enable interrupts in DW_IC_INTR_MASK
> >
> > 2/ update some variable in dev-> structure in DDR
> >
> > 3/ enable the device by writing to IC_ENABLE, thus triggering for
> > instance the TX_FIFO_EMPTY irq.
> >
>
> It does seem like there are a variety of register write combinations that
> could immediately cause an interrupt, so would need a barrier.

My suggestion was based on your fix. If it doesn't work, or doesn't
completely solve the problem, then perhaps one of the following
options will do:
1. Add the non-relaxed IO call for the IC_ENABLE CSR too.
2. Convert all the IO accessors to the non-relaxed methods,
especially since Wolfram already noted: "Again, I am all with
Catalin here. Safety first, optimizations a la *_relaxed should be
opt-in."
https://lore.kernel.org/linux-i2c/ZQm2Ydt%2F0jRW4crK@shikoro/
3. Find all the places where the memory writes need to be fully
visible by the time a subsequent IO write raises an IRQ, and just
place dma_wmb() there (though plain wmb() would look a bit more
relevant).
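
For illustration only, option 3 can be sketched as a small userspace
model. The kernel's wmb() and writel_relaxed() cannot run outside the
kernel, so model_wmb(), model_writel_relaxed(), struct model_dev and
MODEL_IRQ_ENABLE below are hypothetical stand-ins, not driver code.
The point is only the order of the three steps: update the memory the
ISR reads, issue the barrier, then do the IO write that may raise the
IRQ.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Userspace stand-ins (assumptions, NOT the kernel implementations):
 * model_wmb() approximates wmb() with a full fence, and
 * model_writel_relaxed() approximates writel_relaxed() with a plain
 * volatile store that has no ordering against normal memory writes.
 */
static inline void model_wmb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void model_writel_relaxed(uint32_t val, volatile uint32_t *reg)
{
	*reg = val;
}

/* Arbitrary placeholder value, not the real DW_IC_INTR_MASTER_MASK. */
#define MODEL_IRQ_ENABLE 0x1u

/* Hypothetical device state; the field names are illustrative only. */
struct model_dev {
	volatile uint32_t intr_mask;	/* stands in for DW_IC_INTR_MASK */
	int msg_write_idx;		/* driver memory later read by the ISR */
};

static void model_xfer_start(struct model_dev *dev)
{
	/* 1. Update the state the ISR will look at (normal memory). */
	dev->msg_write_idx = 0;

	/* 2. Order that update before the IO write below. */
	model_wmb();

	/* 3. Unmask interrupts; an IRQ may be raised right after this. */
	model_writel_relaxed(MODEL_IRQ_ENABLE, &dev->intr_mask);
}
```

Option 2 amounts to folding step 2 into every accessor, so that no
call site has to remember the barrier.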

IMO, in the worst case, solution 2 should be enough, at least in
master mode: the ISR uses the completion variable to signal that
the cmd has finished, which also implies a complete memory
barrier. Moreover, the i2c bus isn't fast enough for us to be much
concerned about optimizations like avoiding pipeline stalls between
the MMIO accesses.
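
The completion argument can be modelled in userspace as well. This is
a sketch under assumptions: struct model_xfer, model_isr() and
model_try_wait() are hypothetical names, and the full fences merely
stand in for the barrier that the completion is said to imply above.
It is not how complete()/wait_for_completion() are implemented.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical transfer state shared between the ISR and the waiter. */
struct model_xfer {
	int cmd_err;		/* plain memory written by the ISR */
	atomic_bool done;	/* stands in for the completion variable */
};

/* ISR side: publish the result, then signal completion. */
static void model_isr(struct model_xfer *x, int err)
{
	x->cmd_err = err;
	/* Full fence: the result is ordered before the done flag. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&x->done, true, memory_order_relaxed);
}

/* Waiter side: read the result only after seeing the done flag. */
static bool model_try_wait(struct model_xfer *x, int *err)
{
	if (!atomic_load_explicit(&x->done, memory_order_relaxed))
		return false;
	/* Full fence: the flag read is ordered before the result read. */
	atomic_thread_fence(memory_order_seq_cst);
	*err = x->cmd_err;
	return true;
}
```

Note that this only covers the ISR-to-waiter direction. The writes
made before the IRQ is raised still rely on the ordering of the IO
write itself.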

-Serge(y)

>
> I understand barriers a bit better now, although still wonder about some
> cases. Like say you write to some driver memory and then write the DW
> command fifo register, and it does not immediately cause an interrupt, but
> will in the future. Even without the barrier the memory write would
> typically become visible to other cores after some small amount of time, but
> don't see that it's architecturally guaranteed to be visible. This implies
> the barrier is perhaps needed on many/all of the registers.
>
> Jan
>
>