Re: [PATCH] i2c: i2c-qcom-geni: Fix DMA transfer race

From: Stephen Boyd
Date: Tue Jul 21 2020 - 14:56:12 EST


Quoting Doug Anderson (2020-07-21 09:18:35)
> On Tue, Jul 21, 2020 at 12:08 AM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
> >
> > Quoting Stephen Boyd (2020-07-20 22:59:14)
> > >
> > > I worry that we also need a dmb() here to make sure the dma buffer is
> > > properly mapped before this write to the device is attempted. But it may
> > > only matter to be before the I2C_READ.
> > >
> >
> > I'm suggesting this patch instead where we make geni_se_setup_m_cmd()
> > use a writel() so that it has the proper barrier semantics to wait for
> > the other memory writes that happened in program order before this point
> > to complete before the device is kicked to do a read or a write.
>
> Are you saying that dma_map_single() isn't guaranteed to have a
> barrier or something? I tried to do some searching and found a thread
> [1] where someone tried to add a barrierless variant of them. To me
> that means that the current APIs have barriers.
>
> ...or is there something else you're worried about?

I'm not really thinking about dma_map_single() having a barrier or not.
The patch you mention is from 2010. Many things have changed in the last
decade. Does it have barrier semantics? The presence of a patch on the
mailing list doesn't mean much.

Specifically, I'm looking at the "KERNEL I/O BARRIER EFFECTS" section of
Documentation/memory-barriers.txt and noticing that this driver uses
relaxed IO accessors, meaning that the reads and writes aren't ordered
with respect to other memory accesses. They're only ordered with respect
to each other within the same device. I'm concerned that, due to
out-of-order execution, the CPU will issue the IO access that starts the
write DMA operation before the buffer has been copied over.

I'm not an expert in this area, but this is why we ask driver authors to
use the non-relaxed accessors: they have the appropriate semantics built
in, which makes them easy to reason about. They do what they say when
they say to do it.
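
As a rough illustration of the difference, here is a minimal user-space
sketch that models the two accessor flavors with C11 atomics. It is an
analogy only, not kernel code: fake_writel(), fake_writel_relaxed(),
fake_cmd_reg, fake_dma_buf and fake_start_tx() are hypothetical names, and
the release fence merely stands in for whatever barrier a given
architecture places in front of the store in writel().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; not the real GENI register layout or DMA buffer. */
static _Atomic uint32_t fake_cmd_reg;
static uint8_t fake_dma_buf[16];

/* Models writel_relaxed(): the store has no ordering against other memory. */
static void fake_writel_relaxed(uint32_t val, _Atomic uint32_t *reg)
{
	atomic_store_explicit(reg, val, memory_order_relaxed);
}

/* Models writel(): earlier memory writes are ordered before the store. */
static void fake_writel(uint32_t val, _Atomic uint32_t *reg)
{
	atomic_thread_fence(memory_order_release);
	fake_writel_relaxed(val, reg);
}

/* Fill the buffer first, then kick the "device" with the ordered write. */
static uint32_t fake_start_tx(const void *data, size_t len)
{
	assert(len <= sizeof(fake_dma_buf));
	memcpy(fake_dma_buf, data, len);
	fake_writel(1, &fake_cmd_reg);
	return atomic_load_explicit(&fake_cmd_reg, memory_order_relaxed);
}
```

With fake_writel_relaxed() in fake_start_tx() instead, nothing in the
model would order the memcpy() before the register store, which is the
scenario described above.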

>
>
> > ----8<----
> > diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> > index 18d1e4fd4cf3..7f130829bf01 100644
> > --- a/drivers/i2c/busses/i2c-qcom-geni.c
> > +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> > @@ -367,7 +367,6 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> > geni_se_select_mode(se, GENI_SE_FIFO);
> >
> > writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> > - geni_se_setup_m_cmd(se, I2C_READ, m_param);
> >
> > if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
> > geni_se_select_mode(se, GENI_SE_FIFO);
> > @@ -375,6 +374,8 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> > dma_buf = NULL;
> > }
> >
> > + geni_se_setup_m_cmd(se, I2C_READ, m_param);
>
> I guess it's true that we only need the setup_m_cmd moved.

Alright cool. That makes more sense.

>
>
> > +
> > time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
> > if (!time_left)
> > geni_i2c_abort_xfer(gi2c);
> > @@ -408,7 +409,6 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> > geni_se_select_mode(se, GENI_SE_FIFO);
> >
> > writel_relaxed(len, se->base + SE_I2C_TX_TRANS_LEN);
> > - geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
> >
> > if (dma_buf && geni_se_tx_dma_prep(se, dma_buf, len, &tx_dma)) {
> > geni_se_select_mode(se, GENI_SE_FIFO);
> > @@ -416,6 +416,8 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> > dma_buf = NULL;
> > }
> >
> > + geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
> > +
>
> True, it's probably safer to do the TX too even if I'm not seeing
> problems there. Of course, I don't think I'm doing any large writes
> so probably never triggering this path anyway.

Right, this is just by inspection of the code to see that it's the same
scenario, kicking off the DMA operation at the device before mapping the
buffer.
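
The ordering problem itself can be shown with a toy model. This is a
sketch under stated assumptions, not the driver: struct fake_se and the
fake_*() helpers are hypothetical mocks that only record whether the
buffer was already mapped at the moment the command was issued.

```c
#include <stdbool.h>

/* Hypothetical mock of the serial engine state; not the real struct geni_se. */
struct fake_se {
	bool dma_mapped;      /* set once the buffer has been "mapped" */
	bool cmd_saw_mapping; /* was the buffer mapped when the command started? */
};

/* Stands in for the DMA prep step. */
static void fake_dma_prep(struct fake_se *se)
{
	se->dma_mapped = true;
}

/* Stands in for kicking the command: records the mapping state at that time. */
static void fake_setup_m_cmd(struct fake_se *se)
{
	se->cmd_saw_mapping = se->dma_mapped;
}

/* Old order: the command is issued before the buffer is mapped. */
static bool fake_xfer_old_order(void)
{
	struct fake_se se = { false, false };

	fake_setup_m_cmd(&se);
	fake_dma_prep(&se);
	return se.cmd_saw_mapping;
}

/* New order: the buffer is mapped first, then the command is issued. */
static bool fake_xfer_new_order(void)
{
	struct fake_se se = { false, false };

	fake_dma_prep(&se);
	fake_setup_m_cmd(&se);
	return se.cmd_saw_mapping;
}
```

In this model fake_xfer_old_order() reports that the command started
without a mapped buffer, while fake_xfer_new_order() reports that the
mapping was in place, which mirrors moving geni_se_setup_m_cmd() below the
DMA prep call in both hunks.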

>
>
> > if (!dma_buf) /* Get FIFO IRQ */
> > writel_relaxed(1, se->base + SE_GENI_TX_WATERMARK_REG);
> >