Re: [syzbot] KASAN: out-of-bounds Read in i801_isr
From: Andy Shevchenko
Date: Mon May 24 2021 - 06:42:15 EST
On Mon, May 24, 2021 at 11:16:52AM +0200, Jean Delvare wrote:
> On Fri, 21 May 2021 15:48:20 +0200, Jean Delvare wrote:
> > Looking at the ICH9 datasheet, I see the following description for the
> > KILL bit (which is what we try to use to reset the SMBus controller):
> >
> > "Kills the current host transaction taking place, sets the FAILED
> > status bit, and asserts the interrupt (or SMI#)."
> >
> > At the time the recovery code was written, i2c-i801 was a polling-only
> > driver, interrupts were not supported, so asserting the interrupt had
> > no effect. Now that the driver does support interrupts, this would call
> > i801_isr(), right?
>
> I modified the i2c-i801 driver to simulate a timeout once in a while,
> and I can confirm that i801_isr() gets called when the KILL bit is set.
>
> > So my theory is that our attempt to kill a timed-out byte-by-byte block
> > transaction triggers an interrupt, which calls in i801_isr() with the
> > SMBHSTSTS_BYTE_DONE bit set. This in turn causes i801_isr_byte_done()
> > to be called while we are absolutely not ready nor even supposed to
> > process the next data byte.
>
> I can also report that I'm reproducing the bug reported by syzbot when
> this happens, by running decode-dimms with my modified i2c-i801 driver.
> I have 4 SPD EEPROMs instantiated for my DIMMs, so decode-dimms
> triggers a lot of calls to i2c_smbus_read_i2c_block_data().
>
> > I guess we should clear SMBHSTSTS_BYTE_DONE before issuing a
> > SMBHSTCNT_KILL. Alternatively we could add a check at the beginning of
> > i801_isr() to bail out immediately if SMBHSTCNT_KILL is set. While
> > possibly more robust, this approach has the drawback of increasing the
> > processing time of all interrupts, even standard/legitimate ones. So
> > maybe just clearing SMBHSTSTS_BYTE_DONE is more reasonable. Something
> > like:
> >
> > --- linux-5.11.orig/drivers/i2c/busses/i2c-i801.c
> > +++ linux-5.11/drivers/i2c/busses/i2c-i801.c
> > @@ -393,6 +393,8 @@ static int i801_check_post(struct i801_p
> > dev_err(&priv->pci_dev->dev, "Transaction timeout\n");
> > /* try to stop the current command */
> > dev_dbg(&priv->pci_dev->dev, "Terminating the current operation\n");
> > + /* Clear BYTE_DONE so as to not confuse i801_isr() */
> > + outb_p(SMBHSTSTS_BYTE_DONE, SMBHSTSTS(priv));
> > outb_p(inb_p(SMBHSTCNT(priv)) | SMBHSTCNT_KILL,
> > SMBHSTCNT(priv));
> > usleep_range(1000, 2000);
>
> I ended up with a different approach. Clearing BYTE_DONE would possibly
> avoid the problem immediately at hand, however the interrupt would
> still be triggered, even for other (not byte-by-byte) transactions,
> causing us to fiddle with the status register. I don't think it's
> needed, and it could have unintended side effects.
>
> So instead I tried the alternative approach of checking the KILL bit in
> i801_isr() and returning immediately if it is set. While it does work,
> I noticed that i801_isr() is in fact called in a loop for the whole
> duration of the usleep_range(1000, 2000) before we clear the KILL bit.
> While better than an out-of-bounds memory access, an interrupt flood is
> still not ideal.
>
> > I must say I wonder why SMBHSTCNT_KILL generates an interrupt in the
> > first place, I can't see who would need this.
>
> Which made me think... The interrupt is being generated because we
> *ask* for it. It doesn't have to be generated if we aren't interested
> in it. So the fix would be very simple: don't bother preserving the
> other bits in HST_CNT, they will be set properly again for the next
> transaction anyway, only set the KILL bit to 1. In particular don't
> preserve the INTREN bit, so no interrupt will be generated.
>
> @@ -395,11 +401,9 @@ static int i801_check_post(struct i801_p
> dev_err(&priv->pci_dev->dev, "Transaction timeout\n");
> /* try to stop the current command */
> dev_dbg(&priv->pci_dev->dev, "Terminating the current operation\n");
> - outb_p(inb_p(SMBHSTCNT(priv)) | SMBHSTCNT_KILL,
> - SMBHSTCNT(priv));
> + outb_p(SMBHSTCNT_KILL, SMBHSTCNT(priv));
> usleep_range(1000, 2000);
> - outb_p(inb_p(SMBHSTCNT(priv)) & (~SMBHSTCNT_KILL),
> - SMBHSTCNT(priv));
> + outb_p(0, SMBHSTCNT(priv));
>
> /* Check if it worked */
> status = inb_p(SMBHSTSTS(priv));
>
> This works for me. If there are no objections I'll post a proper patch.
No objection from me, thanks for conducting a research and developing the fix!
Feel free to add mine
Acked-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
to whatever solution you consider the best.
--
With Best Regards,
Andy Shevchenko