Re: ALSA: intel8x0: div by zero in snd_intel8x0_update()

From: Takashi Iwai
Date: Wed Jul 07 2021 - 03:02:27 EST


On Tue, 06 Jul 2021 19:50:08 +0200,
Max Filippov wrote:
>
> Hello,
>
> On Sun, May 16, 2021 at 2:50 AM Takashi Iwai <tiwai@xxxxxxx> wrote:
> >
> > On Sun, 16 May 2021 10:31:41 +0200,
> > Sergey Senozhatsky wrote:
> > >
> > > On (21/05/16 17:30), Sergey Senozhatsky wrote:
> > > > On (21/05/14 20:16), Sergey Senozhatsky wrote:
> > > > > > --- a/sound/pci/intel8x0.c
> > > > > > +++ b/sound/pci/intel8x0.c
> > > > > > @@ -691,6 +691,9 @@ static inline void snd_intel8x0_update(struct intel8x0 *chip, struct ichdev *ich
> > > > > > int status, civ, i, step;
> > > > > > int ack = 0;
> > > > > >
> > > > > > + if (!ichdev->substream || ichdev->suspended)
> > > > > > + return;
> > > > > > +
> > > > > > spin_lock_irqsave(&chip->reg_lock, flags);
> > > > > > status = igetbyte(chip, port + ichdev->roff_sr);
> > > > > > civ = igetbyte(chip, port + ICH_REG_OFF_CIV);
> > > >
> > > > This does the problem for me.
> > >
> > > ^^^ does fix
> >
> > OK, thanks for confirmation. So this looks like some spurious
> > interrupt with the unexpected hardware bits.
> >
> > However, the suggested check doesn't seem covering enough, and it
> > might still hit if the suspend/resume happens before the device is
> > opened but not set up (and such a spurious irq is triggered).
> >
> > Below is more comprehensive fix. Let me know if this works, too.
> >
> >
> > thanks,
> >
> > Takashi
> >
> > -- 8< --
> > Subject: [PATCH] ALSA: intel8x0: Don't update period unless prepared
> >
> > The interrupt handler of intel8x0 calls snd_intel8x0_update() whenever
> > the hardware sets the corresponding status bit for each stream. This
> > works fine for most cases as long as the hardware behaves properly.
> > But when the hardware gives a wrong bit set, this leads to a NULL
> > dereference Oops, and reportedly, this seems what happened on a VM.
> >
> > To fix the crash, this patch adds an internal flag indicating that
> > the stream is ready to be updated, and checks it (as well as the
> > suspended flag) to ignore such spurious updates.
> >
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Reported-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>
> > Signed-off-by: Takashi Iwai <tiwai@xxxxxxx>
> > ---
> > sound/pci/intel8x0.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
>
> linux v5.13 booting on qemu-system-xtensa virt board gets stuck inside
> snd_intel8x0_probe -> intel8x0_measure_ac97_clock with this patch.
> Prior to it it boots successfully for me.
> I'm curious if this issue has been reported yet.
>
> What I see is an IRQ flood, at some point snd_intel8x0_interrupt
> and timer ISR are called in loop and execution never returns to
> the interrupted function intel8x0_measure_ac97_clock.
>
> Any idea what it could be?

That looks like something odd with the VM. The chip itself has never
shown such a problem on real systems, so maybe the best action would be
to just skip the clock measurement on VMs. The measurement is
unreliable on a VM anyway, so skipping it makes more sense.

That said, would something like the patch below work?


thanks,

Takashi

---
diff --git a/sound/pci/intel8x0.c b/sound/pci/intel8x0.c
index 2d1bfbcba933..b75f832d7777 100644
--- a/sound/pci/intel8x0.c
+++ b/sound/pci/intel8x0.c
@@ -2199,6 +2199,9 @@ static int snd_intel8x0_mixer(struct intel8x0 *chip, int ac97_clock,
 	pbus->private_free = snd_intel8x0_mixer_free_ac97_bus;
 	if (ac97_clock >= 8000 && ac97_clock <= 48000)
 		pbus->clock = ac97_clock;
+	else if (chip->inside_vm)
+		pbus->clock = 48000;
+
 	/* FIXME: my test board doesn't work well with VRA... */
 	if (chip->device_type == DEVICE_ALI)
 		pbus->no_vra = 1;
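
For illustration only, the clock selection that the hunk above ends up
with can be modeled as a small standalone function. This is a sketch and
not driver code: pick_bus_clock(), its "current_clock" parameter and the
AC97_FALLBACK_CLOCK macro are hypothetical names introduced here, and it
assumes an out-of-range ac97_clock (e.g. 0) simply means "no usable
measurement".

```c
#include <stdbool.h>

/* Hypothetical name for the 48000 Hz value the patch uses on a VM. */
#define AC97_FALLBACK_CLOCK 48000u

/*
 * Model of the if / else-if chain in the patch (not the kernel function):
 *  - a measured clock within 8000..48000 Hz is used as-is;
 *  - otherwise, inside a VM, the clock is forced to 48000 Hz;
 *  - otherwise the bus clock is left at whatever it was (current_clock).
 */
static unsigned int pick_bus_clock(unsigned int ac97_clock, bool inside_vm,
				   unsigned int current_clock)
{
	if (ac97_clock >= 8000 && ac97_clock <= 48000)
		return ac97_clock;
	if (inside_vm)
		return AC97_FALLBACK_CLOCK;
	return current_clock;
}
```

Note that the VM fallback only applies when the measured value is out of
range; a valid measurement still takes precedence, as in the patch.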