Re: More data on ide-cd "playing music" death

Jens Axboe
Tue, 01 Dec 1998 17:22:38 +0100

Romano Giannetti wrote:

> On Mon, Nov 30, 1998 at 10:51:23PM +0100, Jens Axboe wrote:
> > "Andre M. Hedrick" wrote:
> > > I can reproduce (and have reproduced) the report and was about to bug you also.
> > > A little more details........there are only problems if the drive tries to
> > > run in UDMA mode period........this may also react the same way if DMA is
> > > enabled regardless of mode...........
> >
> > Alright, I hadn't noticed this.
> Sorry... but I run with hdc=noautotune and then I do a hdparm -u1 -d0
> /dev/hdc in boot scripts. Look:
> /dev/hdc:
> HDIO_GET_MULTCOUNT failed: Invalid argument
> I/O support = 0 (default 16-bit)
> unmaskirq = 1 (on)
> using_dma = 0 (off)
> keepsettings = 0 (off)
> HDIO_GET_NOWERR failed: Invalid argument
> readonly = 1 (on)
> readahead = 8 (on)
> HDIO_GETGEO failed: Invalid argument

I'll let Andre comment on this...

> BAD NEWS, Jens: with debug=0 I can trigger the bug... well, I think. I
> was listening to music and the PC froze completely. Nothing in the log.
> Black screen (under X), no console switching, no Alt SysRq s. Only
> Alt SysRq b worked and rebooted the system.
> It seems that the reset completely hosed the I/O.

Do you have anything else on the secondary controller? Did the system
respond to any of the other sysrq's, specifically sysrq+p? This might
indicate that Andre's suggestion regarding variable timeouts would
be useful. I hope to get my hands on a "newer" CD-ROM so I can do
some testing on my own.

> Ah. A thing I noticed yesterday. It seems that, when I triggered the
> bug (plain 2.1.130), the programs locked in the D state were in
> down_failed or something like this. I guess that the problem is that
> when we lose an interrupt, we should report it and try to continue;
> but with the original code after that every time we see another
> interrupt we report a lost one. Is it possible (wild guess) that
> we simply forgot to increment/decrement/unlock something?

Andre, what do we currently do with a lost interrupt?
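
To make Romano's guess concrete, here is a minimal user-space sketch of
the failure mode he describes. It is not the ide-cd code: the struct, the
field names and the functions (fake_drive, on_timeout, on_interrupt,
simulate) are all hypothetical, and it only assumes that the timeout path
sets some "lost" state that a later path is supposed to clear.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-drive state; NOT the real ide-cd structures. */
struct fake_drive {
    bool waiting_for_irq;  /* a command is outstanding */
    bool irq_lost;         /* set by the timeout path */
    int  lost_reports;     /* how many times "lost interrupt" was reported */
};

/* Timeout path: the expected interrupt never arrived. */
static void on_timeout(struct fake_drive *d, bool clear_state)
{
    d->irq_lost = true;
    d->lost_reports++;
    if (clear_state) {
        /* Recovery: drop the dead command so later interrupts are normal. */
        d->waiting_for_irq = false;
        d->irq_lost = false;
    }
}

/* Interrupt path: with stale state, every later interrupt looks lost too. */
static void on_interrupt(struct fake_drive *d)
{
    if (d->irq_lost) {
        d->lost_reports++;
        return;
    }
    d->waiting_for_irq = false;
}

/* One timeout followed by n ordinary interrupts; returns the report count. */
static int simulate(bool clear_state, int n)
{
    struct fake_drive d = { .waiting_for_irq = true };

    on_timeout(&d, clear_state);
    for (int i = 0; i < n; i++) {
        d.waiting_for_irq = true;  /* a new command is issued */
        on_interrupt(&d);
    }
    return d.lost_reports;
}
```

With the state left stale, simulate(false, 5) counts 6 reports (the
timeout plus one per later interrupt); with the state cleared,
simulate(true, 5) counts only the original one. Whether the driver
actually has such a stale flag or counter is exactly the open question.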

