Re: 2.1.95 lockups

Linus Torvalds (torvalds@transmeta.com)
11 Apr 1998 22:37:41 GMT


In article <13615.48833.506694.884880@scrye.com>,
Kevin Fenzi <kevin@scrye.com> wrote:
>
>[ I am cc'ing the author of xcdroast...it might need a change for post
> 2.1.95 kernels ]
>
>This is likely un-related to the other lockups people have been
>reporting, but I can reliably lock up my dual p166 machine under
>2.1.95 by running xcdroast. ;(

The SCSI ioctls are _not_ currently protected by the SMP lock: we're
still fixing some basic issues with normal IO (timeouts and bus resets).
Any unprotected region that calls into code that expects to be protected
is likely to get _very_ upset (because it will end up doing bad things
to the lock that it thought was held).
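
As a rough userspace illustration of that failure mode (not kernel code:
toy_lock, lowlevel_wait and both callers are made-up names, and the
"drop the lock while waiting, retake it afterwards" pattern is an
assumption about what the callee does), consider:

```c
#include <stdbool.h>

/* Toy stand-in for a spinlock: it only records whether it is held. */
struct toy_lock {
    bool held;
    int  misuse;    /* number of releases of a lock that was not held */
};

static void toy_lock_acquire(struct toy_lock *l)
{
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    if (!l->held)
        l->misuse++;    /* released a lock nobody was holding */
    l->held = false;
}

/*
 * Hypothetical low-level routine that assumes its caller holds the
 * lock: it drops the lock while waiting and retakes it before it
 * returns.
 */
static void lowlevel_wait(struct toy_lock *l)
{
    toy_lock_release(l);
    /* ... wait for the hardware here ... */
    toy_lock_acquire(l);
}

/* Correct caller: takes the lock around the call. */
static void protected_path(struct toy_lock *l)
{
    toy_lock_acquire(l);
    lowlevel_wait(l);
    toy_lock_release(l);
}

/* Broken caller: calls in without ever taking the lock. */
static void unprotected_path(struct toy_lock *l)
{
    lowlevel_wait(l);
}
```

In this sketch the unprotected caller releases a lock it never took and
then returns with the lock still held, so the next real acquire would
spin forever.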

>2.1.94 works fine (well, I didn't burn a cd, but it came up fine).
>
>I also have a report from a friend that the same thing is happening on
>their dual ppro 166.
>
>No oops, no nothing, just a hard lockup. Nothing ever comes up so it's
>likely when xcdroast is scanning for devices and whatnot.

It probably works fine on UP, because the lock goes away completely on
UP.
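
A minimal sketch of why that hides the bug, assuming a made-up TOY_SMP
build switch and toy_spin_* helpers (these are illustrative names, not
the kernel's actual definitions): when the switch is off, the lock
operations compile to nothing, so even taking the lock twice is
harmless.

```c
#include <stdatomic.h>

struct toy_spinlock {
    atomic_flag flag;
};

#ifdef TOY_SMP
/* SMP build: a real test-and-set spinlock. */
static void toy_spin_lock(struct toy_spinlock *l)
{
    while (atomic_flag_test_and_set(&l->flag))
        ;   /* spin until the holder clears the flag */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear(&l->flag);
}
#else
/* UP build: the lock operations do nothing at all. */
static void toy_spin_lock(struct toy_spinlock *l)   { (void)l; }
static void toy_spin_unlock(struct toy_spinlock *l) { (void)l; }
#endif

/*
 * Takes the same lock twice. Without TOY_SMP this returns 1
 * immediately; with TOY_SMP defined the second toy_spin_lock()
 * never returns.
 */
static int lock_twice(struct toy_spinlock *l)
{
    toy_spin_lock(l);
    toy_spin_lock(l);
    toy_spin_unlock(l);
    toy_spin_unlock(l);
    return 1;
}
```

Building the same file with -DTOY_SMP turns the harmless call into a
hang, which matches the "fine on UP, locks up on SMP" pattern.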

>I do have the generic scsi support compiled as a module, so this could
>be a kmod problem.

No, it's a generic SCSI problem right now. The sg code simply doesn't
do the right thing wrt the io_request_lock, and what ends up happening
is that something tries to lock the spinlock twice.
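
To make the double-lock case concrete, here is a small userspace sketch
using C11 atomic_flag. The toy_spin_* names are invented for the
example, and it uses a trylock so the failure shows up as a return value
instead of an actual hang:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_spinlock {
    atomic_flag flag;
};

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_spin_trylock(struct toy_spinlock *l)
{
    return !atomic_flag_test_and_set(&l->flag);
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear(&l->flag);
}

/*
 * Takes the lock, then tries to take it again from the same context.
 * Returns whether the second attempt succeeded. A plain spinning lock
 * would loop forever at that point, because the only context that
 * could release the lock is the one that is spinning.
 */
static bool second_acquire_succeeds(struct toy_spinlock *l)
{
    if (!toy_spin_trylock(l))
        return false;               /* lock was already held */

    bool second = toy_spin_trylock(l);

    toy_spin_unlock(l);
    return second;
}
```

The second attempt always fails here: a simple spinlock like this one
keeps no record of its owner, so it cannot tell a re-entrant caller from
another CPU.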

(The code is _really_ broken: it tries to keep the spinlock across a
sleep, for example).

Right now my #1 priority is to make sure that the basic disk operations
are safe. There's some work to be done for that still (timeouts are
currently not getting the lock correctly, so if you have a SCSI command
that gets lost for some reason the machine will lock on SMP). But it
_looks_ like fixing that should be fairly simple, and after that I'll
look into the sg.c code.

Linus
