Re: 2.6.17-rc6-mm1/pktcdvd - BUG: possible circular locking

From: Peter Osterlund
Date: Fri Jul 14 2006 - 07:22:56 EST


Arjan van de Ven <arjan@xxxxxxxxxxxxxxx> writes:

> On Thu, 2006-06-22 at 16:50 +0200, Peter Osterlund wrote:
> > Arjan van de Ven <arjan@xxxxxxxxxxxxxxx> writes:
> >
> > > Laurent Riffard wrote:
> > > > Hello,
> > > > This BUG happened while pktcdvd service was starting. Basically, the
> > > > 2 following commands were issued:
> > > > - modprobe pktcdvd
> > > > - pktsetup dvd /dev/dvd
> > >
> > > This appears to be a real bug:
> > >
> > > A normal pkt dvd block dev open takes the
> > > bdev_mutex in the regular block device open path, which takes
> > > ctl_mutex in the pkt_open function which gets called then from
> > > the block layer.
> > >
> > > HOWEVER the IOCTL path does it the other way around:
> > >
> > > mutex_lock(&ctl_mutex);
> > > ret = pkt_setup_dev(&ctrl_cmd);
> > > mutex_unlock(&ctl_mutex);
> > >
> > > where pkt_setup_dev in turn calls pkt_new_dev which
> > > calls blkdev_get(), which takes the bdev_mutex.
> > >
> > > Looks very much like an AB-BA deadlock to me...
> >
> > I don't understand how this could deadlock. If the device is already
> > set up, pkt_new_dev() returns before calling blkdev_get(). If the
> > device is not already set up, the block device doesn't exist yet so
> > there cannot be another caller in the pkt_open() path.
>
> And what locking prevents this? And via multiple opens?

You are right that my reasoning was incorrect. If someone is doing
"pktsetup ; pktsetup -d" quickly in a loop while someone else is
trying to open the device, one thread could be at the start of
pkt_open() at the same time as another thread is in pkt_new_dev().

However, I added a 5s delay in pkt_open() to enlarge the race window,
and I still couldn't make the driver lock up. The explanation is
that pkt_new_dev() calls blkdev_get() with the CD device (e.g. /dev/hdc)
as the bdev parameter, while do_open() locks the bd_mutex of the pktcdvd
device (e.g. /dev/pktcdvd/0), so the two paths take different bd_mutex
instances.
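
To illustrate the distinction, here is a minimal userspace sketch, not
kernel code. It records the two lock orderings discussed in this thread
and checks them for a cycle, once with each bd_mutex tracked as a
separate lock and once with both bd_mutex instances folded into a single
class. The assumption that the lock checker works per class rather than
per instance is mine, and every name here (lock_graph, add_order,
cycle_per_instance, cycle_per_class) is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 4

/* edge[a][b] is true if lock b was acquired while lock a was held. */
struct lock_graph {
	bool edge[MAX_LOCKS][MAX_LOCKS];
};

static void add_order(struct lock_graph *g, int held, int acquired)
{
	g->edge[held][acquired] = true;
}

/* Depth-first search: can we get from 'from' to 'target' along edges? */
static bool reaches(const struct lock_graph *g, int from, int target,
		    bool *seen)
{
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (!g->edge[from][next])
			continue;
		if (next == target)
			return true;
		if (!seen[next]) {
			seen[next] = true;
			if (reaches(g, next, target, seen))
				return true;
		}
	}
	return false;
}

/* A cycle in the ordering graph means an AB-BA style deadlock is possible. */
static bool has_cycle(const struct lock_graph *g)
{
	for (int n = 0; n < MAX_LOCKS; n++) {
		bool seen[MAX_LOCKS] = { false };

		if (reaches(g, n, n, seen))
			return true;
	}
	return false;
}

enum { CTL_MUTEX, BD_MUTEX_PKTCDVD, BD_MUTEX_CD };

/* Each bd_mutex tracked separately (e.g. /dev/pktcdvd/0 vs /dev/hdc). */
bool cycle_per_instance(void)
{
	struct lock_graph g;

	memset(&g, 0, sizeof(g));
	add_order(&g, BD_MUTEX_PKTCDVD, CTL_MUTEX); /* open path */
	add_order(&g, CTL_MUTEX, BD_MUTEX_CD);      /* setup ioctl path */
	return has_cycle(&g);
}

/* Assumption: both bd_mutex instances are treated as one lock class. */
bool cycle_per_class(void)
{
	const int BD_CLASS = BD_MUTEX_PKTCDVD;
	struct lock_graph g;

	memset(&g, 0, sizeof(g));
	add_order(&g, BD_CLASS, CTL_MUTEX); /* open path */
	add_order(&g, CTL_MUTEX, BD_CLASS); /* setup ioctl path */
	return has_cycle(&g);
}
```

Under these assumptions cycle_per_instance() finds no cycle while
cycle_per_class() does, which would be consistent with getting a warning
without being able to reproduce a lockup.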

Do you still think this could deadlock? If not, how should the code be
annotated to make this warning go away?

--
Peter Osterlund - petero2@xxxxxxxxx
http://web.telia.com/~u89404340