Re: Oops/Warning report for the week of March 28th 2008

From: Björn Steinbrink
Date: Sat Mar 29 2008 - 08:20:33 EST


On 2008.03.28 17:16:42 -0400, Dmitry Torokhov wrote:
> On Fri, Mar 28, 2008 at 01:51:38PM -0700, Linus Torvalds wrote:
> >
> >
> > On Fri, 28 Mar 2008, Linus Torvalds wrote:
> > >
> > > Is there something obvious that I'm missing? I'd really like to see the
> > > whole posting that the oops came from. Do you save the originals or even
> > > just message ID's from the ones you pick from emails?
> >
> > Hmm. Definitely not from the kernel mailing list. I'm intrigued, where did
> > that oops #5814 come from (picked a recent one at random)?
> >
> > The thing is recent, and oopses on "mutex_lock(dev->mutex)" in
> > input_release_device. In particular, the path *seems* to be this one:
> >
> > evdev_release ->
> > evdev_ungrab ->
> > input_release_device ->
> > mutex_lock ->
> > mutex_lock_nested ->
> > __mutex_lock_common ->
> > list_add_tail(&waiter.list, &lock->wait_list)
> >
> > where "lock->wait_list.prev" seems to be 0x6b6b6b6b6b6b6b6b, which is the
> > use-after-free poison pattern.
> >
> > (In fact, I think the access that actually oopses is when the
> > debug version of __list_add() does
> >
> > if (unlikely(prev->next != next)) {
> >
> > because that "prev" pointer is crap).
> >
> > So it seems that when input_release_device() does:
> >
> > struct input_dev *dev = handle->dev;
> >
> > mutex_lock(&dev->mutex);
> >
> > the "dev" it uses has already been released. And this only shows up as a
> > problem when you have slab debugging turned on (like the Fedora kernels
> > do, thank you all Fedora guys).
> >
> > The odd thing is that I don't think any of this code has really changed
> > recently.
> >
>
> There is a patch from Pete that works around the problem by not
> calling input_release_device() on devices that are gone. But what
> I don't understand is why the parent input device is gone since
> sysfs/driver core should be keeping a reference to it since it is
> a parent of evdev. input_dev should only be released after
> evdev_free() is called.

Hm? evdev_free only does the final kfree call. The calls to device_del
and put_device already happen in evdev_disconnect, so the parent
can go away any time after that. Are you saying that those calls should
be moved into evdev_free instead? I'm not familiar with the code, but at
first sight I'd say that we should have an "if (evdev->grab)
evdev_ungrab(evdev, evdev->grab)" in evdev_cleanup, which looks like the
logical place to do that. Anything I'm missing?
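
To make the ordering concrete, here is a rough user-space model of what I
mean. This is only a sketch and not kernel code: every model_* name is made
up for illustration, the "freed" flag stands in for the real kfree so the
stale access can be counted instead of crashing, and the single refcount is
a simplification of what the driver core actually tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parent input device. */
struct model_input_dev {
    int refcount;
    bool freed;          /* set instead of really freeing the object */
    int stale_accesses;  /* counts accesses made after "freeing" */
};

/* Hypothetical stand-in for the evdev child and its grab state. */
struct model_evdev {
    struct model_input_dev *parent;
    bool grabbed;
    bool exist;
};

/* Drop one reference; the last put marks the parent as freed. */
static void model_put(struct model_input_dev *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount == 0)
        dev->freed = true;
}

/* Plays the role of the ungrab path, which touches the parent's lock. */
static void model_ungrab(struct model_evdev *evdev)
{
    if (evdev->parent->freed)
        evdev->parent->stale_accesses++;  /* would be the oops */
    evdev->grabbed = false;
}

/* Cleanup step; optionally releases the grab while the parent is alive. */
static void model_cleanup(struct model_evdev *evdev, bool ungrab_in_cleanup)
{
    evdev->exist = false;
    if (ungrab_in_cleanup && evdev->grabbed)
        model_ungrab(evdev);
}

/* Disconnect: clean up, then drop the reference that pinned the parent. */
static void model_disconnect(struct model_evdev *evdev, bool ungrab_in_cleanup)
{
    model_cleanup(evdev, ungrab_in_cleanup);
    model_put(evdev->parent);
}

/* Later close of the file descriptor: releases any grab still held. */
static void model_release(struct model_evdev *evdev)
{
    if (evdev->grabbed)
        model_ungrab(evdev);
}

/* Runs disconnect followed by release; returns the stale access count. */
static int model_run(bool ungrab_in_cleanup)
{
    struct model_input_dev dev = { .refcount = 1, .freed = false,
                                   .stale_accesses = 0 };
    struct model_evdev evdev = { .parent = &dev, .grabbed = true,
                                 .exist = true };

    model_disconnect(&evdev, ungrab_in_cleanup);
    model_release(&evdev);
    return dev.stale_accesses;
}
```

In this model, model_run(false) reports one stale access (the grab is
released only after the parent is gone), while model_run(true) reports
none, because the grab is dropped in the cleanup step before the last
reference goes away.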

Björn