Re: debugfs vs. device removal

From: Omar Sandoval
Date: Thu Jan 19 2017 - 14:47:16 EST


On Thu, Jan 19, 2017 at 07:03:52PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 19, 2017 at 09:33:50AM -0800, Omar Sandoval wrote:
> > On Thu, Jan 19, 2017 at 05:03:48PM +0100, Jiri Kosina wrote:
> > > On Thu, 19 Jan 2017, Greg Kroah-Hartman wrote:
> > >
> > > > > In the block layer, we abuse sysfs to export some per-device debugging
> > > > > information. I was looking into moving this to debugfs, but I realized
> > > > > that debugfs doesn't have a mechanism to ensure that a file associated
> > > > > with a device is safe to use when the device is removed.
> > > >
> > > > What do you mean by "safe"? The race conditions where you remove a file
> > > > and still have it open should all now be resolved in 4.8 and 4.9, did we
> > > > miss something?
> > >
> > > This is something else -- Omar is right, hid-debugfs interface is buggy.
> > > It basically doesn't synchronize the data dumping with device removal, so
> > > if device is removed and deallocated and the race is hit, it tries to
> > > dereference struct hid_device which has already been freed.
> >
> > Yup, I'm talking about the case where I create a debugfs file and the
> > data pointer is, say, a struct request_queue. If userspace calls open()
> > on a debugfs file, then the device goes away, the struct request_queue
> > is going to get freed and read() will blow up.
> >
> > If we're talking about objects with a struct kobject (like struct
> > request_queue), can we just grab an extra reference in open() and drop
> > it in release()? This allows userspace to keep stuff pinned
> > indefinitely, but debugfs is root-only and the use-case is usually just
> > `cat`.
>
> Again, debugfs got a bunch of changes in the 4.8 and 4.9 timeframe to
> resolve this issue. Try it and see with just a "normal" debugfs file
> and see how it works.

The change in this area that I see is 49d200deaa68 ("debugfs: prevent
access to removed files' private data"). That went in for 4.7. I'm
pretty confused now since I can't reproduce the oops anymore on either
4.8 or 4.10-rc4. If I see it again I'll be sure to report it, but it
seems like debugfs should just work for what I need. Thanks for the
help, Greg.
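
For reference, the open()/release() pinning idea quoted above can be
modelled in plain userspace C. This is only a rough sketch, not kernel
code: struct fake_queue, struct fake_file and the dbg_* helpers are all
made-up names, and the integer refcount merely stands in for a struct
kobject reference. It shows why read() stays safe after the device
drops its own reference while the file is still open.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object that embeds a struct kobject. */
struct fake_queue {
	int refcount;     /* models the kobject reference count */
	int nr_requests;  /* data a debug file would dump */
};

/* Hypothetical stand-in for an open file with a private data pointer. */
struct fake_file {
	struct fake_queue *private_data;
};

static int freed_count; /* number of objects actually freed so far */

static struct fake_queue *queue_alloc(int nr_requests)
{
	struct fake_queue *q = malloc(sizeof(*q));

	assert(q != NULL);
	q->refcount = 1; /* reference held by the "device" */
	q->nr_requests = nr_requests;
	return q;
}

static void queue_get(struct fake_queue *q)
{
	q->refcount++;
}

static void queue_put(struct fake_queue *q)
{
	/* Free only when the last reference goes away. */
	if (--q->refcount == 0) {
		freed_count++;
		free(q);
	}
}

/* open(): pin the object for as long as the file stays open. */
static void dbg_open(struct fake_file *f, struct fake_queue *q)
{
	queue_get(q);
	f->private_data = q;
}

/* read(): safe, because open() took a reference. */
static int dbg_read(const struct fake_file *f)
{
	return f->private_data->nr_requests;
}

/* release(): drop the reference taken in open(). */
static void dbg_release(struct fake_file *f)
{
	queue_put(f->private_data);
	f->private_data = NULL;
}
```

The trade-off is the one already mentioned: a file held open keeps the
object pinned until release(), however long that takes.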