Re: [PATCH] fs: Fix mod_timer crash when removing USB sticks

From: Theodore Tso
Date: Fri Mar 16 2012 - 17:10:24 EST


I thought another fix went in at the USB layer that also attempted to
fix this problem for 3.2, and so with two separate band-aid patches, I
think we had assumed the problem had been addressed.

The real problem is that all of the patches which I've seen to date
are band-aids, in that we aren't properly sending a "device has
disappeared" notification to the file system layer, but instead we are
trying to keep enough of the pointers valid (while also freeing other
data structures), such that the file system can blindly write into a
partially dismantled block device, and hopefully not oops.

Some have argued that my suggested approach of having an explicit
super_ops revoke() function, which tells the file system that the
block device is gone, etc., isn't necessary because this can be solved
in userspace somehow. Personally I think that's nuts, since we'll
continue to play whack-a-mole, but I haven't had time to work up
patches addressing this --- since this is really only a problem for
naive users who pull USB sticks without unmounting them first (and so
it never happens to me :-), and I've got a lot of other fish to
fry.....
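
To make the revoke() idea a bit more concrete, here is a rough
user-space sketch, not kernel code and not a proposed patch. Every
name in it (toy_super_ops, toy_super_block, toy_device_removed, and so
on) is a hypothetical stand-in, and it assumes the removal path can
call into the file system before anything is torn down. It only shows
the shape of the notification: once the file system has been told the
device is gone, the write path fails cleanly instead of touching a
dismantled device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel definitions. */
struct toy_super_block;

struct toy_super_ops {
	/* Called when the backing device has disappeared. */
	void (*revoke)(struct toy_super_block *sb);
};

struct toy_super_block {
	const struct toy_super_ops *ops;
	bool device_gone;	/* set once revoke() has run */
	int writes_issued;	/* writes that reached the "device" */
};

/* The file system's reaction: remember that the device is gone. */
static void toy_fs_revoke(struct toy_super_block *sb)
{
	sb->device_gone = true;
}

static const struct toy_super_ops toy_fs_ops = {
	.revoke = toy_fs_revoke,
};

void toy_sb_init(struct toy_super_block *sb)
{
	assert(sb != NULL);
	sb->ops = &toy_fs_ops;
	sb->device_gone = false;
	sb->writes_issued = 0;
}

/* Removal path: tell the file system first, then tear things down. */
void toy_device_removed(struct toy_super_block *sb)
{
	if (sb->ops && sb->ops->revoke)
		sb->ops->revoke(sb);
}

/* Write path: fail with -EIO rather than write to a missing device. */
int toy_fs_write(struct toy_super_block *sb)
{
	if (sb->device_gone)
		return -EIO;
	sb->writes_issued++;
	return 0;
}
```

In this model a write before toy_device_removed() succeeds and any
write after it returns -EIO, so nothing depends on stale pointers
happening to remain valid.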

-- Ted

On Fri, Mar 16, 2012 at 3:43 PM, Greg KH <greg@xxxxxxxxx> wrote:
>
> On Fri, Mar 16, 2012 at 12:29:15PM -0700, Paul Taysom wrote:
> > On Fri, Mar 16, 2012 at 10:36 AM, Greg KH <greg@xxxxxxxxx> wrote:
> > >
> > > On Thu, Jan 12, 2012 at 01:57:11PM -0800, Paul Taysom wrote:
> > > > A USB stick with an ext file system on it would occasionally
> > > > crash the kernel when the stick was pulled.
> > > >
> > > > The problem was that a timer was being set on the Backing Device
> > > > Interface (bdi) after the USB device had been removed and the bdi
> > > > had been unregistered. The bdi would later be reinitialized by
> > > > zeroing the timer without removing the timer from the timer queue.
> > > > This would eventually result in a kernel crash (NULL pointer
> > > > dereference).
> > > >
> > > > When the bdi is unregistered, the dev field is set to NULL. This
> > > > indication is used by bdi_unregister to only unregister the device
> > > > once.
> > > >
> > > > Fix: When the backing device is invalidated, the mapping's
> > > > backing_dev_info should be redirected to the
> > > > default_backing_dev_info.
> > > >
> > > > Created 3 USB sticks: one with ext2, one with ext4, and one with
> > > > both Apple and DOS file systems on it. Inserted and removed the
> > > > USB sticks many times in random order. Without the bug fix, the
> > > > kernel would soon crash. With the fix, it did not. Ran on both
> > > > stumpy and amd64-generic.
> > > >
> > > > Signed-off-by: Paul Taysom <taysom@xxxxxxxxxxxx>
> > > > Cc: Mandeep Baines <msb@xxxxxxxxxxxx>
> > > > Cc: Greg KH <greg@xxxxxxxxx>
> > > > Cc: Jens Axboe <axboe@xxxxxxxxx>
> > > > Cc: Theodore Tso <tytso@xxxxxxxxxx>
> > > > Cc: Andrew Morton <akpm@xxxxxxxxxx>
> > > > Cc: <linux-usb@xxxxxxxxxxxxxxx>
> > > > Cc: <linux-kernel@xxxxxxxxxxxxxxx>
> > > > Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
> > > > Cc: <linux-fsdevel@xxxxxxxxxxxxxxx>
> > > > Cc: <stable@xxxxxxxxxx>
> > > > ---
> > > >  fs/block_dev.c |    1 +
> > > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > > > index afe74dd..322cd05 100644
> > > > --- a/fs/block_dev.c
> > > > +++ b/fs/block_dev.c
> > > > @@ -110,6 +110,7 @@ void invalidate_bdev(struct block_device *bdev)
> > > >        * But, for the strange corners, lets be cautious
> > > >        */
> > > >       cleancache_flush_inode(mapping);
> > > > +     mapping->backing_dev_info = &default_backing_dev_info;
> > > >  }
> > > >  EXPORT_SYMBOL(invalidate_bdev);
> > >
> > > Whatever happened to this patch?  Is it still needed?  Can you still
> > > reproduce the problem on Linus's tree and older kernels?
> > >
> >
> >
> > Never heard anything back.  Ted supplied a partial fix in 3.2.6 (I
> > believe) for just the ext4 file system. Who should I follow up with?
>
> If the fix went into the 3.2-stable tree, then it's in Linus's tree
> already, which is good.
>
> But, what about all of the other filesystems you hit this on, do we need
> to make the same change to all of them?  If so, that kind of implies
> your original patch is the correct one :)
>
> As for who to poke, Ted, Al, Jens, what should we do here?
>
> thanks,
>
> greg k-h
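
For reference, the failure mode in Paul's quoted description (a queued
timer being zeroed without first being taken off the timer queue) can
be modeled in user space. This is only a sketch under stated
assumptions: toy_timer, toy_queue and the helpers below are
hypothetical names, the queue is a plain doubly linked list rather
than the kernel's timer wheel, and timers are assumed to be
zero-initialized before first use. Instead of actually crashing, the
model counts queued entries whose callback has been wiped, which is
what a real expiry path would end up calling through.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; not the kernel's struct timer_list. */
struct toy_timer {
	struct toy_timer *next, *prev;	/* links in the pending queue */
	void (*fn)(struct toy_timer *);	/* callback run on expiry */
};

struct toy_queue {
	struct toy_timer head;		/* sentinel node of a circular list */
};

void toy_noop(struct toy_timer *t)
{
	(void)t;
}

void toy_queue_init(struct toy_queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->head.fn = NULL;
}

/* Queue a timer at the tail of the pending list. */
void toy_timer_add(struct toy_queue *q, struct toy_timer *t,
		   void (*fn)(struct toy_timer *))
{
	assert(fn != NULL);
	t->fn = fn;
	t->next = &q->head;
	t->prev = q->head.prev;
	q->head.prev->next = t;
	q->head.prev = t;
}

/* Unlink a timer if it is queued; harmless if it is not. */
void toy_timer_del(struct toy_timer *t)
{
	if (t->next == NULL)
		return;
	t->prev->next = t->next;
	t->next->prev = t->prev;
	t->next = t->prev = NULL;
}

/* Buggy re-init: wipes the timer while the queue still points at it. */
void toy_timer_reinit_buggy(struct toy_timer *t)
{
	memset(t, 0, sizeof(*t));
}

/* Safer re-init: take the timer off the queue before wiping it. */
void toy_timer_reinit_safe(struct toy_timer *t)
{
	toy_timer_del(t);
	memset(t, 0, sizeof(*t));
}

/* Count queued entries with no callback; stops at a severed link. */
int toy_queue_count_broken(const struct toy_queue *q)
{
	int broken = 0;
	const struct toy_timer *t;

	for (t = q->head.next; t != NULL && t != &q->head; t = t->next)
		if (t->fn == NULL)
			broken++;
	return broken;
}
```

After toy_timer_reinit_buggy() the queue still holds one entry with a
NULL callback, whereas toy_timer_reinit_safe() leaves the queue empty.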