Re: [PATCH v4] block/loop: Serialize ioctl operations.
From: Jan Kara
Date: Mon Sep 24 2018 - 12:31:20 EST

On Mon 24-09-18 22:05:20, Tetsuo Handa wrote:
> On 2018/09/24 21:31, Jan Kara wrote:
> > On Mon 24-09-18 19:29:10, Tetsuo Handa wrote:
> >> On 2018/09/24 7:03, Ming Lei wrote:
> >>> On Sat, Sep 22, 2018 at 09:39:02PM +0900, Tetsuo Handa wrote:
> >>>> Hello, Ming Lei.
> >>>>
> >>>> I'd like to hear your comment on this patch regarding the order in
> >>>> which the kernel thread is stopped.
> >>>>
> >>>> > In order to enforce this strategy, this patch reversed the order of
> >>>> > loop_reread_partitions() and loop_unprepare_queue() in loop_clr_fd().
> >>>> > I don't know whether it breaks something, but I don't have test cases.
> >>>>
> >>>> Until 3.19, kthread_stop(lo->lo_thread) was called before
> >>>> ioctl_by_bdev(bdev, BLKRRPART, 0).
> >>>> From 4.0 to 4.3, the loop module used the "kloopd" workqueue.
> >>>> But since 4.4, loop_reread_partitions(lo, bdev) has been called before
> >>>> loop_unprepare_queue(lo). This patch changes that so that
> >>>> loop_unprepare_queue() is called before loop_reread_partitions().
> >>>> Is there some reason we need to preserve the current ordering?
> >>>
> >>> IMO, both orders are fine. What matters is that 'lo->lo_state' is
> >>> updated before loop_reread_partitions(), so that any IO issued from
> >>> loop_reread_partitions() will fail. The order between
> >>> loop_reread_partitions() and loop_unprepare_queue() therefore shouldn't
> >>> be a big deal.
> >>
> >> OK. Thank you. Here is v4 patch (only changelog was updated).
> >> Andrew, can we test this patch in the -mm tree?
> >>
> >> From 2278250ac8c5b912f7eb7af55e36ed40e2f7116b Mon Sep 17 00:00:00 2001
> >> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> >> Date: Mon, 24 Sep 2018 18:58:37 +0900
> >> Subject: [PATCH v4] block/loop: Serialize ioctl operations.
> >>
> >> syzbot is reporting a NULL pointer dereference [1] which is caused by
> >> a race between ioctl(loop_fd, LOOP_CLR_FD, 0) and
> >> ioctl(other_loop_fd, LOOP_SET_FD, loop_fd), due to traversing other
> >> loop devices without holding the corresponding locks.
> >>
> >> syzbot is also reporting a circular locking dependency between
> >> bdev->bd_mutex and lo->lo_ctl_mutex [2] which is caused by calling
> >> blkdev_reread_part() with the lock held.
> >
> > Thanks for looking into the loop crashes Tetsuo. I was looking into the
> > loop code and trying to understand how your patch fixes them but I've
> > failed. Can you please elaborate a bit on how exactly LOOP_CLR_FD and
> > LOOP_SET_FD race to hit NULL pointer dereference? I don't really see the
> > code traversing other loop devices as you mention in your changelog so I'm
> > probably missing something. Thanks.
> >
>
> That is explained in the discussion of [1] at
> https://groups.google.com/forum/#!msg/syzkaller-bugs/c8KUcTAzTvA/3o_7g6-tAwAJ .
> In the current code, the dangerous traversal is located in
> loop_validate_file().

OK, thanks for the explanation! I'll send some comments in reply to your patch.

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
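
The race described in the quoted changelog (LOOP_SET_FD walking other loop
devices while LOOP_CLR_FD tears one of them down) can be illustrated with a
small userspace analogue. The following is a minimal sketch, assuming that a
single global mutex serializes all control operations. It is not the code from
the patch, and every name in it (sketch_loop, sketch_ctl_mutex,
sketch_validate, sketch_set_fd, sketch_clr_fd, sketch_demo) is hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a loop device; NOT the kernel's struct loop_device. */
struct sketch_loop {
    struct sketch_loop *backing; /* backing loop device, or NULL for a regular file */
    int bound;                   /* nonzero while a backing file is attached */
};

/* Assumption of this sketch: one global lock covers every control operation. */
static pthread_mutex_t sketch_ctl_mutex = PTHREAD_MUTEX_INITIALIZER;

static struct sketch_loop dev_a, dev_b;

/*
 * Walk the chain of backing devices. The caller must hold sketch_ctl_mutex,
 * so no device in the chain can be unbound in the middle of the walk.
 */
static int sketch_validate(const struct sketch_loop *self,
                           const struct sketch_loop *target)
{
    for (const struct sketch_loop *l = target; l != NULL; l = l->backing) {
        if (l == self || !l->bound)
            return -1; /* would recurse, or the device is unbound */
    }
    return 0;
}

/* Analogue of LOOP_SET_FD: bind self on top of target (NULL = regular file). */
static int sketch_set_fd(struct sketch_loop *self, struct sketch_loop *target)
{
    int ret = -1;

    pthread_mutex_lock(&sketch_ctl_mutex);
    if (!self->bound && sketch_validate(self, target) == 0) {
        self->backing = target;
        self->bound = 1;
        ret = 0;
    }
    pthread_mutex_unlock(&sketch_ctl_mutex);
    return ret;
}

/* Analogue of LOOP_CLR_FD: detach the backing file under the same lock. */
static int sketch_clr_fd(struct sketch_loop *self)
{
    int ret = -1;

    pthread_mutex_lock(&sketch_ctl_mutex);
    if (self->bound) {
        self->backing = NULL;
        self->bound = 0;
        ret = 0;
    }
    pthread_mutex_unlock(&sketch_ctl_mutex);
    return ret;
}

/* Repeatedly bind and unbind dev_a on a regular file. */
static void *sketch_flip_a(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sketch_set_fd(&dev_a, NULL);
        sketch_clr_fd(&dev_a);
    }
    return NULL;
}

/* Repeatedly try to stack dev_b on dev_a, which walks dev_a's chain. */
static void *sketch_flip_b(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sketch_set_fd(&dev_b, &dev_a);
        sketch_clr_fd(&dev_b);
    }
    return NULL;
}

/* Returns 0 when every check passes, otherwise the number of the failed step. */
int sketch_demo(void)
{
    pthread_t ta, tb;

    if (sketch_set_fd(&dev_a, NULL) != 0) return 1;   /* a on a regular file */
    if (sketch_set_fd(&dev_b, &dev_a) != 0) return 2; /* b stacked on a */
    if (sketch_clr_fd(&dev_a) != 0) return 3;
    if (sketch_set_fd(&dev_a, &dev_b) == 0) return 4; /* a -> b -> a is rejected */
    if (sketch_clr_fd(&dev_b) != 0) return 5;

    if (pthread_create(&ta, NULL, sketch_flip_a, NULL) != 0) return 6;
    if (pthread_create(&tb, NULL, sketch_flip_b, NULL) != 0) {
        pthread_join(ta, NULL);
        return 7;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    /* Each thread ends with a clear, so both devices must be unbound. */
    return (dev_a.bound || dev_b.bound) ? 8 : 0;
}
```

Calling sketch_demo() from a main() built with -pthread exercises both paths
concurrently. Because the chain walk and the teardown take the same lock, the
walk never observes a device that is half torn down. A kernel-side fix has the
additional constraint, noted in the changelog, of not holding such a lock
across the partition reread; this sketch does not model that.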