Re: blk: NULL ptr deref in blk_dequeue_request()

From: Ben Hutchings
Date: Wed Oct 10 2012 - 11:53:02 EST


On Tue, 2012-10-09 at 09:26 -0400, Sasha Levin wrote:
> On 10/09/2012 09:21 AM, Sasha Levin wrote:
> > On 10/08/2012 05:45 PM, Jiri Kosina wrote:
> >> On Mon, 8 Oct 2012, Jan Kara wrote:
> >>
> >>>>>> I'm still seeing this on linux-next.
> >>>> I think this is floppy related (see redo_fd_request() in the stack
> >>>> trace). And there were quite some changes to the area recently. Adding
> >>>> maintainer to CC.
> >> Hmm ... I don't immediately see how this is happening.
> >>
> >> Sasha, could you please do git bisect on drivers/block/floppy.c between
> >> f6365201d and your git HEAD for starters (assuming that f6365201d works
> >> well for you?).
> >>
> >
> > A bisect on floppy.c yielded the following:
> >
> > b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
> > commit b33d002f4b6bae912463e5a66387c498aa69b6fe
> > Author: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
> > Date: Mon Aug 27 20:56:53 2012 -0300
> >
> > genhd: Make put_disk() safe for disks that have not been registered
>
> 2 more things:
>
> 1. The guest vm which I'm testing on doesn't emulate anything which even looks like a floppy.
> 2. I'm seeing the following lines before the BUG:
>
> [ 9.836604] floppy0: no floppy controllers found
> [ 9.837246] work still pending
> [ 9.837743] floppy0: floppy_shutdown: timeout handler died.

I see two problems:

1. redo_fd_request() races with tear-down of the disks, but because
set_next_request() checks disk->queue before doing anything this was
usually harmless. Now that do_floppy_init() doesn't clear disk->queue,
the race condition is much easier to hit. This may fix that problem in
do_floppy_init(), though there appear to be worse bugs in tear-down
order in floppy_module_exit():

--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -4320,13 +4320,13 @@ out_unreg_region:
out_unreg_blkdev:
unregister_blkdev(FLOPPY_MAJOR, "fd");
out_put_disk:
+ destroy_workqueue(floppy_wq);
while (dr--) {
del_timer_sync(&motor_off_timer[dr]);
if (disks[dr]->queue)
blk_cleanup_queue(disks[dr]->queue);
put_disk(disks[dr]);
}
- destroy_workqueue(floppy_wq);
return err;
}

--- END ---

2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
cleared by del_gendisk(). Incremental patch below, but it should be
squashed into the previous patch if that branch is still rebase-able.

Ben.

---
From: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
Date: Wed, 10 Oct 2012 16:17:01 +0100
Subject: [PATCH] genhd: Make put_disk() safe again for disks that *have* been
registered

Commit b33d002 ('genhd: Make put_disk() safe for disks that have not
been registered') wrongly used the GENHD_FL_UP flag to test whether a
disk held a reference to its queue. Since this is cleared by
del_gendisk(), queues will not be properly cleaned up if a disk has
been registered and then torn down in the normal way. Introduce a
new flag for this purpose.

Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
block/genhd.c | 7 +++++--
include/linux/genhd.h | 1 +
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 633751d..b5f482f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -617,7 +617,10 @@ void add_disk(struct gendisk *disk)
* Take an extra ref on queue which will be put on disk_release()
* so that it sticks around as long as @disk is there.
*/
- WARN_ON_ONCE(!blk_get_queue(disk->queue));
+ if (blk_get_queue(disk->queue))
+ disk->flags |= GENHD_FL_GOT_QUEUE;
+ else
+ WARN_ON(1);

retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj,
"bdi");
@@ -1105,7 +1108,7 @@ static void disk_release(struct device *dev)
disk_replace_part_tbl(disk, NULL);
free_part_stats(&disk->part0);
free_part_info(&disk->part0);
- if (disk->queue && disk->flags & GENHD_FL_UP)
+ if (disk->queue && disk->flags & GENHD_FL_GOT_QUEUE)
blk_put_queue(disk->queue);
kfree(disk);
}
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 4f440b3..7c2560c 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -134,6 +134,7 @@ struct hd_struct {
#define GENHD_FL_NATIVE_CAPACITY 128
#define GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE 256
#define GENHD_FL_NO_PART_SCAN 512
+#define GENHD_FL_GOT_QUEUE 1024

enum {
DISK_EVENT_MEDIA_CHANGE = 1 << 0, /* media changed */


--
Ben Hutchings
Who are all these weirdos? - David Bowie, about L-Space IRC channel #afp

Attachment: signature.asc
Description: This is a digitally signed message part