[PATCH RESEND 2/3 v2.6.39-rc7] block: make disk_block_events() properly wait for work cancellation

From: Tejun Heo
Date: Tue May 17 2011 - 06:29:02 EST


disk_block_events() should guarantee that the event work is not in
flight on return and that, once events are blocked, it issues no
further cancellations.

Because there was no synchronization between the first blocker doing
cancel_delayed_work_sync() and the following blockers, the following
blockers could finish before cancellation was complete, which broke
both guarantees - event work could be in flight and cancellation could
happen after return.

This bug triggered WARN_ON_ONCE() in disk_clear_events() reported in
bug#34662.

https://bugzilla.kernel.org/show_bug.cgi?id=34662

Fix it by introducing the DISK_EVENT_CANCELING bit, which the first
blocker sets while cancellation is in progress. Further blockers
wait until the first blocker clears the bit.
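
For readers outside the kernel tree, the handshake described above can
be sketched in userspace. This is only an illustration under stated
assumptions, not the patch itself: struct events, events_block(),
events_unblock(), count_cancel() and EV_CANCELING are hypothetical
names, a pthread mutex and condition variable stand in for the
spinlock and wait_on_bit()/wake_up_bit(), and the cancel_sync callback
stands in for cancel_delayed_work_sync().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Top bit of the depth counter doubles as the "cancel in progress" flag. */
#define EV_CANCELING 0x80000000U

/* Hypothetical userspace stand-in for struct disk_events. */
struct events {
	pthread_mutex_t lock;
	pthread_cond_t canceling_done;
	unsigned int block;		/* blocking depth | EV_CANCELING */
	void (*cancel_sync)(void *);	/* assumed synchronous cancel hook */
	void *arg;
};

/* Example cancel hook: counts how many times cancellation ran. */
static void count_cancel(void *arg)
{
	++*(int *)arg;
}

static void events_block(struct events *ev)
{
	bool cancel;

	/* Bump the depth; the first blocker also marks cancel in progress. */
	pthread_mutex_lock(&ev->lock);
	cancel = (ev->block++ == 0);
	if (cancel)
		ev->block |= EV_CANCELING;
	pthread_mutex_unlock(&ev->lock);

	if (cancel) {
		/* May sleep, so it runs without the lock held. */
		ev->cancel_sync(ev->arg);

		pthread_mutex_lock(&ev->lock);
		ev->block &= ~EV_CANCELING;
		pthread_cond_broadcast(&ev->canceling_done);
		pthread_mutex_unlock(&ev->lock);
	} else {
		/* Later blockers must not return before the cancel is done. */
		pthread_mutex_lock(&ev->lock);
		while (ev->block & EV_CANCELING)
			pthread_cond_wait(&ev->canceling_done, &ev->lock);
		pthread_mutex_unlock(&ev->lock);
	}
}

/* Drop one blocking level; returns true when the depth reaches zero. */
static bool events_unblock(struct events *ev)
{
	bool last;

	pthread_mutex_lock(&ev->lock);
	assert((ev->block & ~EV_CANCELING) > 0);
	last = (--ev->block == 0);
	pthread_mutex_unlock(&ev->lock);
	return last;
}
```

Because the flag lives in the same word as the depth, a later blocker
sees a non-zero counter, never takes the cancel path, and returns only
after the first blocker has finished the synchronous cancel.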

Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
Tested-by: Sitsofe Wheeler <sitsofe@xxxxxxxxx>
Reported-by: Sitsofe Wheeler <sitsofe@xxxxxxxxx>
Reported-by: Borislav Petkov <bp@xxxxxxxxx>
Reported-by: Meelis Roos <mroos@xxxxxxxx>
Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: Kay Sievers <kay.sievers@xxxxxxxx>
---
block/genhd.c | 36 +++++++++++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 3 deletions(-)

Index: work/block/genhd.c
===================================================================
--- work.orig/block/genhd.c
+++ work/block/genhd.c
@@ -1371,7 +1371,7 @@ struct disk_events {
 	struct gendisk *disk; /* the associated disk */
 	spinlock_t lock;
 
-	int block; /* event blocking depth */
+	unsigned int block; /* event blocking depth */
 	unsigned int pending; /* events already sent out */
 	unsigned int clearing; /* events being cleared */

@@ -1379,6 +1379,8 @@ struct disk_events {
 	struct delayed_work dwork;
 };
 
+#define DISK_EVENT_CANCELING 0x80000000U
+
 static const char *disk_events_strs[] = {
 	[ilog2(DISK_EVENT_MEDIA_CHANGE)] = "media_change",
 	[ilog2(DISK_EVENT_EJECT_REQUEST)] = "eject_request",
@@ -1414,6 +1416,12 @@ static unsigned long disk_events_poll_ji
 	return msecs_to_jiffies(intv_msecs);
 }
 
+static int disk_block_wait_canceling(void *word)
+{
+	schedule();
+	return 0;
+}
+
 /**
  * disk_block_events - block and flush disk event checking
  * @disk: disk to block events for
@@ -1438,12 +1446,34 @@ void disk_block_events(struct gendisk *d
 	if (!ev)
 		return;
 
+	/*
+	 * Bump block count and set CANCELING if we're the first blocker
+	 * and have to cancel the event work.
+	 */
 	spin_lock_irqsave(&ev->lock, flags);
-	cancel = !ev->block++;
+	if ((cancel = !ev->block++))
+		ev->block |= DISK_EVENT_CANCELING;
 	spin_unlock_irqrestore(&ev->lock, flags);
 
-	if (cancel)
+	if (cancel) {
+		/*
+		 * Cancel the event work, clear CANCELING and wake up
+		 * waiters.
+		 */
 		cancel_delayed_work_sync(&disk->ev->dwork);
+
+		spin_lock_irqsave(&ev->lock, flags);
+		ev->block &= ~DISK_EVENT_CANCELING;
+		spin_unlock_irqrestore(&ev->lock, flags);
+		wake_up_bit(&ev->block, ilog2(DISK_EVENT_CANCELING));
+	} else {
+		/*
+		 * The first blocker might not have finished canceling the
+		 * event work. Wait for CANCELING to clear.
+		 */
+		wait_on_bit(&ev->block, ilog2(DISK_EVENT_CANCELING),
+			    disk_block_wait_canceling, TASK_UNINTERRUPTIBLE);
+	}
 }
 
 static void __disk_unblock_events(struct gendisk *disk, bool check_now)