[PATCH RESEND 1/3 v2.6.39-rc7] block: don't use non-syncing event blocking in disk_check_events()

From: Tejun Heo
Date: Tue May 17 2011 - 06:27:23 EST

This patch is part of the fix for the WARN_ON_ONCE() in
disk_clear_events() triggering, as reported in bug#34662.

disk_clear_events() blocks events, schedules and flushes the event
work. It expects the work to have started execution on schedule and
finished on return from flush. WARN_ON_ONCE() triggers if the event
work hasn't executed as expected. This problem happens because
__disk_block_events() fails to guarantee, in a race-free manner, that
the event work item is not in flight on return from the function. The
problem is two-fold and this patch addresses one part of it.

When __disk_block_events() is called with @sync == %false, it bumps
the event block count, calls cancel_delayed_work() and returns. This
makes it impossible to guarantee that event polling is not in flight
on return from a syncing __disk_block_events() - if the first blocker
was non-syncing, polling could still be in progress and later syncing
ones would assume that the first blocker already canceled it.
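
The race described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: struct ev_model,
model_block_events() and race_after_sync_block() are hypothetical names
invented for illustration, and the model reduces the block count and
the "work is executing" state to two plain fields. It is not the kernel
implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct ev_model {
	int block;          /* event block count */
	bool work_running;  /* polling work is currently executing */
};

/*
 * Simplified model of the pre-patch blocking logic: only the first
 * blocker (block count 0 -> 1) cancels, and it waits for a running
 * work item only when @sync is true.
 */
static void model_block_events(struct ev_model *ev, bool sync)
{
	bool first = (ev->block++ == 0);

	if (!first)
		return;	/* later blockers assume the first one canceled */

	if (sync)
		ev->work_running = false;	/* models waiting for completion */
	/* non-sync: a work item that is already running keeps running */
}

/*
 * Returns true if polling can still be in flight after a syncing
 * blocker has returned.
 */
static bool race_after_sync_block(bool first_blocker_syncs)
{
	struct ev_model ev = { .block = 0, .work_running = true };

	model_block_events(&ev, first_blocker_syncs);	/* first blocker */
	model_block_events(&ev, true);			/* later syncing blocker */

	return ev.work_running;
}
```

In this model, race_after_sync_block(false) reports that the work is
still running even though the second caller asked for a syncing block,
while race_after_sync_block(true) does not.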

Making __disk_block_events() cancel_sync regardless of block count
isn't feasible either as it may race with forced event checking in

As disk_check_events() is the only user of non-syncing
__disk_block_events(), updating it to directly cancel and schedule
event work is the easiest way to solve the issue.
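
As a rough illustration of that approach, the following userspace
sketch models the control flow only. struct ev_sketch and
sketch_check_events() are hypothetical stand-ins, the lock is
represented by comments because the model is single-threaded, and
canceling and requeueing are reduced to field updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct disk_events; not the kernel type. */
struct ev_sketch {
	int block;            /* event block count, 0 means polling allowed */
	bool work_pending;    /* delayed work is queued */
	unsigned long delay;  /* delay the work was last queued with */
};

/*
 * Sketch of the direct cancel-and-schedule flow. The block count is
 * only read, never bumped, so no non-syncing blocker exists anymore.
 * Returns true if an immediate check was queued.
 */
static bool sketch_check_events(struct ev_sketch *ev)
{
	bool queued = false;

	if (!ev)
		return false;	/* no event handling for this disk */

	/* the kernel code takes ev->lock with irqs disabled here */
	if (!ev->block) {
		ev->work_pending = false;	/* models cancel_delayed_work() */
		ev->work_pending = true;	/* models queueing with delay 0 */
		ev->delay = 0;
		queued = true;
	}
	/* the kernel code releases ev->lock here */

	return queued;
}
```

Calling sketch_check_events() on an unblocked model requeues the work
with zero delay; on a blocked model it leaves everything untouched.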

Note that there's another bug in __disk_block_events() and this patch
doesn't fix the issue completely. A later patch will fix the other bug.

Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
Tested-by: Sitsofe Wheeler <sitsofe@xxxxxxxxx>
Reported-by: Sitsofe Wheeler <sitsofe@xxxxxxxxx>
Reported-by: Borislav Petkov <bp@xxxxxxxxx>
Reported-by: Meelis Roos <mroos@xxxxxxxx>
Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: Kay Sievers <kay.sievers@xxxxxxxx>
(sorry, forgot to cc lkml, resending)

This is the first of three patches which (finally) fix the
WARN_ON_ONCE() in disk_clear_events() triggering. It was me being
stupid about synchronization around event blocking.

Given that we're very late in the -rc cycle and, although the fix isn't
invasive, it isn't an obvious one-liner either, and that the bug happens
sporadically with a non-critical failure mode, it might be better to
route this through block for v2.6.40-rc1 and then back port to v2.6.39
via -stable, unless v2.6.39 is gonna go through another -rc cycle.

Jens, Linus, what do you guys think?

Thank you.

block/genhd.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

Index: work/block/genhd.c
--- work.orig/block/genhd.c
+++ work/block/genhd.c
@@ -1508,10 +1508,18 @@ void disk_unblock_events(struct gendisk
 void disk_check_events(struct gendisk *disk)
 {
-	if (disk->ev) {
-		__disk_block_events(disk, false);
-		__disk_unblock_events(disk, true);
+	struct disk_events *ev = disk->ev;
+	unsigned long flags;
+
+	if (!ev)
+		return;
+
+	spin_lock_irqsave(&ev->lock, flags);
+	if (!ev->block) {
+		cancel_delayed_work(&ev->dwork);
+		queue_delayed_work(system_nrt_wq, &ev->dwork, 0);
 	}
+	spin_unlock_irqrestore(&ev->lock, flags);
 }
