Re: Multi-partition block layer behaviour
From: Shaohua Li
Date: Wed Oct 26 2011 - 01:42:27 EST
2011/10/26 Tiju Jacob <jacobtiju@xxxxxxxxx>:
> Hi All,
>
> We are trying to run fsstress tests on an ext4 filesystem with
> linux-3.0.4 on NAND flash with our proprietary driver. The test
> succeeds when run on a single partition but fails when run on
> multiple partitions with the bug "BUG: scheduling while atomic:
> fsstress.fork_n/498/0x00000002".
>
> Analysis:
>
> 1. When an I/O request is made to the filesystem, process 'A' acquires
> a mutex FS lock and a mutex block driver lock.
>
> 2. Process 'B' tries to acquire the mutex FS lock, which is not
> available. Hence, it goes to sleep. Due to the new plugging mechanism,
> before going to sleep, schedule() is invoked, which disables preemption
> and the context becomes atomic. In schedule(), the newly added
> blk_flush_plug_list() is invoked, which unplugs the block driver.
>
> 3. During the unplug operation the block driver tries to acquire the
> mutex lock, which fails because the lock is held by process 'A'. The
> invocation of schedule() in step 2 has already made the context
> atomic, hence the error "scheduling while atomic" occurred.
If blk_flush_plug_list() is called from schedule(), it uses
blk_run_queue_async() to unplug the queue, so the queue is run from a
workqueue rather than in the atomic context. So how could this happen?
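
To make the question concrete, here is a toy userspace model in C of the
two unplug paths. It is only a sketch under stated assumptions: it is not
kernel code, every toy_ name is made up for illustration, and it says
nothing about what the proprietary driver actually does. It only shows why
running the queue directly inside schedule() would trip over a held mutex,
while deferring the run to a worker would not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model. None of these names are kernel APIs. */

static int  toy_preempt_depth;     /* > 0 models an atomic (non-sleepable) context */
static bool toy_driver_lock_held;  /* models the driver mutex held by process 'A' */
static bool toy_work_pending;      /* models a queue run deferred to a workqueue */
static int  toy_atomic_sleep_bugs; /* counts would-be "scheduling while atomic" hits */
static int  toy_dispatched;        /* counts successful queue runs */

static void toy_reset(void)
{
    toy_preempt_depth = 0;
    toy_driver_lock_held = false;
    toy_work_pending = false;
    toy_atomic_sleep_bugs = 0;
    toy_dispatched = 0;
}

/* Models the driver's request handler, which needs the driver mutex. */
static bool toy_request_fn(void)
{
    if (toy_driver_lock_held) {
        /* A contended mutex has to sleep; that is a bug in atomic context. */
        if (toy_preempt_depth > 0)
            toy_atomic_sleep_bugs++;
        return false;   /* nothing is dispatched in this model */
    }
    toy_dispatched++;
    return true;
}

/* Models the unplug step: run the queue now, or defer it to a worker. */
static void toy_flush_plug(bool defer_to_worker)
{
    if (defer_to_worker)
        toy_work_pending = true;
    else
        (void)toy_request_fn();
}

/* Models process 'B' entering schedule() with preemption disabled. */
static void toy_schedule(bool defer_to_worker)
{
    toy_preempt_depth++;
    toy_flush_plug(defer_to_worker);
    toy_preempt_depth--;
}

/* Models the worker: runs later, in process context where sleeping is legal. */
static void toy_run_worker(void)
{
    assert(toy_preempt_depth == 0);
    if (toy_work_pending) {
        toy_work_pending = false;
        (void)toy_request_fn();
    }
}
```

With toy_driver_lock_held set, toy_schedule(false) records one atomic-sleep
hit, which corresponds to the reported failure. toy_schedule(true) records
none and only marks work as pending; once the lock is released,
toy_run_worker() dispatches the queue from a sleepable context.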
Thanks,
Shaohua