On 01/28, Chao Yu wrote:
On 2022/1/28 5:59, Jaegeuk Kim wrote:
On 01/27, Chao Yu wrote:
Quoted from Jing Xia's report, there is a potential deadlock may happen
between kworker and checkpoint as below:
[T:writeback] [T:checkpoint]
- wb_writeback
- blk_start_plug
bio contains NodeA was plugged in writeback threads
I'm still trying to understand more precisely. So, how is it possible to
have bio having node write in this current context?
IMO, after above blk_start_plug(), it may plug some inode's node page in kworker
during writebacking node_inode's data page (which should be node page)?
Wasn't that added into a different task->plug?
Thanks,
- do_writepages -- sync write inodeB, inc wb_sync_req[DATA]
- f2fs_write_data_pages
- f2fs_write_single_data_page -- write last dirty page
- f2fs_do_write_data_page
- set_page_writeback -- clear page dirty flag and
PAGECACHE_TAG_DIRTY tag in radix tree
- f2fs_outplace_write_data
- f2fs_update_data_blkaddr
- f2fs_wait_on_page_writeback -- wait NodeA to writeback here
- inode_dec_dirty_pages
- writeback_sb_inodes
- writeback_single_inode
- do_writepages
- f2fs_write_data_pages -- skip writepages due to wb_sync_req[DATA]
- wbc->pages_skipped += get_dirty_pages() -- PAGECACHE_TAG_DIRTY is not set but get_dirty_pages() returns one
- requeue_inode -- requeue inode to wb->b_dirty queue due to non-zero.pages_skipped
- blk_finish_plug
Let's try to avoid deadlock condition by forcing unplugging previous bio via
blk_finish_plug(current->plug) once we'v skipped writeback in writepages()
due to valid sbi->wb_sync_req[DATA/NODE].
Fixes: 687de7f1010c ("f2fs: avoid IO split due to mixed WB_SYNC_ALL and WB_SYNC_NONE")
Signed-off-by: Zhiguo Niu <zhiguo.niu@xxxxxxxxxx>
Signed-off-by: Jing Xia <jing.xia@xxxxxxxxxx>
Signed-off-by: Chao Yu <chao@xxxxxxxxxx>
---
fs/f2fs/data.c | 6 +++++-
fs/f2fs/node.c | 6 +++++-
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 76d6fe7b0c8f..932a4c81acaf 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3174,8 +3174,12 @@ static int __f2fs_write_data_pages(struct address_space *mapping,
/* to avoid spliting IOs due to mixed WB_SYNC_ALL and WB_SYNC_NONE */
if (wbc->sync_mode == WB_SYNC_ALL)
atomic_inc(&sbi->wb_sync_req[DATA]);
- else if (atomic_read(&sbi->wb_sync_req[DATA]))
+ else if (atomic_read(&sbi->wb_sync_req[DATA])) {
+ /* to avoid potential deadlock */
+ if (current->plug)
+ blk_finish_plug(current->plug);
goto skip_write;
+ }
if (__should_serialize_io(inode, wbc)) {
mutex_lock(&sbi->writepages);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 556fcd8457f3..69c6bcaf5aae 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -2106,8 +2106,12 @@ static int f2fs_write_node_pages(struct address_space *mapping,
if (wbc->sync_mode == WB_SYNC_ALL)
atomic_inc(&sbi->wb_sync_req[NODE]);
- else if (atomic_read(&sbi->wb_sync_req[NODE]))
+ else if (atomic_read(&sbi->wb_sync_req[NODE])) {
+ /* to avoid potential deadlock */
+ if (current->plug)
+ blk_finish_plug(current->plug);
goto skip_write;
+ }
trace_f2fs_writepages(mapping->host, wbc, NODE);
--
2.32.0