[PATCH 5.4 177/309] btrfs: flush write bio if we loop in extent_write_cache_pages

From: Greg Kroah-Hartman
Date: Mon Feb 10 2020 - 08:15:47 EST

Next message: Greg Kroah-Hartman: "[PATCH 5.4 203/309] KVM: x86: Protect kvm_hv_msr_[get|set]_crash_data() from Spectre-v1/L1TF attacks"
Previous message: Greg Kroah-Hartman: "[PATCH 5.4 030/309] srcu: Apply *_ONCE() to ->srcu_last_gp_end"
In reply to: Greg Kroah-Hartman: "[PATCH 5.4 030/309] srcu: Apply *_ONCE() to ->srcu_last_gp_end"
Next in thread: Greg Kroah-Hartman: "[PATCH 5.4 203/309] KVM: x86: Protect kvm_hv_msr_[get|set]_crash_data() from Spectre-v1/L1TF attacks"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

From: Josef Bacik <josef@xxxxxxxxxxxxxx>

commit 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f upstream.

There exists a deadlock with range_cyclic that has existed forever. If
we loop around with a bio already built we could deadlock with a writer
who has the page locked that we're attempting to write but is waiting on
a page in our bio to be written out. The task traces are as follows

PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5"
#0 [ffffc900297bb658] __schedule at ffffffff81a4c33f
#1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3
#2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42
#3 [ffffc900297bb708] __lock_page at ffffffff811f145b
#4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502
#5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684
#6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff
#7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0
#8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2
#9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd

PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND:
"aio-dio-invalid"
#0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f
#1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3
#2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42
#3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6
#4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7
#5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359
#6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933
#7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8
#8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d
#9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032

I used drgn to find the respective pages we were stuck on

page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901
page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874

As you can see the kworker is waiting for bit 0 (PG_locked) on index
7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index
8148. aio-dio-invalid has 7680, and the kworker epd looks like the
following

crash> struct extent_page_data ffffc900297bbbb0
struct extent_page_data {
bio = 0xffff889f747ed830,
tree = 0xffff889eed6ba448,
extent_locked = 0,
sync_io = 0
}

Probably worth mentioning as well that it waits for writeback of the
page to complete while holding a lock on it (at prepare_pages()).

Using drgn I walked the bio pages looking for page
0xffffea00fbfc7500 which is the one we're waiting for writeback on

bio = Object(prog, 'struct bio', address=0xffff889f747ed830)
for i in range(0, bio.bi_vcnt.value_()):
bv = bio.bi_io_vec[i]
if bv.bv_page.value_() == 0xffffea00fbfc7500:
print("FOUND IT")

which validated what I suspected.

The fix for this is simple, flush the epd before we loop back around to
the beginning of the file during writeout.

Fixes: b293f02e1423 ("Btrfs: Add writepages support")
CC: stable@xxxxxxxxxxxxxxx # 4.4+
Reviewed-by: Filipe Manana <fdmanana@xxxxxxxx>
Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
Signed-off-by: David Sterba <dsterba@xxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
fs/btrfs/extent_io.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4185,7 +4185,16 @@ retry:
*/
scanned = 1;
index = 0;
- goto retry;
+
+ /*
+ * If we're looping we could run into a page that is locked by a
+ * writer and that writer could be waiting on writeback for a
+ * page in our current bio, and thus deadlock, so flush the
+ * write bio here.
+ */
+ ret = flush_write_bio(epd);
+ if (!ret)
+ goto retry;
}

if (wbc->range_cyclic || (wbc->nr_to_write > 0 && range_whole))

Next message: Greg Kroah-Hartman: "[PATCH 5.4 203/309] KVM: x86: Protect kvm_hv_msr_[get|set]_crash_data() from Spectre-v1/L1TF attacks"
Previous message: Greg Kroah-Hartman: "[PATCH 5.4 030/309] srcu: Apply *_ONCE() to ->srcu_last_gp_end"
In reply to: Greg Kroah-Hartman: "[PATCH 5.4 030/309] srcu: Apply *_ONCE() to ->srcu_last_gp_end"
Next in thread: Greg Kroah-Hartman: "[PATCH 5.4 203/309] KVM: x86: Protect kvm_hv_msr_[get|set]_crash_data() from Spectre-v1/L1TF attacks"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]