On Mon, 2021-11-01 at 15:12 +0800, Chao Yu wrote:
On 2021/11/1 15:09, Hyeong-Jun Kim wrote:
On Mon, 2021-11-01 at 14:28 +0800, Chao Yu wrote:
On 2021/11/1 13:42, Hyeong-Jun Kim wrote:
Encrypted pages during GC are read and cached in META_MAPPING.
However, due to cached pages in META_MAPPING, there is an issue where
newly written pages are lost by IPU or DIO writes.
Thread A                                Thread B
- f2fs_gc(): blk 0x10 -> 0x20 (a)
                                        - IPU or DIO write on blk 0x20 (b)
- f2fs_gc(): blk 0x20 -> 0x30 (c)
(a) The page for blk 0x20 is cached in META_MAPPING and the page for
    blk 0x10 is invalidated from META_MAPPING.
(b) New data is written to blk 0x20 using IPU or DIO, but the outdated
    data still remains in META_MAPPING.
(c) f2fs_gc() tries to move the block from 0x20 to 0x30 using the cached
    page in META_MAPPING. In conclusion, the newly written data in (b)
    is lost.
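The sequence (a), (b), (c) can be replayed in a small userspace sketch. This is only a toy model, not f2fs code: `toy_fs`, `toy_gc_move()`, `toy_inplace_write()` and `toy_replay_lost_write()` are hypothetical names, and the model assumes that the GC move prefers a cached page for the source block while the in-place write goes straight to storage without touching the cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_BLOCKS 64
#define BLK_SIZE  8

/* Toy model (assumption): disk is the storage, cache stands in for META_MAPPING. */
struct toy_fs {
	char disk[NR_BLOCKS][BLK_SIZE];
	char cache[NR_BLOCKS][BLK_SIZE];
	bool cached[NR_BLOCKS];
};

/* GC-style move: reuse a cached source page if present, cache the destination. */
static void toy_gc_move(struct toy_fs *fs, unsigned src, unsigned dst)
{
	assert(src < NR_BLOCKS && dst < NR_BLOCKS);
	const char *page = fs->cached[src] ? fs->cache[src] : fs->disk[src];

	memcpy(fs->disk[dst], page, BLK_SIZE);
	memcpy(fs->cache[dst], page, BLK_SIZE);
	fs->cached[dst] = true;
	fs->cached[src] = false;	/* source page is invalidated */
}

/* IPU/DIO-style write: updates storage only and leaves the cache alone. */
static void toy_inplace_write(struct toy_fs *fs, unsigned blk, const char *data)
{
	assert(blk < NR_BLOCKS);
	strncpy(fs->disk[blk], data, BLK_SIZE - 1);
	fs->disk[blk][BLK_SIZE - 1] = '\0';
}

/* Replay (a), (b), (c); returns true if the data written in (b) is lost. */
static bool toy_replay_lost_write(void)
{
	struct toy_fs fs;

	memset(&fs, 0, sizeof(fs));
	toy_inplace_write(&fs, 0x10, "old");	/* initial content */
	toy_gc_move(&fs, 0x10, 0x20);		/* (a) */
	toy_inplace_write(&fs, 0x20, "new");	/* (b) */
	toy_gc_move(&fs, 0x20, 0x30);		/* (c) uses the stale cached page */
	return strcmp(fs.disk[0x30], "new") != 0;
}
```

In this model `toy_replay_lost_write()` returns true: block 0x30 ends up holding "old" because step (c) copies the cached page rather than what step (b) wrote.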
In (c), f2fs_gc() will read ahead the encrypted block from disk via
ra_data_block() anyway, no matter whether the cached encrypted page of
the meta inode is uptodate or not, so it's safe, right?
Right.
However, if a DIO write is performed between phase 3 and phase 4 of
f2fs_gc(), the cached page in META_MAPPING will be out of date, even
though it was read from disk via ra_data_block() in phase 3.
What do you think?
Due to i_gc_rwsem lock coverage, the race condition should not happen right now?
Thread A                                Thread B
/* phase 3 */
down_write(i_gc_rwsem)
ra_data_block()
up_write(i_gc_rwsem)
                                        f2fs_direct_IO() :
                                          down_read(i_gc_rwsem)
                                          __blockdev_direct_IO()
                                            ...
                                            get_data_block_dio_write()
                                            ...
                                            f2fs_dio_submit_bio()
                                          up_read(i_gc_rwsem)
/* phase 4 */
down_write(i_gc_rwsem)
move_data_block()
up_write(i_gc_rwsem)
It looks like i_gc_rwsem cannot protect the page update between phase 3
and phase 4.
Am I missing anything?
Thanks
To address this issue, invalidate pages in META_MAPPING before an IPU
or DIO write.
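The effect of the fix can be sketched with a toy model in which the in-place write first drops the cached page for its target block. This is a sketch under stated assumptions, not the patch itself: `toy_fs`, `toy_invalidate()`, `toy_inplace_write()`, `toy_gc_move()` and `toy_replay_with_invalidate()` are hypothetical names, and `toy_invalidate()` only mimics the inclusive [start, end] range of invalidate_mapping_pages().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_BLOCKS 64
#define BLK_SIZE  8

/* Toy model (assumption): disk is the storage, cache stands in for META_MAPPING. */
struct toy_fs {
	char disk[NR_BLOCKS][BLK_SIZE];
	char cache[NR_BLOCKS][BLK_SIZE];
	bool cached[NR_BLOCKS];
};

/* Drop cached pages in the inclusive range [start, end]. */
static void toy_invalidate(struct toy_fs *fs, unsigned start, unsigned end)
{
	assert(start <= end && end < NR_BLOCKS);
	for (unsigned blk = start; blk <= end; blk++)
		fs->cached[blk] = false;
}

/* In-place write that invalidates the cached copy before updating storage. */
static void toy_inplace_write(struct toy_fs *fs, unsigned blk, const char *data)
{
	toy_invalidate(fs, blk, blk);	/* same block as start and end */
	strncpy(fs->disk[blk], data, BLK_SIZE - 1);
	fs->disk[blk][BLK_SIZE - 1] = '\0';
}

/* GC-style move: reuse a cached source page if present, cache the destination. */
static void toy_gc_move(struct toy_fs *fs, unsigned src, unsigned dst)
{
	assert(src < NR_BLOCKS && dst < NR_BLOCKS);
	const char *page = fs->cached[src] ? fs->cache[src] : fs->disk[src];

	memcpy(fs->disk[dst], page, BLK_SIZE);
	memcpy(fs->cache[dst], page, BLK_SIZE);
	fs->cached[dst] = true;
	fs->cached[src] = false;
}

/* Replay (a), (b), (c); returns true if blk 0x30 ends up with the new data. */
static bool toy_replay_with_invalidate(void)
{
	struct toy_fs fs;

	memset(&fs, 0, sizeof(fs));
	toy_inplace_write(&fs, 0x10, "old");	/* initial content */
	toy_gc_move(&fs, 0x10, 0x20);		/* (a) caches blk 0x20 */
	toy_inplace_write(&fs, 0x20, "new");	/* (b) drops the cached page */
	toy_gc_move(&fs, 0x20, 0x30);		/* (c) falls back to storage */
	return strcmp(fs.disk[0x30], "new") == 0;
}
```

With the invalidation in place, `toy_replay_with_invalidate()` returns true, since step (c) no longer finds a cached page for blk 0x20 and reads the new data from storage.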
Signed-off-by: Hyeong-Jun Kim <hj514.kim@xxxxxxxxxxx>
---
fs/f2fs/data.c | 2 ++
fs/f2fs/segment.c | 3 +++
2 files changed, 5 insertions(+)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 74e1a350c1d8..9f754aaef558 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1708,6 +1708,8 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
 				 */
 				f2fs_wait_on_block_writeback_range(inode,
 						map->m_pblk, map->m_len);
+				invalidate_mapping_pages(META_MAPPING(sbi),
+						map->m_pblk, map->m_pblk);

 				if (map->m_multidev_dio) {
 					block_t blk_addr = map->m_pblk;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 526423fe84ce..f57c55190f9e 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3652,6 +3652,9 @@ int f2fs_inplace_write_data(struct f2fs_io_info *fio)
 		goto drop_bio;
 	}

+	invalidate_mapping_pages(META_MAPPING(fio->sbi),
+			fio->new_blkaddr, fio->new_blkaddr);
+
 	stat_inc_inplace_blocks(fio->sbi);

 	if (fio->bio && !(SM_I(sbi)->ipu_policy & (1 << F2FS_IPU_NOCACHE)))