Data corruption problem with swapfiles and THP
From: Matthew Wilcox
Date: Thu Aug 12 2021 - 11:09:10 EST
There is an assumption in the swap writepage path that a THP is physically
contiguous on swap:
	bio->bi_iter.bi_sector = swap_page_sector(page);
	bio->bi_opf = REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc);
	bio->bi_end_io = end_write_func;
	bio_add_page(bio, page, thp_size(page), 0);
As far as I can tell, this is not necessarily true. If a swapfile is not
contiguous on disk, we can have an extent which is 1MB long followed by an
extent somewhere else on storage that's 1MB long. When we try to write a
2MB page to swap, we issue a single 2MB write starting at the first
extent's sector, so we overwrite whatever's on the block device after that
first 1MB extent.
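To make the failure mode concrete, here is a minimal userspace sketch (not
kernel code) of the mapping problem. The struct extent type and the helpers
find_extent(), page_to_sector() and range_is_contiguous() are hypothetical
names invented for illustration, and the 4KiB page and 512-byte sector sizes
are assumptions. It models a swapfile as a list of extents and checks whether
a range of pages maps to one contiguous run of sectors, which is what a
single bio starting at the first page's sector implicitly assumes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages and 512-byte sectors. */
#define SECTORS_PER_PAGE 8u

/*
 * Hypothetical, simplified model of one swapfile extent: a run of
 * pages that is contiguous on the block device.
 */
struct extent {
	uint64_t start_page;	/* first page offset within the swapfile */
	uint64_t nr_pages;	/* length of the run, in pages */
	uint64_t start_sector;	/* device sector backing start_page */
};

/* Return the extent covering a page offset, or NULL if there is none. */
static const struct extent *find_extent(const struct extent *map, size_t n,
					uint64_t page)
{
	for (size_t i = 0; i < n; i++) {
		if (page >= map[i].start_page &&
		    page - map[i].start_page < map[i].nr_pages)
			return &map[i];
	}
	return NULL;
}

/* Device sector backing one page offset; the page must be mapped. */
static uint64_t page_to_sector(const struct extent *map, size_t n,
			       uint64_t page)
{
	const struct extent *e = find_extent(map, n, page);

	assert(e != NULL);
	return e->start_sector + (page - e->start_page) * SECTORS_PER_PAGE;
}

/*
 * True if nr_pages starting at first_page are backed by a single run of
 * sectors, i.e. one write starting at the first page's sector would only
 * touch blocks that belong to the swapfile.
 */
static bool range_is_contiguous(const struct extent *map, size_t n,
				uint64_t first_page, uint64_t nr_pages)
{
	uint64_t base = page_to_sector(map, n, first_page);

	for (uint64_t i = 1; i < nr_pages; i++) {
		if (page_to_sector(map, n, first_page + i) !=
		    base + i * SECTORS_PER_PAGE)
			return false;
	}
	return true;
}
```

With two 1MB extents (256 pages each) placed at unrelated sectors,
range_is_contiguous() holds for a range that stays within one extent but
fails for a 512-page (2MB) range starting at page 0, which is the case
described above.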
(I came across this by code examination while looking at getting rid of
the bio path entirely. No attempt has been made to produce this problem;
something else may prevent it from actually happening.)