Yes, though this is the long route -- you get to define a new ondisk logIOWs, although the*disk write* was completed successfully, the mappingcould we make it logged for those special cases which we are interested in
updates were torn, and the user program sees a torn write.
The most performant/painful way to fix this would be to make the wholeioend completion a logged operation so that we could commit to updating
all the unwritten mappings and restart it after a crash.
only?
item, build all the incore structures to process them, and then define a
new high level operation that uses the state encoded in that new log
item to run all the ioend completion transactions within that framework.
Also you get to add a new log incompat feature bit for this.
Perhaps we should analyze the cost of writing and QA'ing all that vs.
the amount of time saved in the handling of this corner case using one
of the less exciting options.
> >>> before handing the one single
Yes, I'm aware.The least performant of course is to write zeroes at allocation time,That idea was already proposed:
like we do for fsdax.
https://urldefense.com/v3/__https://lore.kernel.org/linux-xfs/ ZcGIPlNCkL6EDx3Z@xxxxxxxxxxxxxxxxxxx/__;!!ACWV5N9M2RV99hQ! Kmx2Rrrot3GTqBS3kwhTi1nxIrpiPDyiy3TEfowsRKonvY90W7o4xUv9r9seOfDAMa2gT- TCNVlpH-CGXA$
Only if you set the forcealign size to > 1fsb and fail to write newA possible middle ground would be to detect IOMAP_ATOMIC in theWon't that have the same issue as using XFS_BMAPI_ZERO, above i.e. zeroing
->iomap_begin method, notice that there are mixed mappings under the
proposed untorn IO, and pre-convert the unwritten blocks by writing
zeroes to disk and updating the mappings
during allocation?
file data in forcealign units, even for non-untorn writes. If all
writes to the file are aligned to the forcealign size then there's only
one extent conversion to be done, and that cannot be torn.
xfs_iomap_write_unwritten is in the ioend path; that's not what I wasmapping back to iomap_dio_rw to stage the untorn writes bio. At leastBTW, one issue I have with the sub-extent(or -alloc unit) zeroing from v4
you'd only be suffering that penalty for the (probable) corner case of
someone creating mixed mappings.
series is how the unwritten conversion has changed, like:
xfs_iomap_write_unwritten()
{
unsigned int rounding;
/* when converting anything unwritten, we must be spanning an alloc unit,
so round up/down */
if (rounding > 1) {
offset_fsb = rounddown(rounding);
count_fsb = roundup(rounding);
}
...
do {
xfs_bmapi_write();
...
xfs_trans_commit();
} while ();
}
I'm not too happy with it and it seems a bit of a bodge, as I would rather
we report the complete size written (user data and zeroes); then
xfs_iomap_write_unwritten() would do proper individual block conversion.
However, we do something similar for zeroing for sub-FSB writes. I am not
sure if that is the same thing really, as we only round up to FSB size.
Opinion?
talking about.
I'm talking about a separate change to the xfs_direct_write_iomap_begin
function that would detect the case where the bmapi_read returns an
@imap that doesn't span the whole forcealign region, then repeatedly
calls bmapi_write(BMAPI_ZERO | BMAPI_CONVERT) on any unwritten mappings
within that file range until the original bmapi_read would return a
single written mapping.