Re: [PATCH 1/1] udf: Fix incorrect final NOT_ALLOCATED (hole) extent length

From: Steve Magnani
Date: Wed Jun 26 2019 - 22:46:53 EST


Hi Jan,

On 6/25/19 5:30 AM, Jan Kara wrote:
On Tue 04-06-19 07:31:58, Steve Magnani wrote:
In some cases, using the 'truncate' command to extend a UDF file results
in a mismatch between the length of the file's extents (specifically, due
to incorrect length of the final NOT_ALLOCATED extent) and the information
(file) length. The discrepancy can prevent other operating systems
(e.g., Windows 10) from opening the file.

Two particular errors have been observed when extending a file:

1. The final extent is larger than it should be, having been rounded up
to a multiple of the block size.

2. The final extent is shorter than it should be, due to not having
been updated when the file's information length was increased.
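
The first error can be illustrated with a little arithmetic. This is only a sketch, not driver code: the helper names round_up_to_block() and extent_overshoot() are hypothetical, and it assumes the file consists of a single hole extent and a power-of-two block size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a byte length up to a whole number of blocks.
 * block_size is assumed to be a power of two. */
static uint64_t round_up_to_block(uint64_t len, uint32_t block_size)
{
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	return (len + block_size - 1) & ~((uint64_t)block_size - 1);
}

/* Bytes by which a block-rounded final extent exceeds the information
 * length. Zero means the extent length and the file length agree. */
static uint64_t extent_overshoot(uint64_t info_len, uint32_t block_size)
{
	return round_up_to_block(info_len, block_size) - info_len;
}
```

With a 2048-byte block and a file extended to 5000 bytes, the rounded extent would be 6144 bytes long, 1144 bytes more than the information length. The overshoot is zero only when the new size is block-aligned.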

The first case could represent a design error, if coded intentionally
due to a misinterpretation of scantily-documented ECMA-167 "file tail"
rules. The standard specifies that the tail, if present, consists of
a sequence of "unrecorded and allocated" extents (only).

Signed-off-by: Steven J. Magnani <steve@xxxxxxxxxxxxxxx>

Thanks for the testcase and the patch! I finally got to reading through
this in detail. In udf driver in Linux we are generally fine with the last
extent being rounded up to the block size. udf_truncate_tail_extent() is
generally responsible for truncating the last extent to appropriate size
once we are done with the inode. However there are two problems with this:

1) We used to do this inside udf_clear_inode() back in the old days but
then switched to a different scheme in commit 2c948b3f86e5f "udf: Avoid IO
in udf_clear_inode". So this actually breaks workloads where user calls
truncate(2) directly and there's no place where udf_truncate_tail_extent()
gets called.

2) udf_extend_file() sets i_lenExtents == i_size although the last extent
isn't properly rounded so even if udf_truncate_tail_extent() gets called
(which is actually the case for truncate(1) which does open, ftruncate,
close), it will think it has nothing to do and exit.

Now 2) is easily fixed by setting i_lenExtents to real length of extents we
have created. However that still leaves problem 1) which isn't easy to deal
with. After some thought I think that your solution of making
udf_do_extend_file() always create appropriately sized extents makes
sense. However I dislike the calling convention you've chosen. When
udf_do_extend_file() needs to know the byte length, then why not pass it to it
directly, instead of somewhat cumbersome "sector length + byte offset"
pair?

Will you update the patch please? Thanks!

That sounds reasonable, but at first glance I think it might be more confusing. The API as I reworked it now communicates two different (although related) things - the number of blocks that need to be added, and the number of bytes within the last block that are part of the file. This is able to cover both the corner case of extending within the last file block and extending beyond that:

	partial_final_block = newsize & (sb->s_blocksize - 1);

	/* File has extent covering the new size (could happen when extending
	 * inside a block)? */
	if (etype == -1) {
		if (partial_final_block)
			offset++;
	} else {
		/* Extending file within the last file block */
		offset = 0;	/* Don't add any new blocks */
	}

If it were as simple as passing to udf_do_extend_file() a loff_t specifying the number of bytes to add, including both full blocks and a final partial block, I would agree with you. But this isn't enough information for udf_do_extend_file() to know whether the final partial block requires a new block or not.
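
A small sketch of why a byte count alone is ambiguous. The helpers blocks_for() and new_blocks_needed() are hypothetical and not part of the patch. The sketch assumes every block up to the old size is already covered by an extent and that the block size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks needed to hold `size` bytes. */
static uint64_t blocks_for(uint64_t size, uint32_t block_size)
{
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	return (size + block_size - 1) / block_size;
}

/* New blocks required when a file grows from old_size to new_size. */
static uint64_t new_blocks_needed(uint64_t old_size, uint64_t new_size,
				  uint32_t block_size)
{
	assert(new_size >= old_size);
	return blocks_for(new_size, block_size) - blocks_for(old_size, block_size);
}
```

With 512-byte blocks, adding 100 bytes to a 100-byte file needs no new block, while adding the same 100 bytes to a 500-byte file needs one. The answer depends on where the old size falls within its last block, which the number of bytes to add does not convey.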

I will think about it some more. Maybe moving the 'extending within the last file block' case out to udf_extend_file() would help.

Steve